-
Bug
-
Resolution: Done
-
Critical
-
None
-
4.14.z
-
Important
-
No
-
SDN Sprint 254
-
1
-
False
-
-
Release Note Not Required
-
In Progress
-
05/21; testing 4.13->4.14 upgrade; cu states "production upgrade would harm the availability of production". Likely affects any upgrade w/ ipsec to ovn-ic. Consider known issue in RN; subsequent upgrade should be ok. Score elevated by px properties
Description of problem:
When upgrading OpenShift from 4.13 to 4.14, the CNO migrates ovn-ipsec to its new architecture in 4.14 and then just waits until everything else is migrated. This can take a while even in a small cluster like the one I tested, but in mid to large clusters this may mean that ovn-ipsec stays unavailable for an extended period, with initContainers in CrashLoopBackOff [1]. This can be concerning to customers: they expect such disruption to be as small as possible, even though they understand that some is normal.
Looking at the daemonsets, the ovn-ipsec pods need ovnkube-node to configure the node properly before the initContainer can create the key pairs and ovn-ipsec can start. In normal upgrades between errata releases, I understand why we upgrade ipsec first, but in this case I don't think having CNO upgrade ipsec first is the best procedure.
An important thing I noticed in the upgrade from 4.13.38 to 4.14.20 is that there was an intermediate rollout of the ovnkube-node and ovnkube-master daemonsets before the actual migration started. During all this time the new ovn-ipsec was already deployed and its pods were all crashing.
[1]
+ echo '2024-05-09T11:22:03+00:00 - ERROR - /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem not found'
2024-05-09T11:22:03+00:00 - ERROR - /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem not found
+ return 1
/bin/bash: line 16: return: can only `return' from a function or sourced script
Version-Release number of selected component (if applicable):
OCP 4.14
How reproducible:
Often
Steps to Reproduce:
1. Install OCP 4.13
2. Upgrade to 4.14
3. Monitor the OVN pods
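For the monitoring step, a minimal sketch of one way to spot the affected pods follows. It is not part of the original report. It assumes the OVN pods run in the openshift-ovn-kubernetes namespace, that the ipsec pod names start with "ovn-ipsec" (an assumption made for illustration), and that the `oc` CLI is logged in to the cluster. The helper names `crashlooping_init_pods` and `fetch_pods` are hypothetical.

```python
import json
import subprocess

NAMESPACE = "openshift-ovn-kubernetes"
POD_PREFIX = "ovn-ipsec"  # assumption: ipsec pod names start with this prefix


def crashlooping_init_pods(pod_list, prefix=POD_PREFIX):
    """Return sorted names of matching pods with an init container in CrashLoopBackOff.

    pod_list is the parsed JSON of a Kubernetes PodList, e.g. the output of
    `oc get pods -o json`.
    """
    hits = []
    for pod in pod_list.get("items", []):
        name = pod.get("metadata", {}).get("name", "")
        if not name.startswith(prefix):
            continue
        statuses = pod.get("status", {}).get("initContainerStatuses") or []
        for status in statuses:
            # A crash-looping container is reported as state.waiting.reason.
            waiting = (status.get("state") or {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                hits.append(name)
                break  # one failing init container is enough to flag the pod
    return sorted(hits)


def fetch_pods(namespace=NAMESPACE):
    """Fetch the pod list for a namespace through the oc CLI (must be logged in)."""
    result = subprocess.run(
        ["oc", "get", "pods", "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling `crashlooping_init_pods(fetch_pods())` repeatedly during the upgrade should list the ovn-ipsec pods for as long as their initContainers keep failing. The parsing is kept separate from the `oc` call so it can be checked against saved output.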
Actual results:
ovn-ipsec pods stay in CrashLoopBackOff until the rest of the OVN stack is migrated and running
Expected results:
Minimal disruption during the upgrade
- is blocked by
-
SDN-4871 Impact [OVN-IPSEC] During upgrade from 4.13 to 4.14 ovn-ipsec will stay in error state until all OVN stack is migrate
- Closed
- is cloned by
-
OCPBUGS-34883 [OVN-IPSEC] During upgrade from 4.13 to 4.14 ovn-ipsec will stay in error state until all OVN stack is migrate
- Closed
- is depended on by
-
OCPBUGS-34883 [OVN-IPSEC] During upgrade from 4.13 to 4.14 ovn-ipsec will stay in error state until all OVN stack is migrate
- Closed