-
Spike
-
Resolution: Done
-
Critical
Which 4.y.z to 4.y'.z' updates increase vulnerability?
Any -> 4.12.(1,2,3,4)
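As a minimal sketch, the affected target range above can be written as a simple membership check. The function name and the constant are hypothetical, chosen for illustration only. The set contains exactly the four target versions listed in the answer and nothing else.

```python
# Hypothetical helper: the version list is copied from the answer above.
AFFECTED_TARGETS = {"4.12.1", "4.12.2", "4.12.3", "4.12.4"}


def update_is_exposed(target_version: str) -> bool:
    """Return True if updating to target_version lands on an affected release.

    The source version does not matter ("Any -> ..."), so only the target
    is checked.
    """
    return target_version.strip() in AFFECTED_TARGETS
```

Whether a cluster is actually affected also depends on the cluster type described in the next answer; this check covers only the version part.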
Which types of clusters?
Any OVN cluster that was migrated to dual-stack at some point during its lifetime (but not clusters that were installed as dual-stack).
If a cluster has 2 ClusterNetworks and 2 ServiceNetworks configured but the Node CR has only 1 InternalIP in its Status.Addresses field, it is such a cluster. Before 4.13, the only way to end up with such a cluster was to install it as single-stack and convert it to dual-stack somewhere along the way.
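The check described above could be sketched roughly as follows. This is an illustration under assumptions, not a supported diagnostic: the function name is hypothetical, and the inputs are assumed to be the parsed JSON of the cluster Network config (for example from `oc get network.config cluster -o json`) and of the node list (for example from `oc get nodes -o json`).

```python
from typing import Any, Dict, List


def nodes_with_single_internal_ip(
    network_config: Dict[str, Any], node_list: Dict[str, Any]
) -> List[str]:
    """Return names of nodes that match the pattern described above.

    A node is reported only when the cluster has exactly 2 ClusterNetworks
    and 2 ServiceNetworks configured, and the node's status.addresses
    contains exactly 1 entry of type InternalIP.
    """
    spec = network_config.get("spec", {})
    cluster_networks = spec.get("clusterNetwork") or []
    service_networks = spec.get("serviceNetwork") or []

    # Not configured as dual-stack, so the pattern cannot apply.
    if len(cluster_networks) != 2 or len(service_networks) != 2:
        return []

    affected = []
    for node in node_list.get("items", []):
        addresses = node.get("status", {}).get("addresses") or []
        internal_ips = [a for a in addresses if a.get("type") == "InternalIP"]
        if len(internal_ips) == 1:
            affected.append(node["metadata"]["name"])
    return affected
```

An empty result means no node matched the pattern; a non-empty result lists nodes worth a closer look before updating.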
What is the impact? Is it serious enough to warrant removing update recommendations?
OVN-K8s will crash-loop. If you start debugging manually and reboot a node in this state, you may not be able to schedule any Pods at all afterwards; they will be stuck in "ContainerCreating" status.
How involved is remediation?
A fairly heavy workaround exists: disable MCO, manually modify the kubelet systemd unit definition, restart the system, upgrade to a fixed version, revert the kubelet modification, and then re-enable MCO. We did this once with Verizon when they escalated the upgrade to 4.12.(1,2,3,4) issue, but manually fixing such a cluster takes a huge engineering effort. bnemec@redhat.com can shed more light if needed.
There is also https://access.redhat.com/solutions/7014904, which describes the problem (but does not provide the workaround).
Is this a regression?
Yes
- blocks
OCPBUGS-6040 ovnkube node pod crashed after converting to a dual-stack cluster network (Closed)
- links to