-
Spike
-
Resolution: Done
-
Critical
MCO Sprint 257
Impact statement for OCPBUGS-38295.
Which 4.y.z to 4.y'.z' updates increase vulnerability?
Updates into 4.13.46 until OCPBUGS-38295 lands a fix.
- 4.13.46's OCPBUGS-37160 made that release tricky for Azure clusters born in 4.(y<12) and 4.12.(z<54).
- 4.12.54's OCPBUGS-30823 protects clusters born in 4.12.(z>=54) from this issue.
- 4.14.32's OCPBUGS-36356 (recently picked back to 4.13.z with OCPBUGS-38295) protects even born-in-old clusters from the issue.
That leaves 4.13.46 as the only exposed release.
Which types of clusters?
Azure clusters born in 4.(y<12) and 4.12.(z<54).
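The exposure rule from the answers above can be written as a small predicate. This is a minimal sketch, not tooling from the bug itself: the names `parse_version` and `is_exposed`, their parameters, and the assumption that versions are plain `x.y.z` strings are all hypothetical and chosen for illustration. It treats 4.13.46 as the only exposed target and a cluster as born-in-old if its install version precedes 4.12.54.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain 'x.y.z' string into a tuple of ints (hypothetical helper)."""
    major, minor, patch = version.split(".")[:3]
    return int(major), int(minor), int(patch)


def is_exposed(platform: str, born_in: str, target: str) -> bool:
    """Return True if updating into `target` matches the exposure described above."""
    # Only Azure clusters are affected.
    if platform.lower() != "azure":
        return False
    # 4.13.46 is the only exposed target release.
    if parse_version(target) != (4, 13, 46):
        return False
    # Born in 4.(y<12) or 4.12.(z<54), i.e. strictly before 4.12.54.
    return parse_version(born_in) < (4, 12, 54)
```

For example, `is_exposed("Azure", "4.12.53", "4.13.46")` is true, while the same cluster born in 4.12.54 is not exposed.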
What is the impact? Is it serious enough to warrant removing update recommendations?
As the machine-config operator reboots nodes into the 4.13.46 configuration, systemd will detect a dependency loop among units and disable a unit. The disabled unit will cause CRI-O and the kubelet to fail to run, and the node will never return to Ready=True healthiness unless the cluster admin can SSH in or use the serial console to recover the systemd units.
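To illustrate what a dependency loop among units means, the sketch below finds a cycle in a small ordering graph with a depth-first search. The unit names (`a.service`, `b.service`, `c.service`) and the `find_cycle` function are hypothetical: this statement does not name the units involved, and this is not a description of how systemd itself is implemented.

```python
from typing import Dict, List, Optional


def find_cycle(after: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of units, or None if there is none.

    `after[u]` lists the units that `u` is ordered after.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {unit: WHITE for unit in after}
    stack: List[str] = []

    def visit(unit: str) -> Optional[List[str]]:
        state[unit] = GREY
        stack.append(unit)
        for dep in after.get(unit, []):
            if state.get(dep, WHITE) == GREY:
                # dep is already on the current path: a loop is closed.
                return stack[stack.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[unit] = BLACK
        return None

    for unit in list(after):
        if state[unit] == WHITE:
            found = visit(unit)
            if found:
                return found
    return None


# Hypothetical units forming a loop: a after b, b after c, c after a.
units = {
    "a.service": ["b.service"],
    "b.service": ["c.service"],
    "c.service": ["a.service"],
}
cycle = find_cycle(units)
```

A loop like this cannot be satisfied by any start order, which is why one unit ends up being dropped to break it.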
How involved is remediation?
Updating to a release with OCPBUGS-38295 will avoid the issue for other nodes, but SSH recovery or replacement may be the only options for already-impacted nodes.
Is this a regression?
Yes, see the which-updates answer above for the multi-patch exposure story.
- blocks
- OCPBUGS-37534 4.12 -> 4.13 upgrade using IPI on Azure does not work (Verified)
- OCPBUGS-38295 kubelet does not start after reboot due to dependency issue (Closed)
- links to