Bug
Resolution: Done-Errata
Critical
4.16.0
None
Critical
None
Proposed
False
Release Note Not Required
In Progress
Description of problem
Spin-off of OCPBUGS-30192.
The daemon process can exit due to health check failures in 4.16+, after we added apiserver server CA rotation handling. This came with the side effect that if the MCD happens to exit in the middle of an update (e.g. during the image pull), the files/units will already have been updated but the OS upgrade will not have been applied, which blocks the upgrade indefinitely when the new container comes up.
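To make the failure window concrete, here is a toy model of the sequence described above. It is only a sketch and not MCO code: the names `Node`, `apply_update`, `pull_image`, `DaemonExit`, and `is_half_applied` are hypothetical, and the three-step ordering (write files/units, pull image, apply OS) is a simplification assumed from the description.

```python
from dataclasses import dataclass


class DaemonExit(Exception):
    """Stands in for the daemon process exiting, e.g. on a failed health check."""


@dataclass
class Node:
    # Config version whose files/units are currently on disk (hypothetical field).
    files: str = "old"
    # Config version whose OS image has been applied (hypothetical field).
    os_image: str = "old"


def pull_image(healthy: bool) -> None:
    # Assumed exit point: the daemon goes away while the image is being pulled.
    if not healthy:
        raise DaemonExit("health check failed during image pull")


def apply_update(node: Node, target: str, healthy: bool) -> None:
    node.files = target      # step 1: files and units are written first
    pull_image(healthy)      # step 2: image pull; the daemon may exit here
    node.os_image = target   # step 3: OS upgrade, never reached on exit


def is_half_applied(node: Node) -> bool:
    # True when files/units and OS image belong to different config versions.
    return node.files != node.os_image


node = Node()
try:
    apply_update(node, "new", healthy=False)
except DaemonExit:
    pass  # a new daemon container would start from this on-disk state

print(node, is_half_applied(node))
```

In this model the interrupted node ends up with new files/units and the old OS image, which is the half-applied state the new container has to cope with when it starts.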
Version-Release number of selected component
4.16
How reproducible
Only seen in BM CI so far; unsure whether other issues contribute to this.
Steps to Reproduce
Get lucky and have api-int DNS break while the machine-config daemon is deploying updated files to disk. It is unclear how to trigger this reliably, or how to distinguish it from OCPBUGS-30192 and other failure modes.
Actual results
Expected results
Additional info
- is related to
  - OCPBUGS-33318 Enable no-empty-http-proxy linter on origin (Closed)
- relates to
  - MCO-1154 Clean up MCD goroutine channel handling (To Do)
  - OCPBUGS-25821 cert issues during or after 4.14 to 4.15 upgrade (Closed)
  - OCPBUGS-30192 MCD degraded on content mismatch for resolv-prepender script (Closed)
- links to
  - RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update