- Bug
- Resolution: Unresolved
- Normal
- None
- 4.18
Description of problem:
While working on ARO-13685 I accidentally crashed the CVO payload init containers. The removal logic uses plain "rm", which fails when the target file is already gone, so it is not idempotent: if any of the init containers crashes midway, the restart can never succeed. The fix is to use "rm -f" in all places instead.
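The difference between the two commands can be shown with a minimal sketch. The scratch directory and file name below are hypothetical stand-ins for the payload volume, not the paths the init containers actually use.

```shell
#!/bin/sh
# Hypothetical scratch directory standing in for the payload volume.
workdir="$(mktemp -d)"
target="$workdir/manifest.yaml"
touch "$target"

# First invocation: the file exists, so plain rm succeeds.
first_status=0
rm "$target" || first_status=$?

# Second invocation (simulating a restart): the file is already gone,
# so plain rm reports an error and exits non-zero.
plain_status=0
rm "$target" 2>/dev/null || plain_status=$?

# rm -f treats a missing operand as success, so the step is safe to repeat.
force_status=0
rm -f "$target" || force_status=$?

echo "first=$first_status plain=$plain_status force=$force_status"
rmdir "$workdir"
```

In a script that aborts on the first failing command, the non-zero status from the second plain "rm" is what would keep a restarted container from making progress.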
Version-Release number of selected component (if applicable):
4.18 / main, but existed in prior versions
How reproducible:
Always
Steps to Reproduce:
1. Inject a crash in the bootstrap init container: https://github.com/openshift/hypershift/blob/99c34c1b6904448fb065cd65c7c12545f04fb7c9/control-plane-operator/controllers/hostedcontrolplane/cvo/reconcile.go#L353
2. The preceding init container "prepare-payload" restarts and crash-loops because its "rm" fails: the previous invocation already deleted all manifests.
Actual results:
The prepare-payload init container crash-loops forever, preventing the CVO container from running.
Expected results:
A crashing init container should be able to restart gracefully without getting stuck on file removal, so that the CVO container eventually runs.
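The expected restart behaviour can be sketched as a cleanup step that is run twice in a row. The function name, directory, and file patterns are hypothetical and only illustrate the idea; the sketch assumes a POSIX sh, where an unmatched glob is passed to "rm" literally.

```shell
#!/bin/sh
# Hypothetical cleanup step; the paths and file patterns are illustrative only.
cleanup_manifests() {
    dir="$1"
    # With -f, a missing file (or an unmatched glob left literal by sh) is a no-op.
    rm -f "$dir"/*_deployment.yaml "$dir"/*_servicemonitor.yaml
}

payload="$(mktemp -d)"
touch "$payload/0000_deployment.yaml" \
      "$payload/0000_servicemonitor.yaml" \
      "$payload/0000_keep.yaml"

# First start of the init container: the files exist and are removed.
run1=0
cleanup_manifests "$payload" || run1=$?

# Restart after a crash elsewhere: the files are already gone.
run2=0
cleanup_manifests "$payload" || run2=$?

remaining="$(ls "$payload")"
echo "run1=$run1 run2=$run2 remaining=$remaining"
rm -rf "$payload"
```

Both runs succeed and leave the directory in the same state, which is the property a restarted init container needs.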
Additional info:
based off the work in https://github.com/openshift/hypershift/pull/5315