  1. OpenShift Bugs
  2. OCPBUGS-48216

Idempotency issue on removing CVO resources in init container


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Normal
    • None
    • 4.18
    • HyperShift / ARO
    • None
    • False

      Description of problem:

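      While working on ARO-13685, I accidentally crashed the CVO payload init containers.
      
      The removal logic in those init containers uses plain "rm", which is not idempotent: "rm" exits with an error when the file it is asked to delete no longer exists. If any init container crashes midway, the restarted run fails on files that were already removed and can never succeed.
      
      The fix is to use "rm -f" everywhere instead, because "rm -f" ignores missing files.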
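
To make the difference concrete, here is a minimal sketch of how plain "rm" and "rm -f" behave once the target is already gone. The scratch directory and the file name manifest.yaml are assumptions for illustration only; they are not the paths used by the real init containers.

```shell
#!/bin/sh
# Scratch directory standing in for the payload volume (hypothetical layout).
workdir="$(mktemp -d)"
touch "$workdir/manifest.yaml"

# First removal: the file exists, so plain "rm" succeeds.
first_status=0
rm "$workdir/manifest.yaml" || first_status=$?

# Second removal with plain "rm": the file is gone, so "rm" reports an error.
plain_status=0
rm "$workdir/manifest.yaml" 2>/dev/null || plain_status=$?

# Same removal with "rm -f": a missing file is not treated as an error.
force_status=0
rm -f "$workdir/manifest.yaml" || force_status=$?

echo "first=$first_status plain=$plain_status force=$force_status"
rmdir "$workdir"
```

The second status is non-zero while the third is zero, which is why only the "rm -f" form is safe to run again after a partial or complete earlier run.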

      Version-Release number of selected component (if applicable):

      4.18 / main; the issue also exists in prior versions

      How reproducible:

      always    

      Steps to Reproduce:

          1. Inject a crash in the bootstrap init container: https://github.com/openshift/hypershift/blob/99c34c1b6904448fb065cd65c7c12545f04fb7c9/control-plane-operator/controllers/hostedcontrolplane/cvo/reconcile.go#L353 
      
          2. On restart, the preceding init container "prepare-payload" crash loops because its "rm" calls fail: the previous invocation already deleted all manifests.
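
The restart behaviour in the steps above can be approximated locally without a cluster. The sketch below is an assumption-laden stand-in, not the actual script from reconcile.go: prepare_payload, simulate_restart, and the files a.yaml and b.yaml are hypothetical names. It runs the same cleanup step twice against one directory, as happens when the init container is started again.

```shell
#!/bin/sh
# Hypothetical stand-in for the cleanup step of the "prepare-payload" init container.
# $1 is the payload directory; the remaining arguments are the removal command to use.
prepare_payload() {
    dir="$1"
    shift
    # Stop at the first failing removal, as a strict init script would.
    "$@" "$dir/a.yaml" 2>/dev/null || return 1
    "$@" "$dir/b.yaml" 2>/dev/null || return 1
}

# Run the cleanup twice against the same directory to mimic a container restart.
simulate_restart() {
    payload_dir="$(mktemp -d)"
    touch "$payload_dir/a.yaml" "$payload_dir/b.yaml"
    first=0
    prepare_payload "$payload_dir" "$@" || first=$?
    second=0
    prepare_payload "$payload_dir" "$@" || second=$?
    rm -rf "$payload_dir"
    echo "first=$first second=$second"
}

simulate_restart rm      # prints: first=0 second=1
simulate_restart rm -f   # prints: first=0 second=0
```

With plain "rm" the second run fails every time, which corresponds to the crash loop described in this report; with "rm -f" the second run is a no-op that succeeds.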
        
          

      Actual results:

      The prepare-payload init container crash loops indefinitely, which prevents the CVO container from ever starting.

      Expected results:

      A crashing init container should restart gracefully without getting stuck on file removal, so that the CVO container eventually runs.

      Additional info:

      Based on the work in https://github.com/openshift/hypershift/pull/5315

              sjenning Seth Jennings
              tjungblu@redhat.com Thomas Jungblut
              Jie Zhao Jie Zhao