  1. Red Hat Advanced Cluster Management
  2. ACM-22680

[2.13] config-policy-controller will go into a crash loop if it restarts while uninstalling

    • Bug
    • Resolution: Done
    • Major
    • ACM 2.13.4
    • ACM 2.13.0
    • GRC
    • Quality / Stability / Reliability
    • 1
    • False
    • False
    • GRC Sprint 2025-16
    • Moderate
    • -
    • None

      Description of problem:

      During the uninstallation of the config-policy-controller addon from a managed cluster (which can happen if the addon is disabled, or if the entire cluster is being removed), the config-policy-controller pod can go into a crash loop backoff, with logs indicating a nil pointer dereference in main.go.

      Usually, the pod runs successfully through the uninstallation process, but if it has to restart for some reason, it fails to start, and the uninstallation stalls and does not recover on its own. In hosted-mode scenarios, the pod may be more likely to restart, because the hosted cluster kubeconfig may change or be deleted, which triggers a restart. The issue only occurs if there are finalizers on ConfigurationPolicies, which are usually present only when the policies use `pruneObjectBehavior`.

      The issue can be worked around by manually removing any finalizers remaining on ConfigurationPolicies. Removing those finalizers is the last task of the config-policy-controller.
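
The workaround above can be sketched as a small script. This is a minimal sketch, not a supported tool: it assumes `kubectl` is on the PATH and pointed at the cluster that holds the ConfigurationPolicies, and the helper names (`policies_with_finalizers`, `finalizer_patch_command`, `clear_finalizers`) and the `dry_run` flag are hypothetical names chosen for illustration.

```python
import json
import shlex
import subprocess

# Fully qualified resource name for ConfigurationPolicy objects.
RESOURCE = "configurationpolicies.policy.open-cluster-management.io"


def policies_with_finalizers(policy_list):
    """Return (namespace, name) pairs for listed policies that still carry finalizers."""
    found = []
    for item in policy_list.get("items", []):
        meta = item.get("metadata", {})
        if meta.get("finalizers"):
            found.append((meta.get("namespace", ""), meta["name"]))
    return found


def finalizer_patch_command(namespace, name):
    """Build the kubectl command that clears every finalizer on one policy."""
    # A JSON merge patch with a null value removes the field entirely.
    patch = json.dumps({"metadata": {"finalizers": None}})
    return ["kubectl", "patch", RESOURCE, name, "--namespace", namespace,
            "--type", "merge", "--patch", patch]


def clear_finalizers(dry_run=True):
    """List all ConfigurationPolicies and clear their finalizers (hypothetical helper).

    With dry_run=True the patch commands are only printed, not executed.
    """
    listing = subprocess.run(
        ["kubectl", "get", RESOURCE, "--all-namespaces", "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    for namespace, name in policies_with_finalizers(json.loads(listing)):
        cmd = finalizer_patch_command(namespace, name)
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

Calling `clear_finalizers()` prints the patch commands without running them; `clear_finalizers(dry_run=False)` applies them. Removing a finalizer by hand skips whatever cleanup the controller would have done for that policy, so it is worth reviewing the dry-run output first.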

      Version-Release number of selected component (if applicable):

      2.14, 2.13, 2.12

      How reproducible:

      Unknown. This requires very specific timing: the controller must restart at the right moment during the uninstallation to trigger it. It seems to be more common in the hosted mode scenario.

      Steps to Reproduce:

      1. Uninstall the config-policy-controller
      2. Delete the config-policy-controller pod after the uninstall has begun

      Actual results:

      The config-policy-controller pod goes into a CrashLoopBackOff status, and never recovers. The uninstallation gets stuck.
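
To confirm this symptom on a cluster, one option is to inspect the pod's container statuses. The sketch below assumes the pod object was saved as JSON (for example from `kubectl get pod <pod-name> -o json`); the function name `crash_looping_containers` is hypothetical and only illustrates where Kubernetes reports the CrashLoopBackOff reason.

```python
def crash_looping_containers(pod):
    """Return names of containers in a Pod object that are waiting in CrashLoopBackOff."""
    names = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        # A container has exactly one of running/waiting/terminated set in "state".
        waiting = status.get("state", {}).get("waiting") or {}
        if waiting.get("reason") == "CrashLoopBackOff":
            names.append(status["name"])
    return names
```

A non-empty result for the config-policy-controller pod while the addon is being uninstalled matches the behavior described here.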

      Expected results:

      The uninstallation should succeed without intervention.

      Additional info:

              jkulikau@redhat.com Justin Kulikauskas
              jkulikau@redhat.com Justin Kulikauskas
              Derek Ho Derek Ho