Description of problem:
When the config-policy-controller addon is uninstalled, a pod called `config-policy-controller-uninstall` is deployed, which sets the `policy.open-cluster-management.io/uninstalling` annotation to `"true"`. As part of the fix for ACM-6941, which was prompted by a misbehaving OADP version update, the governance-policy-addon-controller started setting the uninstalling annotation to `"false"` when deploying. It was supposed to set the annotation to `"true"` when the addon is being deleted/uninstalled, but the addon framework used by the governance-policy-addon-controller does not propagate the change to the ManifestWork while the addon is being uninstalled.
The uninstall can then stall because the ManifestWork from the governance-policy-addon-controller and the config-policy-controller-uninstall pod keep overwriting each other's value for the annotation.
This was noticed in the Service Delivery environment.
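The conflict described above can be illustrated with a toy model. This is only a sketch, not the controllers' actual code: the function names, the in-memory dictionary standing in for the annotated resource, and the strict alternation of the two writers are all assumptions made for illustration.

```python
# Toy model of two writers fighting over one annotation.
# All names except the annotation key are hypothetical.
ANNOTATION = "policy.open-cluster-management.io/uninstalling"


def uninstall_pod_write(resource):
    """Stand-in for the config-policy-controller-uninstall pod."""
    resource.setdefault("annotations", {})[ANNOTATION] = "true"


def manifestwork_apply(resource):
    """Stand-in for the ManifestWork re-applying its stale desired state."""
    resource.setdefault("annotations", {})[ANNOTATION] = "false"


def simulate(rounds):
    """Alternate the two writers and record the value after each write."""
    resource = {"annotations": {}}
    history = []
    for _ in range(rounds):
        uninstall_pod_write(resource)
        history.append(resource["annotations"][ANNOTATION])
        manifestwork_apply(resource)
        history.append(resource["annotations"][ANNOTATION])
    return history


if __name__ == "__main__":
    print(simulate(2))
```

In this model the annotation never settles on `"true"`, which matches the stalled uninstall reported here.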
Version-Release number of selected component (if applicable):
2.8.1
2.9.0
How reproducible:
Reproducible at times.
Steps to Reproduce:
1. Install the config-policy-controller addon.
2. Have at least one ConfigurationPolicy with pruneObjectBehavior set to a value other than None.
3. Delete the config-policy-controller addon.
4. If the config-policy-controller-uninstall pod runs and restarts every 5 minutes without completing successfully, the issue has been reproduced. To confirm, check that the ManifestWork overwrote the uninstalling annotation.
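The annotation check in step 4 could be scripted roughly as follows. This is a minimal sketch under stated assumptions: the report does not name the resource that carries the annotation, so the helper takes any Kubernetes object already exported as JSON (for example with `kubectl get ... -o json`), and the helper names are hypothetical.

```python
import json

ANNOTATION = "policy.open-cluster-management.io/uninstalling"


def uninstalling_value(resource):
    """Return the uninstalling annotation value, or None if it is absent."""
    metadata = resource.get("metadata") or {}
    annotations = metadata.get("annotations") or {}
    return annotations.get(ANNOTATION)


def was_overwritten(resource):
    """True if the annotation is not "true" while an uninstall is in progress."""
    return uninstalling_value(resource) != "true"


if __name__ == "__main__":
    # Hypothetical sample; in practice, load the JSON exported from the cluster.
    sample = json.loads(
        '{"metadata": {"annotations": {"%s": "false"}}}' % ANNOTATION
    )
    print(uninstalling_value(sample), was_overwritten(sample))
```

If the helper reports an overwrite while the uninstall pod keeps restarting, that is consistent with the behavior described in this report.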
Actual results:
Uninstalls can stall.
Expected results:
Uninstalls don't stall.
Additional info:
This probably also affects the governance-policy-framework addon.
Is cloned by:
ACM-9094 [2.9] Configuration Policy controller unexpectedly gets taken out of uninstall mode (Closed)
Links to:
RHSA-2023:125406 Red Hat Advanced Cluster Management 2.8.5 security and bug fix container updates