Red Hat Advanced Cluster Management / ACM-26284

Clash between policies to configure openshift-observability to run on infra nodes and rhacm observability

    • Type: Bug
    • Resolution: Done
    • Priority: Normal
    • ACM 2.14.0
    • Observability

      Description of problem:

      When a cluster is deployed with policies that configure openshift-monitoring to run on infra nodes, and observability is already enabled in RHACM, the policy can cause endless rollbacks of the openshift-monitoring configuration.

      Version-Release number of selected component (if applicable):

      2.14.0
      OCP 4.18; other versions are expected to have the same issue

      How reproducible:

      Observed in a customer environment

      Steps to Reproduce:

      1. Define the policies to be applied when OCP is deployed with RHACM
      2. Deploy OCP
      3. ...

      Actual results:

      ```
      2025-10-21T18:30:40.133335362Z 2025-10-21T18:30:40.133Z INFO controllers.ObservabilityAddon.cmoWatcher Detected excessive reconciliations triggered by CMO configurations, potentially resulting from reconciliation conflicts between operators. Degrading the addon status. {"request": "openshift-monitoring/cluster-monitoring-config"}
      ```

      Expected results:

      No clash between the policy-enforced configuration and RHACM observability when observability is enabled.

      Additional info:

      The configuration policy enforces the following contents:

                          enableUserWorkload: true
                          alertmanagerMain:
                            volumeClaimTemplate:
                              spec:
                                storageClassName: samplestorageclass
                                volumeMode: Filesystem
                                resources:
                                  requests:
                                    storage: 5Gi
                            nodeSelector:
                              node-role.kubernetes.io/infra: ""
                          prometheusK8s:
                            volumeClaimTemplate:
                              spec:
                                storageClassName: samplestorageclass
                                volumeMode: Filesystem
                                resources:
                                  requests:
                                    storage: 300Gi
                            nodeSelector:
                              node-role.kubernetes.io/infra: ""
                          prometheusOperator:
                            nodeSelector:
                              node-role.kubernetes.io/infra: ""
                          k8sPrometheusAdapter:
                            nodeSelector:
                              node-role.kubernetes.io/infra: ""
                          kubeStateMetrics:
                            nodeSelector:
                              node-role.kubernetes.io/infra: ""
                          telemeterClient:
                            nodeSelector:
                              node-role.kubernetes.io/infra: ""
                          openshiftStateMetrics:
                            nodeSelector:
                              node-role.kubernetes.io/infra: ""
                          thanosQuerier:
                            nodeSelector:
                              node-role.kubernetes.io/infra: ""
      

      This is enough to cause the issue. Another policy is in use, but only this one affects the ConfigMap. Prior versions of RHACM are also likely to behave the same way.
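
      For context, here is a minimal sketch of how a fragment like the one above might be wrapped in a ConfigurationPolicy. The ConfigMap name and namespace are taken from the log line in the actual results. The policy name, `severity`, `complianceType: musthave` and `remediationAction: enforce` are assumptions for illustration, since the full policy is not included in this report. Only part of the fragment is repeated. In this layout the whole monitoring configuration is a single string under `config.yaml`, so an enforcing policy would likely treat any change another operator makes to that string as drift and revert it, which would be consistent with the repeated reconciliations reported by the cmoWatcher.

      ```
      # Hypothetical sketch, not the customer's actual policy.
      apiVersion: policy.open-cluster-management.io/v1
      kind: ConfigurationPolicy
      metadata:
        name: monitoring-infra-nodes        # assumed name, not from the report
      spec:
        remediationAction: enforce          # assumed: the report says the policy "enforces" the contents
        severity: low                       # assumed value
        object-templates:
          - complianceType: musthave        # assumed compliance type
            objectDefinition:
              apiVersion: v1
              kind: ConfigMap
              metadata:
                name: cluster-monitoring-config
                namespace: openshift-monitoring
              data:
                # The entire monitoring configuration is one string value,
                # so it is compared and restored as a whole.
                config.yaml: |
                  enableUserWorkload: true
                  prometheusK8s:
                    nodeSelector:
                      node-role.kubernetes.io/infra: ""
                  prometheusOperator:
                    nodeSelector:
                      node-role.kubernetes.io/infra: ""
      ```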

              Assignee: Unassigned
              Reporter: Felix Dewaleyne (rhn-support-fdewaley)