  1. OpenShift Bugs
  2. OCPBUGS-39417

[CEE.neXT]PrometheusOperatorRejectedResources alert after upgrading compliance operator to 1.5.1

    • Type: Bug
    • Resolution: Done-Errata
    • Priority: Critical
    • None
    • 4.15.z, 4.16.z
    • Compliance Operator
    • Important
    • Yes
    • False

      Description of problem:

      The PrometheusOperatorRejectedResources alert starts firing after the Compliance Operator is upgraded from 1.5.0 to 1.5.1.

      The following log entry appears in the prometheus-operator pod in the openshift-monitoring project:

       

      2024-09-03T07:48:11.880185599+07:00 level=warn ts=2024-09-03T00:48:11.880124271Z caller=resource_selector.go:174 component=prometheusoperator msg="skipping servicemonitor" error="failed to get authorization token of type Bearer: failed to get token from secret: key \"token\" in secret \"compliance-operator-dockercfg-hpx5s\" not found" servicemonitor=openshift-compliance/metrics namespace=openshift-monitoring prometheus=k8s 

       

       

      YAML of servicemonitor named "metrics" when Compliance Operator 1.5.0 is used:

       

      spec:
        endpoints:
        - port: metrics
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          path: /metrics-co
          port: metrics-co
          scheme: https
          tlsConfig:
            ca: {}
            caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
            cert: {}
            serverName: metrics.openshift-compliance.svc
        namespaceSelector: {}
        selector:
          matchLabels:
            name: compliance-operator 

       

       

      YAML of servicemonitor named "metrics" after upgrading Compliance Operator to 1.5.1: 

       

      spec:
        endpoints:
        - port: metrics
        - authorization:
            credentials:
              key: token
              name: compliance-operator-dockercfg-hpx5s
            type: Bearer
          path: /metrics-co
          port: metrics-co
          scheme: https
          tlsConfig:
            ca: {}
            caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
            cert: {}
            serverName: metrics.openshift-compliance.svc
        namespaceSelector: {}
        selector:
          matchLabels:
            name: compliance-operator 

       

      The secret named compliance-operator-dockercfg-hpx5s contains a key named .dockercfg, not token. The authorization section expects a secret that contains a token key, e.g. a secret named like compliance-operator-token-* which holds the actual token.
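
      To make the mismatch concrete, here is a minimal Python sketch of the check that fails. It is not the prometheus-operator code; the function name find_broken_credentials and the in-memory dictionaries are illustrative assumptions that mirror the 1.5.1 ServiceMonitor and the secret described above.

```python
from typing import Any, Dict, List, Tuple


def find_broken_credentials(
    service_monitor_spec: Dict[str, Any],
    secrets: Dict[str, Dict[str, str]],
) -> List[Tuple[str, str, str]]:
    """Return (port, secret_name, key) for every endpoint whose
    authorization.credentials names a key the secret does not contain."""
    broken = []
    for endpoint in service_monitor_spec.get("endpoints", []):
        creds = endpoint.get("authorization", {}).get("credentials")
        if not creds:
            continue  # endpoint does not use secret-based authorization
        name, key = creds["name"], creds["key"]
        if key not in secrets.get(name, {}):
            broken.append((endpoint.get("port", ""), name, key))
    return broken


# Hypothetical in-memory copies of the objects quoted in this report.
spec_151 = {
    "endpoints": [
        {"port": "metrics"},
        {
            "authorization": {
                "credentials": {
                    "key": "token",
                    "name": "compliance-operator-dockercfg-hpx5s",
                },
                "type": "Bearer",
            },
            "path": "/metrics-co",
            "port": "metrics-co",
            "scheme": "https",
        },
    ]
}
# The dockercfg secret has a .dockercfg key but no token key.
secrets = {"compliance-operator-dockercfg-hpx5s": {".dockercfg": "<redacted>"}}

print(find_broken_credentials(spec_151, secrets))
# → [('metrics-co', 'compliance-operator-dockercfg-hpx5s', 'token')]
```

      The metrics-co endpoint is the one reported, which matches the "key \"token\" in secret ... not found" message in the prometheus-operator log.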

       

      Version-Release number of selected component (if applicable):

      Compliance Operator 1.5.1

      How reproducible:

      100%

      Steps to Reproduce:

      1. Install Compliance Operator 1.5.0 and wait for the installation to complete.
      2. Upgrade Compliance Operator to 1.5.1.
      3. Wait a couple of minutes, then navigate to "Observe > Alerts". The PrometheusOperatorRejectedResources alert appears in the Pending state and fires a few minutes later.

      Actual results:

      Right after upgrading Compliance Operator to version 1.5.1, the PrometheusOperatorRejectedResources alert fires because the servicemonitor contains incorrect token information.

      Expected results:

      The servicemonitor named "metrics" should reference the secret that holds the actual token. The standard naming convention is compliance-operator-token-*

      Additional info:

      The issue can be fixed by updating the servicemonitor to use compliance-operator-token-*.
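
      A minimal Python sketch of that workaround, operating on an in-memory copy of the servicemonitor spec rather than on the cluster. The function use_token_secret and the secret name compliance-operator-token-abcde are hypothetical; the real suffix is cluster-specific, and this is not necessarily how the fix in the advisory was implemented.

```python
import copy
from typing import Any, Dict, List

TOKEN_PREFIX = "compliance-operator-token-"


def use_token_secret(spec: Dict[str, Any], secret_names: List[str]) -> Dict[str, Any]:
    """Return a copy of spec whose authorization credentials point at the
    first secret (in sorted order) matching compliance-operator-token-*."""
    candidates = sorted(n for n in secret_names if n.startswith(TOKEN_PREFIX))
    if not candidates:
        raise LookupError("no secret matching " + TOKEN_PREFIX + "* found")
    fixed = copy.deepcopy(spec)  # leave the caller's spec untouched
    for endpoint in fixed.get("endpoints", []):
        creds = endpoint.get("authorization", {}).get("credentials")
        if creds is not None:
            creds["name"] = candidates[0]
    return fixed


# Hypothetical input: the 1.5.1 endpoint from this report, trimmed.
spec_151 = {
    "endpoints": [
        {"port": "metrics"},
        {
            "authorization": {
                "credentials": {
                    "key": "token",
                    "name": "compliance-operator-dockercfg-hpx5s",
                },
                "type": "Bearer",
            },
            "port": "metrics-co",
        },
    ]
}
# Hypothetical secret names in the openshift-compliance namespace.
names = ["compliance-operator-dockercfg-hpx5s", "compliance-operator-token-abcde"]

fixed = use_token_secret(spec_151, names)
print(fixed["endpoints"][1]["authorization"]["credentials"]["name"])
# → compliance-operator-token-abcde
```

      If no compliance-operator-token-* secret exists in the namespace, the sketch raises LookupError instead of guessing a name.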


            Errata Tool added a comment -

            Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

            For information on the advisory (OpenShift Compliance Operator 1.6.0), and where to find the updated files, follow the link below.

            If the solution does not work for you, open a new bug report.
            https://access.redhat.com/errata/RHBA-2024:6761


            GitLab CEE Bot added a comment -

            CPaaS Service Account mentioned this issue in merge request !252 of isc-devel / Openshift Compliance Midstream on branch rhaos-4.12-rhel-8_upstream_ffa90047fc2e390dc6fedee1b6a0e452: Updated 2 upstream sources

            Lance Bragstad added a comment -

            I was able to reproduce this on a fresh cluster with a new install of CO 1.5.1, without the upgrade case.

              wenshen@redhat.com Vincent Shen
              rhn-support-dgautam Dhruv Gautam
              Xiaojie Yuan Xiaojie Yuan