OpenShift Bugs / OCPBUGS-39417

[CEE.neXT]PrometheusOperatorRejectedResources alert after upgrading compliance operator to 1.5.1

    • Type: Bug
    • Resolution: Done-Errata
    • Priority: Critical
    • None
    • 4.15.z, 4.16.z
    • Compliance Operator
    • Important
    • Yes
    • False

      Description of problem:

      The PrometheusOperatorRejectedResources alert starts firing after the Compliance Operator is upgraded from 1.5.0 to 1.5.1.

      The following log message appears in the prometheus-operator pod in the openshift-monitoring project:

       

      2024-09-03T07:48:11.880185599+07:00 level=warn ts=2024-09-03T00:48:11.880124271Z caller=resource_selector.go:174 component=prometheusoperator msg="skipping servicemonitor" error="failed to get authorization token of type Bearer: failed to get token from secret: key \"token\" in secret \"compliance-operator-dockercfg-hpx5s\" not found" servicemonitor=openshift-compliance/metrics namespace=openshift-monitoring prometheus=k8s 

       

       

      YAML of the ServiceMonitor named "metrics" with Compliance Operator 1.5.0:

       

      spec:
        endpoints:
        - port: metrics
        - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
          path: /metrics-co
          port: metrics-co
          scheme: https
          tlsConfig:
            ca: {}
            caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
            cert: {}
            serverName: metrics.openshift-compliance.svc
        namespaceSelector: {}
        selector:
          matchLabels:
            name: compliance-operator 

       

       

      YAML of the ServiceMonitor named "metrics" after upgrading the Compliance Operator to 1.5.1:

       

      spec:
        endpoints:
        - port: metrics
        - authorization:
            credentials:
              key: token
              name: compliance-operator-dockercfg-hpx5s
            type: Bearer
          path: /metrics-co
          port: metrics-co
          scheme: https
          tlsConfig:
            ca: {}
            caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
            cert: {}
            serverName: metrics.openshift-compliance.svc
        namespaceSelector: {}
        selector:
          matchLabels:
            name: compliance-operator 

       

      The secret named compliance-operator-dockercfg-hpx5s contains only a key named .dockercfg. The authorization section expects a secret that contains a "token" key, for example a secret named compliance-operator-token-*, which holds the actual token.
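
      To illustrate the mismatch, the two secret shapes differ roughly as sketched below. This is not output from the affected cluster: the secret types shown are assumptions based on the usual Kubernetes conventions, the data values are elided, and the suffix of the token secret is left as a placeholder.

      ```yaml
      # Secret currently referenced by the ServiceMonitor.
      # It carries a ".dockercfg" key but no "token" key, so the lookup fails.
      apiVersion: v1
      kind: Secret
      metadata:
        name: compliance-operator-dockercfg-hpx5s
        namespace: openshift-compliance
      type: kubernetes.io/dockercfg          # assumed type
      data:
        .dockercfg: <elided>
      ---
      # Shape the authorization section expects: a secret with a "token" key.
      apiVersion: v1
      kind: Secret
      metadata:
        name: compliance-operator-token-<suffix>   # placeholder, suffix not known
        namespace: openshift-compliance
      type: kubernetes.io/service-account-token    # assumed type
      data:
        token: <elided>
      ```

      The authorization block in the ServiceMonitor names both a secret and a key, so the referenced secret must actually contain that key; a matching secret name alone is not enough.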

       

      Version-Release number of selected component (if applicable):

      Compliance Operator 1.5.1

      How reproducible:

      100%

      Steps to Reproduce:

      1. Install Compliance Operator 1.5.0 and wait for the installation to complete.
      2. Upgrade the Compliance Operator to 1.5.1.
      3. Wait a couple of minutes, then navigate to "Observe > Alerts". The PrometheusOperatorRejectedResources alert appears in the Pending state and fires a few minutes later.

      Actual results:

      Right after the Compliance Operator is upgraded to version 1.5.1, the PrometheusOperatorRejectedResources alert fires because the ServiceMonitor contains incorrect token information.

      Expected results:

      The ServiceMonitor "metrics" should reference the secret that holds the actual token. The standard naming convention is compliance-operator-token-*.

      Additional info:

      The issue can be fixed by updating the ServiceMonitor to reference the compliance-operator-token-* secret.
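
      A minimal sketch of that workaround, based on the 1.5.1 ServiceMonitor spec quoted in the description. Only the secret name changes; the `<suffix>` placeholder stands for the generated suffix of the token secret in the cluster, which this report does not give. This illustrates the manual change described here and is not necessarily the fix shipped in the errata.

      ```yaml
      spec:
        endpoints:
        - port: metrics
        - authorization:
            credentials:
              key: token
              # placeholder name; was compliance-operator-dockercfg-hpx5s
              name: compliance-operator-token-<suffix>
            type: Bearer
          path: /metrics-co
          port: metrics-co
          scheme: https
          tlsConfig:
            ca: {}
            caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
            cert: {}
            serverName: metrics.openshift-compliance.svc
        namespaceSelector: {}
        selector:
          matchLabels:
            name: compliance-operator
      ```

      Because the ServiceMonitor is managed by the operator, a manual edit like this may be reverted when the operator reconciles the resource.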

            wenshen@redhat.com Vincent Shen
            rhn-support-dgautam Dhruv Gautam
            Xiaojie Yuan Xiaojie Yuan