- Bug
- Resolution: Done-Errata
- Critical
- None
- 4.15.z, 4.16.z
Description of problem:
The PrometheusOperatorRejectedResources alert starts firing after the Compliance Operator is upgraded from 1.5.0 to 1.5.1.
The following log message is observed in the prometheus-operator pod in the openshift-monitoring project:
2024-09-03T07:48:11.880185599+07:00 level=warn ts=2024-09-03T00:48:11.880124271Z caller=resource_selector.go:174 component=prometheusoperator msg="skipping servicemonitor" error="failed to get authorization token of type Bearer: failed to get token from secret: key \"token\" in secret \"compliance-operator-dockercfg-hpx5s\" not found" servicemonitor=openshift-compliance/metrics namespace=openshift-monitoring prometheus=k8s
YAML of servicemonitor named "metrics" when Compliance Operator 1.5.0 is used:
spec:
  endpoints:
  - port: metrics
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    path: /metrics-co
    port: metrics-co
    scheme: https
    tlsConfig:
      ca: {}
      caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
      cert: {}
      serverName: metrics.openshift-compliance.svc
  namespaceSelector: {}
  selector:
    matchLabels:
      name: compliance-operator
YAML of servicemonitor named "metrics" after upgrading Compliance Operator to 1.5.1:
spec:
  endpoints:
  - port: metrics
  - authorization:
      credentials:
        key: token
        name: compliance-operator-dockercfg-hpx5s
      type: Bearer
    path: /metrics-co
    port: metrics-co
    scheme: https
    tlsConfig:
      ca: {}
      caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
      cert: {}
      serverName: metrics.openshift-compliance.svc
  namespaceSelector: {}
  selector:
    matchLabels:
      name: compliance-operator
The secret named compliance-operator-dockercfg-hpx5s contains only a key named .dockercfg. The authorization section expects a secret that contains a key named token, for example a secret named like compliance-operator-token-*, which holds the actual token.
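The mismatch described above can be illustrated with a minimal Python sketch. This is not the prometheus-operator implementation; it only mimics the lookup that fails. It assumes the ServiceMonitor spec and the cluster secrets have already been loaded into plain dictionaries. The function name find_unresolvable_credentials and the secrets mapping are hypothetical and exist only for this illustration.

```python
def find_unresolvable_credentials(spec, secrets):
    """Return (secret_name, key) pairs that endpoints reference but that do not exist.

    spec    -- the ServiceMonitor "spec" section as a dict
    secrets -- hypothetical mapping: secret name -> set of keys stored in that secret
    """
    missing = []
    for endpoint in spec.get("endpoints", []):
        creds = endpoint.get("authorization", {}).get("credentials")
        if not creds:
            continue  # endpoint does not use a secret-based credential
        name, key = creds["name"], creds["key"]
        if key not in secrets.get(name, set()):
            missing.append((name, key))
    return missing


# Values taken from this report: the 1.5.1 ServiceMonitor and the dockercfg secret.
spec_1_5_1 = {
    "endpoints": [
        {"port": "metrics"},
        {
            "authorization": {
                "credentials": {
                    "key": "token",
                    "name": "compliance-operator-dockercfg-hpx5s",
                },
                "type": "Bearer",
            },
            "path": "/metrics-co",
            "port": "metrics-co",
            "scheme": "https",
        },
    ]
}
secrets = {"compliance-operator-dockercfg-hpx5s": {".dockercfg"}}

# Prints [('compliance-operator-dockercfg-hpx5s', 'token')]
print(find_unresolvable_credentials(spec_1_5_1, secrets))
```

The single reported pair corresponds to the "key \"token\" in secret \"compliance-operator-dockercfg-hpx5s\" not found" message in the log above.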
Version-Release number of selected component (if applicable):
Compliance Operator 1.5.1
How reproducible:
100%
Steps to Reproduce:
- Install Compliance Operator 1.5.0 and wait for the installation to complete.
- Upgrade the Compliance Operator to 1.5.1.
- Wait a couple of minutes, then navigate to "Observe > Alerts". The PrometheusOperatorRejectedResources alert appears in the Pending state and fires a few minutes later.
Actual results:
Right after the Compliance Operator is upgraded to version 1.5.1, the PrometheusOperatorRejectedResources alert fires because the servicemonitor contains incorrect token information.
Expected results:
The servicemonitor named "metrics" should reference the secret that holds the actual token. The standard naming convention is compliance-operator-token-*.
Additional info:
The issue can be fixed by updating the servicemonitor to use compliance-operator-token-*.
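As a rough sketch of that workaround, the Python snippet below rewrites the credential reference in a ServiceMonitor spec so that it points at a secret matching compliance-operator-token-* that actually contains a token key. It works on plain dictionaries rather than a live cluster. The function name use_token_secret and the secret name compliance-operator-token-abcde are hypothetical, since the real suffix is cluster-specific. How the operator itself was corrected in the advisory is not described in this report.

```python
import copy


def use_token_secret(spec, secrets, prefix="compliance-operator-token-", key="token"):
    """Return a copy of spec whose bearer credentials reference a token secret.

    secrets -- hypothetical mapping: secret name -> set of keys stored in that secret
    """
    # Only secrets that follow the naming convention AND hold the key qualify.
    candidates = sorted(
        name for name, keys in secrets.items()
        if name.startswith(prefix) and key in keys
    )
    if not candidates:
        raise LookupError(f"no secret named {prefix}* contains key {key!r}")

    patched = copy.deepcopy(spec)  # leave the caller's spec untouched
    for endpoint in patched.get("endpoints", []):
        creds = endpoint.get("authorization", {}).get("credentials")
        if creds and creds.get("key") == key:
            creds["name"] = candidates[0]
    return patched


# The broken reference from this report, plus an assumed token secret.
broken_spec = {
    "endpoints": [
        {"port": "metrics"},
        {
            "authorization": {
                "credentials": {
                    "key": "token",
                    "name": "compliance-operator-dockercfg-hpx5s",
                },
                "type": "Bearer",
            },
            "path": "/metrics-co",
            "port": "metrics-co",
        },
    ]
}
cluster_secrets = {
    "compliance-operator-dockercfg-hpx5s": {".dockercfg"},
    "compliance-operator-token-abcde": {"token"},  # hypothetical name
}

fixed_spec = use_token_secret(broken_spec, cluster_secrets)
print(fixed_spec["endpoints"][1]["authorization"]["credentials"]["name"])
```

Raising LookupError when no suitable secret exists keeps the sketch from silently leaving the broken reference in place.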
- links to RHBA-2024:138712 OpenShift Compliance Operator 1.6.0
Since the problem described in this issue should be resolved in a recent advisory, it has been closed.
For information on the advisory (OpenShift Compliance Operator 1.6.0), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2024:6761