Bug
Resolution: Done-Errata
Critical
None
4.15.z, 4.16.z
Description of problem:
The PrometheusOperatorRejectedResources alert starts firing after the Compliance Operator is upgraded from 1.5.0 to 1.5.1.
The following log is observed in the prometheus-operator pod in the openshift-monitoring project:
2024-09-03T07:48:11.880185599+07:00 level=warn ts=2024-09-03T00:48:11.880124271Z caller=resource_selector.go:174 component=prometheusoperator msg="skipping servicemonitor" error="failed to get authorization token of type Bearer: failed to get token from secret: key \"token\" in secret \"compliance-operator-dockercfg-hpx5s\" not found" servicemonitor=openshift-compliance/metrics namespace=openshift-monitoring prometheus=k8s
YAML of servicemonitor named "metrics" when Compliance Operator 1.5.0 is used:
spec:
  endpoints:
  - port: metrics
  - bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
    path: /metrics-co
    port: metrics-co
    scheme: https
    tlsConfig:
      ca: {}
      caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
      cert: {}
      serverName: metrics.openshift-compliance.svc
  namespaceSelector: {}
  selector:
    matchLabels:
      name: compliance-operator
YAML of servicemonitor named "metrics" after upgrading Compliance Operator to 1.5.1:
spec:
  endpoints:
  - port: metrics
  - authorization:
      credentials:
        key: token
        name: compliance-operator-dockercfg-hpx5s
      type: Bearer
    path: /metrics-co
    port: metrics-co
    scheme: https
    tlsConfig:
      ca: {}
      caFile: /etc/prometheus/configmaps/serving-certs-ca-bundle/service-ca.crt
      cert: {}
      serverName: metrics.openshift-compliance.svc
  namespaceSelector: {}
  selector:
    matchLabels:
      name: compliance-operator
The secret named compliance-operator-dockercfg-hpx5s contains a key named .dockercfg, not token. The authorization section expects a secret that contains a key named token, e.g. a secret named compliance-operator-token-*, which holds the actual token.
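The mismatch described above can be illustrated with a small Python sketch. It is not how prometheus-operator implements the check; it only mirrors the condition from the log message (the referenced key must exist in the referenced secret). The function name `missing_token_refs` and the plain-dict representation of the ServiceMonitor and secrets are assumptions made for illustration, and the secret value is a placeholder.

```python
def missing_token_refs(service_monitor, secrets):
    """Return (secret_name, key) pairs that endpoints reference but the secrets lack.

    service_monitor: dict shaped like the ServiceMonitor YAML in this report.
    secrets: dict mapping secret name -> dict of that secret's data keys.
    Both are illustrative stand-ins for objects fetched from the cluster.
    """
    problems = []
    for endpoint in service_monitor.get("spec", {}).get("endpoints", []):
        creds = endpoint.get("authorization", {}).get("credentials")
        if not creds:
            continue  # endpoint has no secret-based authorization
        name, key = creds["name"], creds["key"]
        if key not in secrets.get(name, {}):
            problems.append((name, key))
    return problems


# ServiceMonitor fragment as reported after the upgrade to 1.5.1.
service_monitor = {
    "spec": {
        "endpoints": [
            {"port": "metrics"},
            {
                "authorization": {
                    "credentials": {
                        "key": "token",
                        "name": "compliance-operator-dockercfg-hpx5s",
                    },
                    "type": "Bearer",
                },
                "path": "/metrics-co",
                "port": "metrics-co",
            },
        ]
    }
}

# The dockercfg secret holds a .dockercfg key (placeholder value), no token key.
secrets = {"compliance-operator-dockercfg-hpx5s": {".dockercfg": "<redacted>"}}

print(missing_token_refs(service_monitor, secrets))
# [('compliance-operator-dockercfg-hpx5s', 'token')]
```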
Version-Release number of selected component (if applicable):
Compliance Operator 1.5.1
How reproducible:
100%
Steps to Reproduce:
- Install Compliance Operator 1.5.0 and wait for the installation to complete.
- Upgrade Compliance Operator to 1.5.1.
- Wait a couple of minutes, then navigate to "Observe > Alerts". The PrometheusOperatorRejectedResources alert appears in Pending state and fires a few minutes later.
Actual results:
Right after upgrading Compliance Operator to version 1.5.1, the PrometheusOperatorRejectedResources alert fires because the servicemonitor contains incorrect token information.
Expected results:
The servicemonitor named "metrics" should reference the secret that holds the actual token. The standard naming convention is compliance-operator-token-*.
Additional info:
The issue can be fixed by updating the servicemonitor to use compliance-operator-token-*.
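As a rough sketch of the suggested fix, the Python below builds a JSON Patch (RFC 6902) that points the affected endpoint at a secret that actually holds a token key. The helper name `build_token_secret_patch`, the dict inputs, and the secret suffix `abcde` are hypothetical; the real suffix has to be looked up in the cluster. Whether the operator later reconciles a manual change back is not covered by this report.

```python
import json


def build_token_secret_patch(service_monitor, secrets, key="token"):
    """Build JSON Patch operations that repoint broken credential references.

    Picks a secret named compliance-operator-token-* that contains `key` and
    emits one "replace" operation per endpoint whose referenced secret lacks it.
    Returns an empty list when no suitable secret is available.
    """
    candidates = sorted(
        name
        for name, data in secrets.items()
        if name.startswith("compliance-operator-token-") and key in data
    )
    if not candidates:
        return []
    target = candidates[0]

    patch = []
    for index, endpoint in enumerate(service_monitor["spec"]["endpoints"]):
        creds = endpoint.get("authorization", {}).get("credentials")
        if creds and key not in secrets.get(creds["name"], {}):
            patch.append(
                {
                    "op": "replace",
                    "path": f"/spec/endpoints/{index}/authorization/credentials/name",
                    "value": target,
                }
            )
    return patch


# Illustrative inputs; the "-abcde" suffix and secret values are placeholders.
service_monitor = {
    "spec": {
        "endpoints": [
            {"port": "metrics"},
            {
                "authorization": {
                    "credentials": {
                        "key": "token",
                        "name": "compliance-operator-dockercfg-hpx5s",
                    },
                    "type": "Bearer",
                },
                "port": "metrics-co",
            },
        ]
    }
}
secrets = {
    "compliance-operator-dockercfg-hpx5s": {".dockercfg": "<redacted>"},
    "compliance-operator-token-abcde": {"token": "<redacted>"},
}

patch = build_token_secret_patch(service_monitor, secrets)
print(json.dumps(patch, indent=2))
```

The resulting JSON could then be passed to a JSON-type patch of the servicemonitor (for example via `oc patch --type=json`), after confirming the secret name on the affected cluster.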
Links to: RHBA-2024:138712 OpenShift Compliance Operator 1.6.0