Bug
Resolution: Unresolved
Major
4.16.z
None
Description of problem
In the cluster audit logs, the customer observes more than 3.6 million GET requests over a 24-hour window for one particular Secret, all issued by "system:serviceaccount:openshift-user-workload-monitoring:prometheus-operator". This translates to roughly 1,250 GET requests every 30 seconds (about 42 per second).
In parallel, the operator logs show regular "sync alertmanager" messages roughly every 30 to 45 seconds, but we are not sure whether this is related:
level=info ts=2025-05-12T12:49:21.739161413Z caller=operator.go:572 component=alertmanager-controller key=openshift-user-workload-monitoring/user-workload msg="sync alertmanager"
level=info ts=2025-05-12T12:49:25.785042318Z caller=operator.go:471 component=thanos-controller key=openshift-user-workload-monitoring/user-workload msg="sync thanos-ruler"
level=info ts=2025-05-12T12:50:04.752668355Z caller=operator.go:572 component=alertmanager-controller key=openshift-user-workload-monitoring/user-workload msg="sync alertmanager"
level=info ts=2025-05-12T12:50:49.552780038Z caller=operator.go:572 component=alertmanager-controller key=openshift-user-workload-monitoring/user-workload msg="sync alertmanager"
level=info ts=2025-05-12T12:51:23.461977084Z caller=operator.go:572 component=alertmanager-controller key=openshift-user-workload-monitoring/user-workload msg="sync alertmanager"
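As a quick sanity check of the request volume, the following minimal sketch converts the reported daily total into a per-second and a per-30-second rate. The only inputs are the two figures from the description (3.6 million requests, 24 hours); the variable names are illustrative.

```python
# Figures taken from the description: 3.6 million GETs observed over 24 hours.
total_requests = 3_600_000
window_seconds = 24 * 60 * 60  # 86,400 seconds in 24 hours

# Average request rate over the whole window.
per_second = total_requests / window_seconds
per_30_seconds = per_second * 30

print(f"{per_second:.1f} requests/s, {per_30_seconds:.0f} requests per 30 s")
```

This is an average over the full window; the actual traffic may be bursty around each sync.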
Describe the impact to you or the business
There is significant load placed on the Kubernetes API by the User Workload Monitoring Prometheus Operator, requiring the customer to assign more CPU than expected to the Master Nodes.
Version-Release number of selected component (if applicable)
OCP 4.16.36
How reproducible
Constant on the customer cluster
Steps to Reproduce
1. Enable User Workload Monitoring
2. Enable user-defined AlertmanagerConfig resources and reference a Secret in the configuration
3. Observe the API audit logs
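One possible way to carry out step 3 is to count the matching events in an exported audit log. The sketch below assumes the log is in the standard Kubernetes JSON-lines audit format (one audit.k8s.io/v1 Event per line, with the "verb", "stage", "user.username" and "objectRef" fields). The function name count_secret_gets is hypothetical, and only the "ResponseComplete" stage is counted so that a request logged at several stages is not counted twice.

```python
import json
from collections import Counter


def count_secret_gets(lines, username):
    """Count GET requests on Secrets, per (namespace, name), for one user.

    `lines` is any iterable of JSON-lines audit events, e.g. an open file.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        user = event.get("user") or {}
        ref = event.get("objectRef") or {}
        if (
            event.get("verb") == "get"
            and event.get("stage") == "ResponseComplete"
            and user.get("username") == username
            and ref.get("resource") == "secrets"
        ):
            counts[(ref.get("namespace"), ref.get("name"))] += 1
    return counts
```

Called with an open audit log file and the ServiceAccount name from the description, `count_secret_gets(fh, "system:serviceaccount:openshift-user-workload-monitoring:prometheus-operator").most_common(10)` would list the ten most frequently fetched Secrets.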
Actual results
For each AlertmanagerConfig, the operator issues a large number of API GET requests for the referenced Secrets
Expected results
Only a limited number of GET requests are made to the Kubernetes API
Additional info
- Logs and inspect files are available in the attached Support Case
- duplicates: OCPBUGS-27344 prometheus-operator triggering too many "updates" (Closed)
- is blocked by: OCPBUGS-61113 Excessive API calls by prometheus-operator ServiceAccount (Verified)
- links to