Description of problem:
"duplicate" PrometheusRule objects observed.
~~~
oc get -n openshift-monitoring prometheusrules.monitoring.coreos.com
NAME                              AGE
alertmanager-main-rules           2y149d
bis-custom-alertingrules-bd731a   153d
bis-custom-alertingrules-d4367e   509d
~~~
One of the duplicate objects has:
~~~
labels:
  app.kubernetes.io/component: alerting-rules-controller
  app.kubernetes.io/name: cluster-monitoring-operator
  app.kubernetes.io/part-of: openshift-monitoring
  app.kubernetes.io/version: 4.16.38
  prometheus: k8s
  role: alerting-rules
~~~
and the other one has:
~~~
labels:
  app.kubernetes.io/component: alerting-rules-controller
  app.kubernetes.io/name: cluster-monitoring-operator
  app.kubernetes.io/part-of: openshift-monitoring
  app.kubernetes.io/version: 4.14.29
  prometheus: k8s
  role: alerting-rules
~~~
Both objects have an ownerReference pointing to the same UID, 8a7ef0f2-db18-4c10-9a56-3df02e4885a7.
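As a way to double-check this, something along these lines prints each object's name, its version label, and the owner UID (a sketch; it assumes jq is available and reuses the object names from the output above):
~~~
# Sketch: list each bis-custom-alertingrules PrometheusRule with its
# version label and the UID of its owner (assumes jq is installed).
oc get prometheusrules -n openshift-monitoring -o json \
  | jq -r '.items[]
           | select(.metadata.name | startswith("bis-custom-alertingrules"))
           | [.metadata.name,
              .metadata.labels["app.kubernetes.io/version"],
              .metadata.ownerReferences[0].uid]
           | @tsv'
~~~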
The rules have changed over time, and the customer observes that the prometheus-k8s-rulefiles-0 ConfigMap, which is managed by the operator, contains two versions of some of those rules.
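The duplicate entries can be spotted by listing the ConfigMap's data keys, roughly as follows (a sketch; the exact key names follow the operator's own naming convention and are not reproduced here):
~~~
# Sketch: list the rule-file keys generated from the bis-custom-alertingrules
# objects; with both duplicate PrometheusRules present, two keys show up.
oc get configmap prometheus-k8s-rulefiles-0 -n openshift-monitoring -o json \
  | jq -r '.data | keys[]' \
  | grep bis-custom-alertingrules
~~~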
Version-Release number of selected component (if applicable):
4.16
How reproducible:
Unable to reproduce
Steps to Reproduce:
Actual results:
Example: the bis-PersistentVolumeUsageCritical alert is defined only once in the AlertingRule config, but it is found twice in the prometheus-k8s-rulefiles-0 ConfigMap. The order of the duplicate entries in that ConfigMap changes between clusters, so effectively some clusters use the new version of the rule while others still use the old one.
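A rough way to confirm the duplication on a given cluster is to count how often the alert name occurs in the generated rule files (a sketch; it assumes the name from this report appears verbatim in the ConfigMap data):
~~~
# Sketch: count occurrences of the alert name in the generated rule files;
# a count greater than expected indicates the stale duplicate is still present.
oc get configmap prometheus-k8s-rulefiles-0 -n openshift-monitoring -o yaml \
  | grep -c 'bis-PersistentVolumeUsageCritical'
~~~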
Expected results:
Updated rules should replace the existing rules rather than being duplicated.
Additional info: