Type: Bug
Resolution: Unresolved
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.16
Component/s: Monitoring
Labels:

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None
Story Points:
None
Severity:
Low
Regression:
None

Target Backport Versions:
None
Target Version:

4.22.0
Release Blocker:
None
Sprint:
MON Sprint 284
sprint_count:
1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
In Progress
Release Note Type:
Bug Fix
Release Note Text:
Corrects a regression from 4.15.0 that caused AlertingRule to create duplicate alerts.

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

Duplicate PrometheusRule objects are observed in the openshift-monitoring namespace:

~~~    
oc get -n openshift-monitoring prometheusrules.monitoring.coreos.com
NAME                                           AGE
alertmanager-main-rules                        2y149d
bis-custom-alertingrules-bd731a                153d
bis-custom-alertingrules-d4367e                509d
~~~

One has these labels:

~~~
labels:
    app.kubernetes.io/component: alerting-rules-controller
    app.kubernetes.io/name: cluster-monitoring-operator
    app.kubernetes.io/part-of: openshift-monitoring
    app.kubernetes.io/version: 4.16.38
    prometheus: k8s
    role: alerting-rules
~~~

The other has these labels:

~~~
labels:
    app.kubernetes.io/component: alerting-rules-controller
    app.kubernetes.io/name: cluster-monitoring-operator
    app.kubernetes.io/part-of: openshift-monitoring
    app.kubernetes.io/version: 4.14.29
    prometheus: k8s
    role: alerting-rules
~~~

Both objects have an ownerReferences entry pointing to the same UID, 8a7ef0f2-db18-4c10-9a56-3df02e4885a7.
The rules have changed over time, and the customer observes that the prometheus-k8s-rulefiles-0 configmap, which is managed by the operator, contains two versions of some of those rules.
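
The shared owner UID described above suggests a quick way to check other clusters: group the PrometheusRule objects by owner UID and flag any owner with more than one object. The following is a minimal sketch, not part of the operator. It assumes input shaped like the output of `oc get -n openshift-monitoring prometheusrules.monitoring.coreos.com -o json`; the function name `find_duplicate_owners` and the trimmed `sample` data are illustrative only.

```python
import json
from collections import defaultdict


def find_duplicate_owners(rule_list):
    """Return {owner_uid: [object names]} for owners with more than one PrometheusRule."""
    by_owner = defaultdict(list)
    for item in rule_list.get("items", []):
        meta = item.get("metadata", {})
        # Objects without ownerReferences are simply skipped.
        for ref in meta.get("ownerReferences", []):
            by_owner[ref["uid"]].append(meta["name"])
    return {uid: sorted(names) for uid, names in by_owner.items() if len(names) > 1}


# Illustrative, heavily trimmed sample built from the names in this report.
sample = {
    "items": [
        {"metadata": {"name": "alertmanager-main-rules"}},
        {
            "metadata": {
                "name": "bis-custom-alertingrules-bd731a",
                "ownerReferences": [{"uid": "8a7ef0f2-db18-4c10-9a56-3df02e4885a7"}],
            }
        },
        {
            "metadata": {
                "name": "bis-custom-alertingrules-d4367e",
                "ownerReferences": [{"uid": "8a7ef0f2-db18-4c10-9a56-3df02e4885a7"}],
            }
        },
    ]
}

duplicates = find_duplicate_owners(sample)
print(json.dumps(duplicates, indent=2))
```

On a live cluster, `sample` could be replaced with `json.load(sys.stdin)` and the `oc get ... -o json` output piped in.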

Version-Release number of selected component (if applicable):

 4.16

How reproducible:

   Unable to reproduce

Steps to Reproduce:

Actual results:

Example: the bis-PersistentVolumeUsageCritical alert definition.
It is defined only once in the AlertingRule configuration but appears twice in the prometheus-k8s-rulefiles-0 configmap.
The order of the duplicate entries in the prometheus-k8s-rulefiles-0 configmap differs between clusters, so some clusters effectively use the new version of the rule while others still use the old version.
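
To list which alerts are affected, the alert names can be compared across PrometheusRule objects. The sketch below is a hedged example working on the same `oc get ... -o json` shaped input; `find_duplicate_alerts` is a made-up helper name, and the `expr` values in the sample are placeholders, not the customer's real expressions. An alert name appearing in more than one object is a hint worth inspecting, not proof of this bug.

```python
from collections import defaultdict


def find_duplicate_alerts(rule_list):
    """Return {alert name: [PrometheusRule names]} for alerts defined in more than one object."""
    seen = defaultdict(set)
    for item in rule_list.get("items", []):
        name = item["metadata"]["name"]
        for group in item.get("spec", {}).get("groups", []):
            for rule in group.get("rules", []):
                if "alert" in rule:  # recording rules have no "alert" key
                    seen[rule["alert"]].add(name)
    return {alert: sorted(names) for alert, names in seen.items() if len(names) > 1}


# Illustrative sample; the expressions are placeholders.
sample_rules = {
    "items": [
        {
            "metadata": {"name": "bis-custom-alertingrules-bd731a"},
            "spec": {"groups": [{"name": "bis", "rules": [
                {"alert": "bis-PersistentVolumeUsageCritical", "expr": "new_expression"},
            ]}]},
        },
        {
            "metadata": {"name": "bis-custom-alertingrules-d4367e"},
            "spec": {"groups": [{"name": "bis", "rules": [
                {"alert": "bis-PersistentVolumeUsageCritical", "expr": "old_expression"},
            ]}]},
        },
    ]
}

duplicate_alerts = find_duplicate_alerts(sample_rules)
print(duplicate_alerts)
```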

Expected results:

Updated rules should replace the existing rules.

Additional info:

blocks

OCPBUGS-77271 Duplicate `prometheusrules.monitoring.coreos.com`

is cloned by

OCPBUGS-77271 Duplicate `prometheusrules.monitoring.coreos.com`

links to

openshift/cluster-monitoring-operator#2820: OCPBUGS-61262: AlertingRule: fix duplicate PrometheusRules after MD5->SHA-224 naming change

openshift/cluster-monitoring-operator#2829: OCPBUGS-61262: AlertingRule: fix duplicate PrometheusRules after MD5->SHA-224 naming change

Assignee: Ayoub Mrini

Reporter: Nigel Smith

Need Info From: None

Contributors: None

QA Contact: Junqi Zhao

Doc Contact: None

Votes: 0

Watchers: 4

Created: 2025/09/04 3:36 PM

Updated: 2026/02/25 6:25 PM
