Bug
Resolution: Unresolved
4.20
Upstream issue: https://github.com/prometheus/alertmanager/issues/4064
Two alerts have been firing for a long time, and an inhibition rule is configured so that one of the alerts inhibits the other.
What did you expect to see?
The Alertmanager does not send a notification for the inhibited alert if I restart/reload it.
What did you see instead? Under which circumstances?
The Alertmanager sent a notification for the alert, which should have been inhibited right away once it received the alert from Prometheus.
Reproducer (this inhibits across rule groups, which goes against best practice):
Configured against a user workload Alertmanager instance, mainly so that these alerts are the only thing in the logs.
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: user-workload-monitoring-config
  namespace: openshift-user-workload-monitoring
data:
  config.yaml: |
    alertmanager:
      enabled: true
      enableAlertmanagerConfig: true
      logLevel: debug
---
apiVersion: v1
kind: Namespace
metadata:
  name: ns1
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prometheus-example-app
  name: prometheus-example-app
  namespace: ns1
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus-example-app
  template:
    metadata:
      labels:
        app: prometheus-example-app
    spec:
      containers:
      - image: ghcr.io/rhobs/prometheus-example-app:0.4.1
        imagePullPolicy: IfNotPresent
        name: prometheus-example-app
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: prometheus-example-app
  name: prometheus-example-app
  namespace: ns1
spec:
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
    name: web
  selector:
    app: prometheus-example-app
  type: ClusterIP
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: prometheus-example-monitor
  name: prometheus-example-monitor
  namespace: ns1
spec:
  endpoints:
  - interval: 30s
    port: web
    scheme: http
  selector:
    matchLabels:
      app: prometheus-example-app
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-alert
  namespace: ns1
spec:
  groups:
  - name: Example_group_1
    rules:
    - alert: Inhibiting rule
      expr: version{job="prometheus-example-app"} > 0
      for: 5m
      labels:
        inhibit: "true"
      annotations:
        summary: "This is an inhibiting rule"
    - alert: Inhibited rule
      expr: version{job="prometheus-example-app"} > 0
      for: 5m
      labels:
        inhibited: "true"
      annotations:
        summary: "This is an inhibited rule"
  - name: Example group 2
    rules:
    - alert: Another inhibited rule
      expr: version{job="prometheus-example-app"} > 0
      for: 5m
      labels:
        inhibited: "true"
      annotations:
        summary: "This is another inhibited rule"
AlertmanagerConfig:
"global":
  "http_config":
    "proxy_from_environment": true
"inhibit_rules":
- "target_matchers":
  - "inhibited = true"
  "source_matchers":
  - "inhibit = true"
"receivers":
- "name": "test"
"route":
  "receiver": "test"
  "group_wait": "15s"
  "group_interval": "1m"
  "repeat_interval": "5m"
Apply the Alertmanager config to the user workload instance:
oc -n openshift-user-workload-monitoring create secret generic alertmanager-user-workload --from-file=alertmanager.yaml --dry-run=client -o=yaml | oc -n openshift-user-workload-monitoring replace secret --filename=-
On Alertmanager restart, the debug logs show an inhibited alert being received as active and flushed before the inhibiting alert arrives; inhibition only takes effect on the next flush:
time=2026-02-13T15:33:17.410Z level=INFO source=cluster.go:691 msg="gossip settled; proceeding" component=cluster elapsed=10.003040452s
time=2026-02-13T15:33:22.412Z level=DEBUG source=net.go:962 msg="[DEBUG] memberlist: Initiating push/pull sync with: 10.128.2.26:9094" component=cluster
time=2026-02-13T15:33:22.413Z level=DEBUG source=delegate.go:238 msg=NotifyJoin component=cluster node=01KHBT4TSTBJB2W2PGTJAPCXHZ addr=10.128.2.26:9094
time=2026-02-13T15:33:22.413Z level=DEBUG source=cluster.go:467 msg=success component=cluster msg=refresh addr=10.128.2.26:9094
time=2026-02-13T15:33:40.569Z level=DEBUG source=dispatch.go:165 msg="Received alert" component=dispatcher alert="Another inhibited rule[8f0c1e1][active]"
time=2026-02-13T15:33:40.570Z level=DEBUG source=dispatch.go:530 msg=flushing component=dispatcher aggrGroup={}:{} alerts="[Another inhibited rule[8f0c1e1][active]]"
time=2026-02-13T15:33:51.875Z level=DEBUG source=dispatch.go:165 msg="Received alert" component=dispatcher alert="Inhibiting rule[ab3b3d0][active]"
time=2026-02-13T15:33:51.876Z level=DEBUG source=dispatch.go:165 msg="Received alert" component=dispatcher alert="Inhibited rule[ef5d27a][active]"
time=2026-02-13T15:33:51.884Z level=DEBUG source=dispatch.go:165 msg="Received alert" component=dispatcher alert="Inhibiting rule[ab3b3d0][active]"
time=2026-02-13T15:33:51.885Z level=DEBUG source=dispatch.go:165 msg="Received alert" component=dispatcher alert="Inhibited rule[ef5d27a][active]"
time=2026-02-13T15:33:55.556Z level=DEBUG source=dispatch.go:165 msg="Received alert" component=dispatcher alert="Another inhibited rule[8f0c1e1][active]"
time=2026-02-13T15:34:18.781Z level=DEBUG source=net.go:238 msg="[DEBUG] memberlist: Stream connection from=10.128.2.26:43758" component=cluster
time=2026-02-13T15:34:40.571Z level=DEBUG source=dispatch.go:530 msg=flushing component=dispatcher aggrGroup={}:{} alerts="[Another inhibited rule[8f0c1e1][active] Inhibited rule[ef5d27a][active] Inhibiting rule[ab3b3d0][active]]"
time=2026-02-13T15:34:40.571Z level=DEBUG source=notify.go:579 msg="Notifications will not be sent for muted alerts" component=dispatcher alerts="[Another inhibited rule[8f0c1e1][active] Inhibited rule[ef5d27a][active]]" reason=inhibition
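The ordering in the log excerpt above can be checked mechanically. The following is a minimal Python sketch, not part of the original report: it scans dispatcher log lines and lists every flush that was logged before the source alert ("Inhibiting rule") was first received. The function name flushed_before_source and the trimmed LOG sample are hypothetical helpers introduced for illustration; the sample lines are copied from the excerpt.

```python
import re

# Trimmed copy of four dispatcher lines from the log excerpt (illustrative sample).
LOG = """\
time=2026-02-13T15:33:40.569Z level=DEBUG source=dispatch.go:165 msg="Received alert" component=dispatcher alert="Another inhibited rule[8f0c1e1][active]"
time=2026-02-13T15:33:40.570Z level=DEBUG source=dispatch.go:530 msg=flushing component=dispatcher aggrGroup={}:{} alerts="[Another inhibited rule[8f0c1e1][active]]"
time=2026-02-13T15:33:51.875Z level=DEBUG source=dispatch.go:165 msg="Received alert" component=dispatcher alert="Inhibiting rule[ab3b3d0][active]"
time=2026-02-13T15:33:51.876Z level=DEBUG source=dispatch.go:165 msg="Received alert" component=dispatcher alert="Inhibited rule[ef5d27a][active]"
"""

# Name of the alert that carries the inhibit="true" label in the reproducer.
SOURCE_ALERT = "Inhibiting rule"


def flushed_before_source(log_text, source_alert=SOURCE_ALERT):
    """Return (timestamp, [alert names]) for each flush logged before
    the source alert was first received."""
    source_seen = False
    early = []
    for line in log_text.splitlines():
        ts = re.search(r"time=(\S+)", line)
        if ts is None:
            continue
        if 'msg="Received alert"' in line:
            m = re.search(r'alert="([^\[]+)\[', line)
            if m and m.group(1) == source_alert:
                source_seen = True
        elif "msg=flushing" in line and not source_seen:
            m = re.search(r'alerts="\[(.*)\]"', line)
            if m is None:
                continue
            # Each entry looks like: <name>[<fingerprint>][<state>]
            names = re.findall(r"([^\[\]]+?)\[[0-9a-f]+\]\[\w+\]", m.group(1))
            early.append((ts.group(1), [n.strip() for n in names]))
    return early


if __name__ == "__main__":
    for timestamp, names in flushed_before_source(LOG):
        print(timestamp, names)
```

Run against the full debug log of a restarted instance, any entry returned here is a flush that happened while no source alert was known yet, which is the window in which the unwanted notification is sent.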
If the inhibit rule does not span rule groups and the inhibiting rule comes first, then no notification is sent for the inhibited rule:
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-alert
  namespace: ns1
spec:
  groups:
  - name: example
    rules:
    - alert: Inhibiting rule
      expr: version{job="prometheus-example-app"} > 0
      for: 5m
      labels:
        inhibit: "true"
      annotations:
        summary: "This is an inhibiting rule"
    - alert: Inhibited rule
      expr: version{job="prometheus-example-app"} > 0
      for: 5m
      labels:
        inhibited: "true"
      annotations:
        summary: "This is an inhibited rule"
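The difference between the two rule layouts can be illustrated with a toy model. This Python sketch is not Alertmanager's implementation; it assumes only that, at flush time, a target alert is muted when a matching source alert has already been received. The simulate function and the event tuples are hypothetical names introduced for illustration, and the label pairs mirror the source_matchers and target_matchers of the reproducer.

```python
# Toy model only: assumes a target is muted at flush time if, and only if,
# a source alert has already been received. Not Alertmanager's actual code.
SOURCE = ("inhibit", "true")    # mirrors source_matchers: inhibit = true
TARGET = ("inhibited", "true")  # mirrors target_matchers: inhibited = true


def simulate(events):
    """events is a list of ("recv", name, labels) or ("flush",) tuples.
    Returns, for each flush, the alert names that would be notified."""
    known = {}    # alert name -> labels, in arrival order
    flushes = []
    for event in events:
        if event[0] == "recv":
            _, name, labels = event
            known[name] = labels
            continue
        # Flush: check whether any known alert matches the source matcher.
        source_present = any(
            labels.get(SOURCE[0]) == SOURCE[1] for labels in known.values()
        )
        sent = [
            name
            for name, labels in known.items()
            if not (source_present and labels.get(TARGET[0]) == TARGET[1])
        ]
        flushes.append(sent)
    return flushes


# Arrival order seen in the restart logs: the alert from the second rule
# group is received and flushed before the inhibiting alert arrives.
cross_group = [
    ("recv", "Another inhibited rule", {"inhibited": "true"}),
    ("flush",),
    ("recv", "Inhibiting rule", {"inhibit": "true"}),
    ("recv", "Inhibited rule", {"inhibited": "true"}),
    ("flush",),
]

# Single rule group with the inhibiting rule evaluated first.
same_group = [
    ("recv", "Inhibiting rule", {"inhibit": "true"}),
    ("recv", "Inhibited rule", {"inhibited": "true"}),
    ("flush",),
]

if __name__ == "__main__":
    print(simulate(cross_group))
    print(simulate(same_group))
```

In this model the cross-group ordering notifies "Another inhibited rule" on the first flush, matching the restart logs, while the single-group ordering only ever notifies "Inhibiting rule".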