-
Bug
-
Resolution: Done-Errata
-
Normal
-
rhos-16.2.z
-
2
-
False
-
False
-
No Docs Impact
-
openstack-tripleo-heat-templates-11.6.1-2.20241220151710.9adcac6.el8ost
-
None
-
-
-
Approved
-
CloudOps 2024 Sprint 22, CloudOps 2024 Sprint 23, CloudOps 2024 Sprint 24, CloudOps 2024 Sprint 25, CloudOps 2025 Sprint 6
-
5
-
Moderate
Description of problem:
Customer created a custom alert like this:
~~~
- alert: OpenStack Service is Down
  annotations:
    summary: ' {{ $labels.process }} down on {{ $labels.host }} '
  expr: 'sensubility_container_health_status {process!="metrics_qdr"}== 0'
  for: 2m
  labels:
    severity: critical
~~~
Taking the metric for the iscsid service on controller-0 as an example: even though the metric is steady, the alert flips continuously between firing and pending.
The metric can be retrieved with:
sensubility_container_health_status{container="sg-core", endpoint="prom-https", host="controller-0.localdomain", process="iscsid", service="stf1-xxxxxx-sens-meter"}
The flipping alert status can be seen with:
ALERTS{alertname="OpenStack Service is Down", container="sg-core", endpoint="prom-https", host="controller-0.localdomain", process="iscsid", service="stf1-xxxxxx-sens-meter", severity="critical"}
Version-Release number of selected component (if applicable):
RHOSP 16.2
STF 1.5
How reproducible:
The alert works as expected in one of our labs.
Additional info:
More info in private comments.
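To put a number on the flipping described above, the built-in Prometheus `ALERTS` series can be pulled over a time range and its `alertstate` label (`pending` or `firing`) compared from one sample to the next. The following is only a minimal sketch, not part of the original report: the helper names (`range_query_url`, `state_timeline`, `count_flips`, `fetch_result`) and the base URL are hypothetical, and authentication against the STF Prometheus route is not handled. It relies only on the standard `/api/v1/query_range` endpoint and its matrix response format.

```python
import json
import urllib.parse
import urllib.request


def range_query_url(base_url, query, start, end, step="30s"):
    """Build a URL for the standard Prometheus /api/v1/query_range endpoint."""
    params = urllib.parse.urlencode(
        {"query": query, "start": start, "end": end, "step": step}
    )
    return f"{base_url.rstrip('/')}/api/v1/query_range?{params}"


def state_timeline(result):
    """Merge ALERTS series into a time-ordered list of (timestamp, alertstate).

    `result` is the `data.result` list of a matrix response. The query is
    assumed to select a single alert instance, so each timestamp carries
    at most one state.
    """
    points = {}
    for series in result:
        state = series["metric"].get("alertstate", "unknown")
        for ts, _value in series["values"]:
            points[float(ts)] = state
    return sorted(points.items())


def count_flips(timeline):
    """Count transitions between different alert states in the timeline."""
    return sum(
        1 for prev, cur in zip(timeline, timeline[1:]) if prev[1] != cur[1]
    )


def fetch_result(url):
    """Fetch and parse a range query (needs a reachable Prometheus; no auth)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["data"]["result"]
```

A possible use, assuming a hypothetical Prometheus address: build the URL with `range_query_url("https://prometheus.example", 'ALERTS{alertname="OpenStack Service is Down", process="iscsid"}', start, end)`, pass it to `fetch_result`, and feed the result through `state_timeline` and `count_flips`. A steady alert should give zero or one transition over the window; a larger count matches the behaviour reported here.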
- links to
-
RHBA-2025:145523 Red Hat OpenStack Platform 16.2 bug fix advisory