Type: Bug
Resolution: Cannot Reproduce
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: OpenTelemetry
Labels: None

Activity Type: Quality / Stability / Reliability
Story Points: 1
Blocked: False
Blocked Reason: None
Ready: False
Intelligence Requested:
Market:
Sprint: Tracing Sprint # 270

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

The info-level alert ThanosRuleHighRuleEvaluationWarnings keeps firing in the RHOCP web console.

The thanos-ruler pods emit the warnings below indefinitely:
===================
$ oc project openshift-user-workload-monitoring
$ oc logs -c thanos-ruler thanos-ruler-user-workload-0

$ oc logs thanos-ruler-user-workload-0 -n openshift-user-workload-monitoring | grep "metric might not be a counter"
ts=2025-02-26T11:59:23.265517526Z caller=rule.go:944 level=warn component=rules warnings="PromQL info: metric might not be a counter, name does not end in _total/_sum/_count/_bucket: \"otelcol_exporter_send_failed_log_records\"" query="increase(otelcol_exporter_send_failed_log_records{namespace=\"xxx-splunkotel-xxxx\"}[5m]) > 0"
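
To see which metrics are affected, the warning lines can be reduced to a list of distinct metric names. The following is a minimal sketch, assuming the log format shown above (with the metric name wrapped in escaped quotes at the end of the warning text); the function name `offending_metrics` is hypothetical and not part of any OpenShift or Thanos tooling.

```python
import re

# Assumption: the warning text matches the thanos-ruler log line quoted above,
# where the metric name appears as \"name\" right after the suffix list.
_WARNING_RE = re.compile(
    r'metric might not be a counter, name does not end in '
    r'_total/_sum/_count/_bucket: \\"(?P<name>[^"\\]+)\\"'
)


def offending_metrics(log_lines):
    """Return the sorted, de-duplicated metric names flagged in the warnings."""
    names = set()
    for line in log_lines:
        match = _WARNING_RE.search(line)
        if match:
            names.add(match.group("name"))
    return sorted(names)
```

The input could be the output of the `oc logs` command above, split into lines.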

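Prometheus has a naming convention for counter metrics: their names are expected to end with one of the suffixes _total, _sum, _count, or _bucket. The metric otelcol_exporter_send_failed_log_records has none of these suffixes, so the increase() query above produces the PromQL info warning, which in turn triggers the alert in the RHOCP web console.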
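
The name-based check described in the warning message can be approximated as follows. This is a simplified sketch, not the actual Prometheus implementation: it only tests the suffixes listed in the log message, and the names `COUNTER_SUFFIXES` and `looks_like_counter` are hypothetical.

```python
# Suffixes taken from the warning text in the thanos-ruler log.
COUNTER_SUFFIXES = ("_total", "_sum", "_count", "_bucket")


def looks_like_counter(metric_name: str) -> bool:
    """Return True if the name ends with one of the counter-style suffixes."""
    # str.endswith accepts a tuple and matches any of its entries.
    return metric_name.endswith(COUNTER_SUFFIXES)
```

Under this sketch, `otelcol_exporter_send_failed_log_records` fails the check, while a name such as `otelcol_exporter_send_failed_log_records_total` would pass it.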

This check is enforced from 4.16 onward.

https://prometheus.io/docs/practices/naming/

Similar issues are being raised against other products for the same problem, for example https://issues.redhat.com/browse/THREESCALE-11692

Assignee: Pavol Loffay

Reporter: Nigel Smith

Votes: 0

Watchers: 2

Created: 2025/03/04 9:17 AM

Updated: 2025/09/13 3:48 AM

Resolved: 2025/05/02 12:31 PM
