Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: odf-4.16
Component/s: Multi-Cloud Object Gateway
Labels: None
Blocked: False
Blocked Reason: None
Ready: False
Dev Approval: ?
Docs Approval: ?
PM Approval: ?
QE Approval: ?
Target Release: odf-4.21
Intelligence Requested:
Market:
Severity: Low
Regression: None
SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Description of problem - Provide a detailed description of the issue encountered, including logs/command-output snippets and screenshots if the issue is observed in the UI:

The info-level alert ThanosRuleHighRuleEvaluationWarnings fires continuously in the RHOCP web console.

The thanos-ruler pod logs the following warnings indefinitely:
===================
$ oc project openshift-user-workload-monitoring
$ oc logs -c thanos-ruler thanos-ruler-user-workload-0

2025-08-06T12:54:32.982497925+09:00 ts=2025-08-06T03:54:32.982471601Z caller=rule.go:944 level=warn component=rules warnings="PromQL info: metric might not be a counter, name does not end in _total/_sum/_count/_bucket: \"NooBaa_providers_bandwidth_read_size\", PromQL info: metric might not be a counter, name does not end in _total/_sum/_count/_bucket: \"NooBaa_providers_bandwidth_write_size\"" query="sum by (namespace, managedBy, job, service) (rate(NooBaa_providers_bandwidth_read_size{namespace=\"openshift-storage\"}[5m]) + rate(NooBaa_providers_bandwidth_write_size{namespace=\"openshift-storage\"}[5m]))"

Prometheus has a naming convention for counter metrics: their names are expected to end with one of the suffixes _total, _sum, _count, or _bucket. The NooBaa metrics passed to rate() in this rule lack such a suffix, so rule evaluation emits the PromQL info annotations shown above, which in turn triggers the alert in the RHOCP web console.
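The suffix check described by the warning text can be illustrated with a minimal Python sketch. This is not Prometheus's actual implementation; it only mirrors the suffix list quoted in the log message. The helper name `looks_like_counter` and the metric `example_requests_total` are hypothetical and used for illustration only.

```python
# Suffixes quoted in the warning: "_total/_sum/_count/_bucket".
COUNTER_SUFFIXES = ("_total", "_sum", "_count", "_bucket")


def looks_like_counter(metric_name: str) -> bool:
    """Return True if the name ends in one of the suffixes listed in the warning."""
    return metric_name.endswith(COUNTER_SUFFIXES)


# Two names taken from the log line in this report, plus one
# hypothetical name that follows the convention.
candidates = [
    "NooBaa_providers_bandwidth_read_size",
    "NooBaa_providers_bandwidth_write_size",
    "example_requests_total",  # hypothetical, for contrast
]

# Names that would draw the "might not be a counter" annotation under this check.
flagged = [name for name in candidates if not looks_like_counter(name)]

if __name__ == "__main__":
    for name in flagged:
        print(f"would be flagged: {name}")
```

Under this check, both NooBaa metric names are flagged, while the hypothetical `_total` name is not.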

Other NooBaa metrics produce the same warning:

$ omc logs -n openshift-user-workload-monitoring thanos-ruler-user-workload-0 thanos-ruler | grep -i noobaa | sed -e 's;^.*_bucket: ;;' | sort | uniq
"NooBaa_providers_bandwidth_read_size\"" query="sum by (namespace, managedBy, job, service) (rate(NooBaa_providers_bandwidth_read_size{namespace=\"openshift-storage\"}[5m]) + rate(NooBaa_providers_bandwidth_write_size{namespace=\"openshift-storage\"}[5m]))"
"NooBaa_providers_bandwidth_write_size\"" query="sum by (namespace, managedBy, job, service) (rate(NooBaa_providers_bandwidth_read_size{namespace=\"openshift-storage\"}[5m]) + rate(NooBaa_providers_bandwidth_write_size{namespace=\"openshift-storage\"}[5m]))"
"NooBaa_providers_ops_read_num\"" query="sum by (namespace, managedBy, job, service) (rate(NooBaa_providers_ops_read_num{namespace=\"openshift-storage\"}[5m]) + rate(NooBaa_providers_ops_write_num{namespace=\"openshift-storage\"}[5m]))"
"NooBaa_providers_ops_write_num\"" query="sum by (namespace, managedBy, job, service) (rate(NooBaa_providers_ops_read_num{namespace=\"openshift-storage\"}[5m]) + rate(NooBaa_providers_ops_write_num{namespace=\"openshift-storage\"}[5m]))"
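As an alternative to the grep/sed pipeline, the affected metric names could be extracted from the thanos-ruler log with a short script. The following is a minimal sketch, assuming the warning text has the form seen in this report; the function name `flagged_metrics` and the regular expression are illustrative assumptions, and the sample line is abbreviated from the log entry quoted earlier.

```python
import re

# Captures the metric name quoted after the warning text.
# The optional backslashes tolerate both \"name\" and "name" quoting.
WARNING_RE = re.compile(
    r'name does not end in _total/_sum/_count/_bucket: '
    r'\\?"([A-Za-z_:][A-Za-z0-9_:]*)\\?"'
)


def flagged_metrics(log_lines):
    """Return the sorted, de-duplicated metric names named in the warnings."""
    names = set()
    for line in log_lines:
        names.update(WARNING_RE.findall(line))
    return sorted(names)


# Abbreviated from the warning quoted in this report (timestamp and query omitted).
sample = (
    r'level=warn component=rules warnings="PromQL info: metric might not be a counter, '
    r'name does not end in _total/_sum/_count/_bucket: \"NooBaa_providers_bandwidth_read_size\", '
    r'PromQL info: metric might not be a counter, '
    r'name does not end in _total/_sum/_count/_bucket: \"NooBaa_providers_bandwidth_write_size\""'
)

if __name__ == "__main__":
    # The same line repeated, as in the indefinitely streaming log.
    for name in flagged_metrics([sample, sample]):
        print(name)
```

To run this against a live cluster, the saved output of the `oc logs` command shown earlier could be passed to `flagged_metrics` line by line.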

The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc. Please clarify if it is platform agnostic deployment), (IPI/UPI):

VMware

The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc):

Internal, thin-csi

The version of all relevant components (OCP, ODF, RHCS, ACM whichever is applicable):

OCP: 4.16.40
ODF: 4.16.9

Does this issue impact your ability to continue to work with the product?
No

Is there any workaround available to the best of your knowledge?
No

Can this issue be reproduced? If so, please provide the hit rate
Yes, always

Expected results:
No alert should be fired.

Additional info:
Similar issues have been reported in Jira for other products:

3scale: https://issues.redhat.com/browse/THREESCALE-11692
OpenTelemetry: https://issues.redhat.com/browse/TRACING-5200

Assignee: Nimrod Becker
Reporter: Kenichiro Kagoshima
Need Info From: Nimrod Becker
QA Contact: Harish NV Rao
Votes: 0
Watchers: 17
Created: 2025/08/19 12:29 AM
Updated: 2025/11/05 1:22 PM
