Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: 2.15.2 GA
Component/s: 3scale Operator, Gateway
Labels:
- support

Blocked:
False
Blocked Reason:
None
Ready:
False
3Scale PT Tested upstream:
Not Started
3scale PT Docs:
Not Started
3scale PT Product Specs:
Not Started
3scale PT Product Update Ready:
Not Started
3scale PT Released In Saas:
Not Started
3scale PT Verified Product:
Not Started
Target Release:

2.17.0 GA
Intelligence Requested:
Market:

Severity:
Important

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:

The info-level alert ThanosRuleHighRuleEvaluationWarnings fires continuously in the RHOCP web console.

The thanos-ruler pods log the following warning indefinitely:
===================
$ oc project openshift-user-workload-monitoring
$ oc logs -c thanos-ruler thanos-ruler-user-workload-0
...

ts=2025-02-12T11:09:17.848275853Z caller=rule.go:944 level=warn component=rules warnings="PromQL info: metric might not be a counter, name does not end in _total/_sum/_count/_bucket: \"apicast_status\"" query="sum(rate(apicast_status{namespace=\"3scale\",status=~\"^4..\"}[1m])) / sum(rate(apicast_status{namespace=\"3scale\"}[1m])) * 100 > 5"


===================
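
To see which metric names trigger this warning across a larger log capture, the lines can be filtered with a small script. The following is a minimal sketch, not part of any 3scale or Thanos tooling: the function name `flagged_metrics` and the regular expression are assumptions, written against the single log line shown above.

```python
import re
from collections import Counter

# Matches the metric name quoted at the end of the PromQL info annotation.
# In the thanos-ruler log the quotes are backslash-escaped inside the
# "warnings" field, so an optional backslash is allowed before each quote.
# Assumption: the message wording matches the log line quoted in this report.
NOT_A_COUNTER = re.compile(
    r'name does not end in _total/_sum/_count/_bucket: '
    r'\\?"([A-Za-z_:][A-Za-z0-9_:]*)\\?"'
)


def flagged_metrics(log_lines):
    """Count how often each metric name is flagged in the given log lines."""
    counts = Counter()
    for line in log_lines:
        for name in NOT_A_COUNTER.findall(line):
            counts[name] += 1
    return counts
```

The input can be any iterable of strings, for example an open file holding the saved output of the `oc logs` command above.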

The metric "apicast_status" is scraped from the APIcast components and is a counter, according to the following documentation:
[-] https://docs.redhat.com/en/documentation/red_hat_3scale_api_management/2.15/html-single/administering_the_api_gateway/index#prometheus-3scale-metrics 

Prometheus has a naming convention for counter metrics: their names are expected to end with one of the suffixes _total, _sum, _count, or _bucket. The name apicast_status has none of these suffixes, so the rule query shown in the log produces the warning on evaluation, which keeps the alert firing in the RHOCP web console.
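
The suffix check described in the warning text can be illustrated as follows. This is a sketch that only mirrors the suffix list quoted in the log message; it is not the Prometheus implementation, and the names `COUNTER_SUFFIXES` and `looks_like_counter` are hypothetical.

```python
# Suffixes taken verbatim from the warning message in the thanos-ruler log.
COUNTER_SUFFIXES = ("_total", "_sum", "_count", "_bucket")


def looks_like_counter(metric_name: str) -> bool:
    """Return True if the name ends with one of the expected suffixes."""
    # str.endswith accepts a tuple and returns True if any suffix matches.
    return metric_name.endswith(COUNTER_SUFFIXES)
```

Under this check, `apicast_status` is flagged, while a name such as `apicast_status_total` is not.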

Version-Release number of selected component (if applicable):

Tested on RHOCP 4.16.z
3scale version 2.15.2

How reproducible:

100%

Steps to Reproduce:

    1. Enable user workload monitoring.
    2. Allow creation of the required ServiceMonitor and PrometheusRule resources.
    3. Set up a 3scale application and generate the workload needed for the "apicast_status" metric to be scraped.
    4. Wait 15 minutes and check whether the ThanosRuleHighRuleEvaluationWarnings alert appears under "Observe > Alerting".
    5. Check the logs of the thanos-ruler pods in the openshift-user-workload-monitoring namespace to confirm the warnings.

Actual results:

The apicast_status metric is a counter but does not follow the Prometheus naming convention for counters, which causes ThanosRuleHighRuleEvaluationWarnings to fire in RHOCP.

Expected results:

The metric apicast_status should be renamed to follow the Prometheus naming convention, so that it ends with one of the suffixes _total, _sum, _count, or _bucket. For example: apicast_status_total, apicast_status_sum, apicast_status_count, or apicast_status_bucket.

Additional info:

A similar issue was reported on GitHub and was closed citing the same Prometheus naming convention:
[-] https://github.com/canonical/grafana-k8s-operator/issues/316

links to

KCS

Assignee: Unassigned

Reporter: Dhruv Gautam

Votes: 2

Watchers: 7

Created: 2025/02/12 11:48 AM

Updated: 2026/01/07 4:25 PM
