Type: Bug
Resolution: Done-Errata
Priority: Normal
Fix Version/s: 4.18.0
Affects Version/s: 4.13, 4.12, 4.14, 4.15
Component/s: Cluster Autoscaler
Labels:
- cluster-autoscaler
- pre-merge-tested

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None

Story Points:
None
Severity:
Moderate
Regression:
No

Target Backport Versions:
None
Target Version:

4.18.0
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
Done
Release Note Type:
Bug Fix
Release Note Text:

* Previously, some cluster autoscaler metrics were not initialized, and therefore were not available. With this release, these metrics are initialized and available. (link:https://issues.redhat.com/browse/OCPBUGS-46416[*OCPBUGS-46416*])

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

This is a clone of issue OCPBUGS-25852. The following is the description of the original issue:
—
Description of problem:

Some cluster autoscaler metrics are missing from the metrics endpoint, for example: cluster_autoscaler_failed_scale_ups_total

Version-Release number of selected component (if applicable):

How reproducible:

Always

Steps to Reproduce:

Curl the cluster autoscaler's metrics endpoint:

$ oc exec deployment/cluster-autoscaler-default -- curl -s http://localhost:8085/metrics | grep cluster_autoscaler_failed_scale_ups_total

Actual results:

The metric does not return a value until a corresponding event has occurred.

Expected results:

The metric counter should be initialized at startup so that it reports a zero value.
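
A minimal sketch of the expected behavior, using a simplified stand-in for a labeled counter rather than the autoscaler's actual code. The `counterVec` type, the `reason` label, and the reason values are hypothetical and exist only for illustration. It shows why a labeled counter can be absent from a scrape until it is first touched, and how adding zero at startup makes it appear with a value of 0. Whether this is the exact mechanism behind every metric in this report is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// counterVec is a hypothetical, simplified stand-in for a labeled counter.
// A series exists only once its label value has been touched.
type counterVec struct {
	name   string
	label  string
	series map[string]float64
}

func newCounterVec(name, label string) *counterVec {
	return &counterVec{name: name, label: label, series: map[string]float64{}}
}

// Add increases the series for labelValue, creating it if it does not exist.
func (c *counterVec) Add(labelValue string, delta float64) {
	c.series[labelValue] += delta
}

// Expose renders the existing series as text lines resembling a scrape.
// A vector with no series renders as an empty string.
func (c *counterVec) Expose() string {
	keys := make([]string, 0, len(c.series))
	for k := range c.series {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	for _, k := range keys {
		fmt.Fprintf(&b, "%s{%s=%q} %g\n", c.name, c.label, k, c.series[k])
	}
	return b.String()
}

func main() {
	failed := newCounterVec("cluster_autoscaler_failed_scale_ups_total", "reason")
	fmt.Printf("before initialization: %q\n", failed.Expose())

	// Assumed label values, for illustration only.
	for _, reason := range []string{"cloudProviderError", "timeout"} {
		failed.Add(reason, 0) // adding zero creates the series without changing its value
	}
	fmt.Print(failed.Expose())
}
```

Initializing with a zero increment keeps the counter semantics intact: dashboards and alerts can distinguish "no failures yet" from "metric not exported".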

Additional info:

I have been through the file: 

https://raw.githubusercontent.com/openshift/kubernetes-autoscaler/master/cluster-autoscaler/metrics/metrics.go 

and checked off the metrics that do not appear when the metrics endpoint is scraped immediately after deployment.

The following metrics are defined in metrics.go but are missing from the scrape:

~~~
node_group_min_count
node_group_max_count
pending_node_deletions
errors_total
scaled_up_gpu_nodes_total
failed_scale_ups_total
failed_gpu_scale_ups_total
scaled_down_nodes_total
scaled_down_gpu_nodes_total
unremovable_nodes_count 
skipped_scale_events_count
~~~
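
The manual comparison above could be automated along these lines. This is a sketch only: the `missingMetrics` helper is hypothetical, the sample scrape text is made up for illustration, and the `cluster_autoscaler_` prefix is assumed from the example metric name in this report. It reads scrape output in the Prometheus text format and reports which expected metric names have no sample line.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// missingMetrics returns the expected metric names that have no sample line
// in the given scrape text. Comment lines (# HELP, # TYPE) are ignored.
func missingMetrics(scrape string, expected []string) []string {
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(scrape))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The metric name ends at the first '{' (labels) or space (value).
		end := strings.IndexAny(line, "{ ")
		if end < 0 {
			continue
		}
		seen[line[:end]] = true
	}
	var missing []string
	for _, name := range expected {
		if !seen[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// Made-up scrape sample for illustration only; not real cluster output.
	sample := `# HELP cluster_autoscaler_errors_total Hypothetical help text.
# TYPE cluster_autoscaler_errors_total counter
cluster_autoscaler_errors_total{type="example"} 0
`
	expected := []string{
		"cluster_autoscaler_errors_total",
		"cluster_autoscaler_failed_scale_ups_total",
	}
	for _, name := range missingMetrics(sample, expected) {
		fmt.Println("missing:", name)
	}
}
```

In practice the scrape text would come from the curl command in the reproduction steps, and the expected list from the metric names above.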

clones

OCPBUGS-25852 Missing metric - example: cluster_autoscaler_failed_scale_ups_total

Closed

is blocked by

OCPBUGS-25852 Missing metric - example: cluster_autoscaler_failed_scale_ups_total

Closed

is cloned by

OCPBUGS-48606 Missing metric - example: cluster_autoscaler_failed_scale_ups_total

Closed

is depended on by

OCPBUGS-48606 Missing metric - example: cluster_autoscaler_failed_scale_ups_total

Closed

links to

openshift/kubernetes-autoscaler#333: [release-4.18] OCPBUGS-46416: UPSTREAM: <carry>: 🐛(metrics) Initialize metrics for autoscaler errors, scale events, and pod evictions

RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update


Assignee: Theo Barber-Bany

Reporter: OpenShift Prow Bot

QA Contact: Milind Yadav

Need Info From: None

Votes: 0

Watchers: 6

Created: 2024/12/13 4:07 PM

Updated: 2025/07/17 1:32 PM

Resolved: 2025/02/25 4:52 AM
