- Feature Request
- Resolution: Unresolved
- Major
- Product / Portfolio Work
1. Proposed title of this feature request: Cluster Autoscaler Metrics
2. What is the nature and description of the request? The Cluster Autoscaler (CAS) can be configured as documented here: https://hypershift-docs.netlify.app/reference/api/#hypershift.openshift.io/v1beta1.ClusterAutoscaling. The ask is to enhance the metrics so that changes made to CAS via its API are observable.
- Limit-Change Success Rate - The ratio of successful limit changes (e.g., from user request to the cluster autoscaler) to total attempted changes.
- Time to Apply New Limits - The latency from when a limit change request is initiated to when the CAS actually enforces the new limits (i.e., the cluster’s autoscaler has updated values successfully).
- Validation Accuracy - How often the cluster remains within the configured min/max node bounds versus how often it drifts outside those bounds or violates them due to delayed/failed autoscaler updates.
- Autoscaler API Availability - The success rate (e.g., HTTP 2xx or gRPC OK) for API calls that set or update the cluster-autoscaler limits, from the OCM perspective.
3. Why does the customer need this? (List the business requirements here) Service providers such as ROSA that offer HCP to cluster administrators need to monitor CAS decisions and be alerted based on SLOs.
4. List any affected packages or components.
Hosted Control Planes
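To make the four metrics requested in item 2 concrete, here is a minimal sketch of how each could be computed from raw observations. It is only an illustration: the `LimitChange` record, the function names, and the input shapes are hypothetical and do not correspond to any existing HyperShift, CAS, or OCM API. It also assumes a change counts as successful once CAS has enforced the new limits.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class LimitChange:
    """One attempted change to the autoscaler limits (hypothetical record shape)."""
    requested_at: float           # seconds since epoch when the change was requested
    applied_at: Optional[float]   # when CAS enforced the new limits; None if it never did


def success_rate(changes: Sequence[LimitChange]) -> Optional[float]:
    """Limit-Change Success Rate: applied changes over attempted changes."""
    if not changes:
        return None  # no attempts, so the ratio is undefined
    applied = sum(1 for c in changes if c.applied_at is not None)
    return applied / len(changes)


def apply_latencies(changes: Sequence[LimitChange]) -> List[float]:
    """Time to Apply New Limits: per-change latency in seconds, applied changes only."""
    return [c.applied_at - c.requested_at for c in changes if c.applied_at is not None]


def bounds_compliance(node_counts: Sequence[int], min_nodes: int, max_nodes: int) -> Optional[float]:
    """Validation Accuracy: fraction of node-count samples within [min_nodes, max_nodes]."""
    if not node_counts:
        return None
    within = sum(1 for n in node_counts if min_nodes <= n <= max_nodes)
    return within / len(node_counts)


def api_availability(status_codes: Sequence[int]) -> Optional[float]:
    """Autoscaler API Availability: fraction of calls that returned HTTP 2xx."""
    if not status_codes:
        return None
    ok = sum(1 for s in status_codes if 200 <= s < 300)
    return ok / len(status_codes)
```

In a real implementation these values would more likely be exposed as counters and histograms and aggregated by the monitoring stack, so that SLO alerts can be defined on top of them. The plain functions above only pin down what each ratio and latency means.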
- is caused by: OCPSTRAT-1806 New API to support specifying cluster-autoscaler flags (Release Pending)
- relates to: OCPSTRAT-1853 Enhanced Visibility into Control Plane and Data Plane Metrics (Refinement)