Feature Request
Resolution: Unresolved
1. Proposed title of this feature request
Report metrics on node startup times
2. What is the nature and description of the request?
The SRE team and customers manage clusters across different environments, cloud and bare-metal platforms, and network setups (e.g., egress restrictions, cluster-wide proxy), each of which affects node startup times differently.
We currently have no straightforward way to report metrics on node startup times across the fleet, or to segment them by the scenarios above. As a result, decisions about customizing node startup timeouts are made without structured data analysis and rely on ad hoc analysis of logs and cloud events.
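To illustrate the kind of report being asked for, here is a minimal sketch that groups startup durations by segment. It assumes startup time is measured from machine creation to the node becoming ready; that definition, the field names (`platform`, `proxy`, `created_at`, `ready_at`), and both function names are hypothetical and not taken from the Machine API Operator.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median


def startup_seconds(created_at: str, ready_at: str) -> float:
    """Seconds between two ISO 8601 timestamps (assumed: creation and readiness)."""
    created = datetime.fromisoformat(created_at)
    ready = datetime.fromisoformat(ready_at)
    return (ready - created).total_seconds()


def summarize_by_segment(samples):
    """Group startup durations by (platform, proxy) and report count, median, max."""
    grouped = defaultdict(list)
    for sample in samples:
        # Hypothetical segment key; real segments could also include egress setup.
        key = (sample["platform"], sample["proxy"])
        grouped[key].append(
            startup_seconds(sample["created_at"], sample["ready_at"])
        )
    return {
        key: {"count": len(values), "median_s": median(values), "max_s": max(values)}
        for key, values in grouped.items()
    }
```

A per-segment summary like this would let a timeout be compared against the observed distribution for a given platform and network setup, rather than against individual log lines.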
3. Why does the customer need this? (List the business requirements here)
Multiple customers (on both ROSA and ARO) are asking about node startup times and timeouts in different setups and clouds, and both SRE leads and customer stakeholders lack the analytical support needed to answer them.
4. List any affected packages or components.
Machine API Operator
cc jboutaud@redhat.com rh-ee-adejong acathrow@redhat.com cblecker.openshift mpatel1@redhat.com rhn-support-sdodson