Loading...

XML

Word

Printable

Type: Feature Request
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: None
Component/s: Hosted Control Planes
Labels:
None

Target Version:
None
Activity Type:
Product / Portfolio Work
Status Summary:
None
Blocked:
False
Blocked Reason:

Hide

None

Show
None
Products:
None
Hierarchy Progress Bar:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
None
PX Impact Score:
PX Impact Range:
None
PX Priority Data:
None
PX Technical Impact:
None
PX Technical Impact Notes:
None
PX Scheduling Request:
None

1. Proposed title of this feature request

Enable Hosted Cluster users to monitor CMO stack

2. What is the nature and description of the request?

With CMO being the stack for customers to monitor their data plane, customers also need a way to monitor that their monitoring stack works as expected.
The usual path to monitor CMO is to have an external service that receives watchdog alerts (flowing through prometheus and alertmanager) as heartbeats, and then alerting if the heartbeats stop.

This RFE requests a new feature that gives CMO users a hypershift native way to monitor their stack.

*Note: *implementation idea would be to have the hypershift stack receive the heartbeats and write metrics to the customer's cloudwatch via cloudwatch:PutMetricData. A less "hypershift native" route would be to have the customer route to an AWS lamda, and having the AWS lamda create metrics in the customer cloudwatch, which could be alerted on.

I'm looking on feedback on whether or not this is something that could serve the wider hypershift project vs being an ad-hoc self-service implementation specifically for AWS.

3. Why does the customer need this? (List the business requirements here)

Requirement to have reliable monitoring.

4. List any affected packages or components.

is related to

OCPSTRAT-1598 [Observability] Enhanced Debuggability for HyperShift Cluster Installation Failures

Backlog

OCPSTRAT-1615 [Observability] Enhanced Debuggability for HyperShift Cluster NodePool Failures

In Progress

OCPSTRAT-1852 [Observability] Improve control plane metric reporting in hosted cluster monitoring stack

In Progress

OCPSTRAT-1853 Enhanced Visibility into Control Plane and Data Plane Metrics

In Progress

Assignee:: Ramon Acedo

Reporter:: Claudio Busse

Need Info From:: None

Votes:: 0 Vote for this issue

Watchers:: 3 Start watching this issue

Created:: 2025/06/04 3:40 PM

Updated:: 2026/02/03 8:12 AM

Target start:: None

Target end:: None

Details

Description

Attachments

Issue Links

Easy Agile Planning Poker

Activity

People

Dates