OpenShift Bugs / OCPBUGS-62863

'hypershift dump' should collect ServiceMonitor resources

      Description of problem

      hypershift dump collects many resources, but currently not servicemonitors.monitoring.coreos.com. The control-plane operator creates several ServiceMonitors, such as this one, and dump should learn how to collect them.

      Version-Release number of selected component

      Seen in 4.20-era CI (see Additional info below), and confirmed in modern HyperShift code (see the Description of problem above).

      How reproducible

      Every time.

      Steps to Reproduce

      1. Run hypershift dump against a hosted cluster.
      2. Inspect the API-group directories gathered under the hosted cluster's namespace.
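The check in step 2 can be sketched as a small shell helper. This is illustrative only: the directory names below are a local mock of what a dump currently gathers, not real hypershift dump output.

```shell
# Given a per-namespace dump directory, report whether the
# monitoring.coreos.com group was gathered alongside apps/batch/core.
check_dump() {
  local ns_dir=$1
  if [ -d "$ns_dir/monitoring.coreos.com" ]; then
    echo "monitoring.coreos.com collected"
  else
    echo "monitoring.coreos.com missing"  # current behavior, per this bug
  fi
}

# Mock what the dump currently produces: core groups only.
mkdir -p mockdump/core mockdump/apps mockdump/batch
check_dump mockdump  # prints "monitoring.coreos.com missing"
```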

      Actual results

      Lots of entries like apps, batch, and core.

      Expected results

      A monitoring.coreos.com directory with ServiceMonitor (and PodMonitor?) children.
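A mocked-up sketch of that expected layout (the namespace and file names below are hypothetical examples, not actual dump output):

```shell
# Simulate a dump where the monitoring.coreos.com group is collected,
# next to the groups the dump already gathers today.
base=expected-dump/namespaces/e2e-clusters-example
mkdir -p "$base/core" "$base/apps" "$base/batch" \
         "$base/monitoring.coreos.com/servicemonitors"
touch "$base/monitoring.coreos.com/servicemonitors/cluster-version-operator.yaml"
ls "$base"  # apps  batch  core  monitoring.coreos.com
```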

      Additional info

      While investigating OCPBUGS-62851, I was looking at e2e-hypershift output in the regressing pull. The gathered artifacts show the management cluster's Prometheus complaining about something in the hosted-cluster ServiceMonitors:

      $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_cluster-version-operator/1215/pull-ci-openshift-cluster-version-operator-main-e2e-hypershift/1952739873462947840/artifacts/e2e-hypershift/dump-management-cluster/artifacts/artifacts.tar | tar -xOz logs/artifacts/output/hostedcluster-d44932313dd1be2d3560-mgmt/namespaces/openshift-monitoring/pods/prometheus-k8s-0/prometheus/prometheus/logs/current.log | grep cluster-version-operator
      2025-08-05T15:53:48.316469617Z time=2025-08-05T15:53:48.316Z level=ERROR source=manager.go:176 msg="error reloading target set" component="scrape manager" err="invalid config id:serviceMonitor/e2e-clusters-ghd95-node-pool-6dl4k/cluster-version-operator/0"
      2025-08-05T15:53:48.316543150Z time=2025-08-05T15:53:48.316Z level=ERROR source=manager.go:176 msg="error reloading target set" component="scrape manager" err="invalid config id:serviceMonitor/e2e-clusters-bmg8g-proxy-jplkn/cluster-version-operator/0"
      2025-08-05T15:53:48.316617911Z time=2025-08-05T15:53:48.316Z level=ERROR source=manager.go:176 msg="error reloading target set" component="scrape manager" err="invalid config id:serviceMonitor/e2e-clusters-qnv7p-create-cluster-sxsvl/cluster-version-operator/0"
      

      But there was nothing about those ServiceMonitors in the dumps, e.g. in the e2e-clusters-qnv7p-create-cluster-sxsvl dump.
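For reference, the affected hosted-cluster namespaces can be pulled out of the quoted Prometheus log mechanically. The relevant error lines are inlined here so the pipeline is self-contained:

```shell
# Reproduce the rejected-ServiceMonitor errors from the prometheus-k8s-0 log.
cat > current.log <<'EOF'
... err="invalid config id:serviceMonitor/e2e-clusters-ghd95-node-pool-6dl4k/cluster-version-operator/0"
... err="invalid config id:serviceMonitor/e2e-clusters-bmg8g-proxy-jplkn/cluster-version-operator/0"
... err="invalid config id:serviceMonitor/e2e-clusters-qnv7p-create-cluster-sxsvl/cluster-version-operator/0"
EOF

# The namespace is the second path segment of each rejected serviceMonitor id.
grep -o 'serviceMonitor/[^/]*' current.log | cut -d/ -f2 | sort -u
```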

              Assignee: Seth Jennings (sjenning)
              Reporter: W. Trevor King (trking)
              QA Contact: XiuJuan Wang