Bug
Resolution: Unresolved
Normal
4.20.0
MON Sprint 284
Description of problem:
In an OpenShift bare-metal environment running OpenShift Virtualization with multiple Hosted Control Plane (HCP) clusters, the PrometheusPossibleNarrowSelectors alert triggers repeatedly against the prometheus-k8s monitoring stack.
The alert indicates that Prometheus may be using overly narrow label selectors; however, in this deployment model:
- Only default, cluster-wide monitoring components are deployed
- No custom Prometheus selectors or queries are configured by the user
- Metrics scraping and querying are functioning as expected
The alert appears to be a false positive in this architecture and does not provide actionable remediation.
Version-Release number of selected component (if applicable):
- OpenShift Container Platform (bare metal)
- 4.20
How reproducible:
Not reproducible on demand; the alert fires on its own in the affected environment.
Steps to Reproduce:
1.
2.
3.
Actual results:
- PrometheusPossibleNarrowSelectors alert fires repeatedly
- Alert suggests potential misconfiguration of Prometheus selectors
- No observable impact to metrics availability or monitoring functionality
- Alert creates continuous noise in production monitoring
Expected results:
- The alert should not trigger in environments where:
  - the selectors are expected and valid due to the HCP / virtualization architecture
  - no metric loss or scrape failures are occurring
- Alternatively:
  - the alert logic should be HCP-aware
  - or the alert description/runbook should clearly state that the alert can be safely ignored in hosted control plane environments
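As an interim workaround until either of the above is implemented, one option (a sketch, not a supported configuration) would be to route notifications for this alert to a no-op receiver in the Alertmanager configuration; the "null" receiver name and the route placement below are illustrative assumptions:

```yaml
# Hypothetical Alertmanager route fragment that drops notifications for
# PrometheusPossibleNarrowSelectors until the alert logic is HCP-aware.
# This would be merged into the existing alertmanager-main configuration;
# the receiver name "null" is an assumption for illustration.
route:
  routes:
    - matchers:
        - 'alertname = "PrometheusPossibleNarrowSelectors"'
      receiver: "null"
receivers:
  - name: "null"
```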
Additional info:
The public runbook for this alert
https://github.com/openshift/runbooks/blob/master/alerts/cluster-monitoring-operator/PrometheusPossibleNarrowSelectors.md
does not currently provide actionable mitigation steps for this scenario.
From the logs, the prometheus-k8s-0 pod also appears to be hitting repeated TLS handshake errors, along with a transient config-reload failure:
$ oc logs prometheus-k8s-0 --all-containers
2026-02-03T09:01:57.446232128Z level=error ts=2026-02-03T09:01:57.446170604Z caller=runutil.go:117 msg="function failed. Retrying in next tick" err="trigger reload: reload request failed: Post \"http://localhost:9090/-/reload\": dial tcp [::1]:9090: connect: connection refused"
2026-02-03T09:03:00.448152231Z I0203 09:03:00.448105 1 log.go:245] http: TLS handshake error from 10.128.x.x:34650: write tcp 10.128.x.x:9091->10.128.x.x:34650: write: connection reset by peer
2026-02-03T09:03:00.623923024Z I0203 09:03:00.623880 1 log.go:245] http: TLS handshake error from 10.128.x.x:34658: write tcp 10.128.x.x:9091->10.128.x.x:34658: write: connection reset by peer
2026-02-03T09:03:02.960755724Z I0203 09:03:02.960685 1 log.go:245] http: TLS handshake error from 10.128.x.x:52416: write tcp 10.128.x.x:9091->10.128.x.x:52416: write: connection reset by peer
2026-02-03T09:03:03.136280909Z I0203 09:03:03.136232 1 log.go:245] http: TLS handshake error from 10.128.72.2:52430: write tcp 10.128.x.x:9091->10.128.x.x:52430: write: connection reset by peer
2026-02-03T09:03:05.451693105Z I0203 09:03:05.451640 1 log.go:245] http: TLS handshake error from 10.128.x.x:34674: write tcp 10.128.x.x:9091->10.128.x.x:34674: write: connection reset by peer
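To see whether these handshake resets come from a single client or many, the errors can be tallied per source IP. This is a local-analysis sketch; the filename prometheus-k8s-0.log is a hypothetical name for the saved pod logs:

```shell
# Sketch: count TLS handshake errors per client IP in a saved log file.
# Save the logs first, e.g.:
#   oc logs prometheus-k8s-0 --all-containers > prometheus-k8s-0.log
grep 'TLS handshake error' prometheus-k8s-0.log \
  | sed -E 's/.*error from ([0-9a-fx.]+):[0-9]+.*/\1/' \
  | sort | uniq -c | sort -rn
```

A single dominant source IP would point at one misbehaving scraper or prober rather than a general TLS misconfiguration.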
The prometheus-user-workload-0 pod's kube-rbac-proxy-federate container logs similar TLS handshake errors:
$ oc logs prometheus-user-workload-0 -n openshift-user-workload-monitoring -c kube-rbac-proxy-federate
2026-02-02T16:02:40.028260923Z I0202 16:02:40.028098 1 kube-rbac-proxy.go:532] Reading config file: /etc/kube-rbac-proxy/config.yaml
2026-02-02T16:02:40.028744258Z I0202 16:02:40.028726 1 kube-rbac-proxy.go:235] Valid token audiences:
2026-02-02T16:02:40.029048967Z I0202 16:02:40.028992 1 dynamic_cafile_content.go:161] "Starting controller" name="client-ca::/etc/tls/client/client-ca.crt"
2026-02-02T16:02:40.029612553Z I0202 16:02:40.029588 1 kube-rbac-proxy.go:349] Reading certificate files
2026-02-02T16:02:40.029916962Z I0202 16:02:40.029859 1 kube-rbac-proxy.go:397] Starting TCP socket on 0.0.0.0:9092
2026-02-02T16:02:40.030092003Z I0202 16:02:40.030084 1 kube-rbac-proxy.go:404] Listening securely on 0.0.0.0:9092
2026-02-02T16:03:02.782391572Z I0202 16:03:02.782322 1 log.go:245] http: TLS handshake error from 10.128.x.x:54844: write tcp 10.128.x.x:9092->10.128.x.x:54844: write: connection reset by peer
2026-02-02T16:03:02.782522312Z I0202 16:03:02.782504 1 log.go:245] http: TLS handshake error from 10.128.x.x:38404: write tcp 10.128.x.x:9092->10.128.x.x:38404: write: connection reset by peer
2026-02-02T16:03:07.786593531Z I0202 16:03:07.786529 1 log.go:245] http: TLS handshake error from 10.128.x.x:54850: write tcp 10.128.x.x:9092->10.128.x.x:54850: write: connection reset by peer
2026-02-02T16:03:12.789711007Z I0202 16:03:12.789652 1 log.go:245] http: TLS handshake error from 10.128.x.x:56786: write tcp 10.128.x.x:9092->10.128.x.x:56786: write: connection reset by peer
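The handshake errors in this excerpt arrive roughly five seconds apart, which would be consistent with a periodic prober or scraper resetting connections mid-handshake. As a local-analysis sketch (federate.log is a hypothetical name for the saved logs), the error timestamps can be extracted to check that spacing:

```shell
# Sketch: print the timestamp of each TLS handshake error to inspect the
# spacing between occurrences. Save the logs first, e.g.:
#   oc logs prometheus-user-workload-0 -n openshift-user-workload-monitoring \
#     -c kube-rbac-proxy-federate > federate.log
grep 'TLS handshake error' federate.log | awk '{print $1}'
```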