-
Bug
-
Resolution: Unresolved
-
Undefined
-
4.20
-
Quality / Stability / Reliability
-
False
-
-
None
-
Moderate
-
None
-
None
-
Approved
-
MON Sprint 276, MON Sprint 277
-
2
-
In Progress
-
Release Note Not Required
-
None
-
None
-
None
-
None
-
None
This is a clone of issue OCPBUGS-61193. The following is the description of the original issue:
—
(Feel free to update this bug's summary to be more specific.)
Component Readiness has found a potential regression in the following test:
[sig-instrumentation][Late] Platform Prometheus targets should not be accessible without auth [Serial] [Suite:openshift/conformance/serial]
Test has a 94.50% pass rate, but 95.00% is required.
Sample (being evaluated) Release: 4.20
Start Time: 2025-08-27T00:00:00Z
End Time: 2025-09-03T08:00:00Z
Success Rate: 94.50%
Successes: 103
Failures: 6
Flakes: 0
Base (historical) Release: 4.19
Start Time: 2025-05-18T00:00:00Z
End Time: 2025-06-17T23:59:59Z
Success Rate: 0.00%
Successes: 0
Failures: 0
Flakes: 0
View the test details report for additional context.
The link above is for metal, but this test also falls below the required 95% on vSphere.
The failure always looks similar to:
https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.20-e2e-metal-ipi-ovn-serial-virtualmedia-2of2/1962616253063368704
[sig-instrumentation][Late] Platform Prometheus targets should not be accessible without auth [Serial] [Suite:openshift/conformance/serial] 13m46s
{ fail [github.com/openshift/origin/test/extended/prometheus/prometheus.go:143]: Expected
    <[]error | len:4, cap:4>: [
        <*fmt.wrapError | 0xc000ef8080>{
            msg: "the scrape url https://192.168.111.26:10250/metrics for pod kube-system/ is accessible without authorization: context deadline exceeded",
            err: <context.deadlineExceededError>{},
        },
        <*fmt.wrapError | 0xc000ef8060>{
            msg: "the scrape url https://192.168.111.26:10250/metrics/cadvisor for pod kube-system/ is accessible without authorization: context deadline exceeded",
            err: <context.deadlineExceededError>{},
        },
        <*fmt.wrapError | 0xc000ce8020>{
            msg: "the scrape url https://192.168.111.26:10250/metrics/probes for pod kube-system/ is accessible without authorization: context deadline exceeded",
            err: <context.deadlineExceededError>{},
        },
        <*fmt.wrapError | 0xc000dd2020>{
            msg: "the scrape url https://192.168.111.26:9637/metrics for pod kube-system/ is accessible without authorization: context deadline exceeded",
            err: <context.deadlineExceededError>{},
        },
    ]
to be empty}
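The fact that the test interprets "context deadline exceeded" as "accessible without authorization" may indicate a logic problem in the test: a request that timed out never received a response, so it does not show that the endpoint is reachable without auth.
In this case, the empty pod name after "kube-system/" is also suspicious.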
In other runs, the test complains about other pods, for example in https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.20-e2e-vsphere-ovn-serial/1960598585972101120
[sig-instrumentation][Late] Platform Prometheus targets should not be accessible without auth [Serial] [Suite:openshift/conformance/serial] 13m24s
{ fail [github.com/openshift/origin/test/extended/prometheus/prometheus.go:143]: Expected
    <[]error | len:4, cap:4>: [
        <*fmt.wrapError | 0xc002ce6020>{
            msg: "the scrape url https://10.93.152.111:9001/metrics for pod openshift-machine-config-operator/machine-config-daemon-m4j6c is accessible without authorization: context deadline exceeded",
            err: <context.deadlineExceededError>{},
        },
        <*fmt.wrapError | 0xc002ce6040>{
            msg: "the scrape url https://10.93.152.111:9100/metrics for pod openshift-monitoring/node-exporter-ts9kq is accessible without authorization: context deadline exceeded",
            err: <context.deadlineExceededError>{},
        },
        <*fmt.wrapError | 0xc002ce6080>{
            msg: "the scrape url https://10.93.152.111:9103/metrics for pod openshift-ovn-kubernetes/ovnkube-node-7j7z4 is accessible without authorization: context deadline exceeded",
            err: <context.deadlineExceededError>{},
        },
        <*fmt.wrapError | 0xc00119e040>{
            msg: "the scrape url https://10.93.152.111:9105/metrics for pod openshift-ovn-kubernetes/ovnkube-node-7j7z4 is accessible without authorization: context deadline exceeded",
            err: <context.deadlineExceededError>{},
        },
    ]
to be empty}
The pods referenced here do not appear in the final artifacts collected from the cluster, which makes me wonder whether the result is being influenced by the tests that add nodes temporarily.
Some jobs get a mixture of those missing pods, and the kube-system errors.
Filed by: dgoodwin@redhat.com
- clones: OCPBUGS-61193 [Monitoring] New prometheus targets auth test failing too often (Verified)
- is blocked by: OCPBUGS-61193 [Monitoring] New prometheus targets auth test failing too often (Verified)
- links to