- Bug
- Resolution: Done-Errata
- Normal
- None
- 4.14
- Low
- No
- False
-
- N/A
- Release Note Not Required
Description of problem
CI is flaky because of test failures such as the following:
[sig-storage] PersistentVolumes-local Stress with local volumes [Serial] should be able to process many pods and reuse local volumes [Suite:openshift/conformance/serial] [Suite:k8s] { fail [test/e2e/storage/persistent_volumes-local.go:522]: persistentvolumes "local-pvlh4qq" not found Error: exit with code 1 Ginkgo exit error 1: exit with code 1}
This particular failure comes from https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/927/pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-ovn-serial/1668289389911871488. Search.ci has other similar failures.
Version-Release number of selected component (if applicable)
I have seen this in 4.14 CI jobs.
How reproducible
Presently, search.ci shows the following stats for the past 7 days:
Found in 0.01% of runs (0.05% of failures) across 127886 total runs and 7151 jobs (17.99% failed)
pull-ci-openshift-oc-master-e2e-aws-ovn-serial (all) - 69 runs, 33% failed, 9% of failures match = 3% impact
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial (all) - 27 runs, 59% failed, 25% of failures match = 15% impact
pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-ovn-serial (all) - 16 runs, 44% failed, 14% of failures match = 6% impact
openshift-openshift-apiserver-371-nightly-4.14-e2e-aws-ovn-single-node-serial (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
pull-ci-openshift-origin-master-e2e-aws-ovn-serial (all) - 38 runs, 34% failed, 8% of failures match = 3% impact
openshift-cluster-config-operator-307-nightly-4.14-e2e-aws-sdn-serial (all) - 1 runs, 100% failed, 100% of failures match = 100% impact
pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-serial (all) - 17 runs, 24% failed, 25% of failures match = 6% impact
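The per-job "impact" figures above appear consistent with multiplying a job's failure rate by the share of its failures that match the search. The following is a minimal sketch that checks this against the rounded percentages listed; the `impact_pct` and `mismatches` helpers are hypothetical and not part of search.ci, which may well compute impact from raw run counts instead.

```python
# Hypothetical helper, not part of search.ci: estimates impact as
# (percent of runs that failed) x (percent of failures that match).
def impact_pct(failed_pct: int, match_pct: int) -> int:
    return round(failed_pct * match_pct / 100)


# (job name, % failed, % of failures matching, reported % impact),
# copied from the search.ci stats in this report.
ROWS = [
    ("pull-ci-openshift-oc-master-e2e-aws-ovn-serial", 33, 9, 3),
    ("pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial", 59, 25, 15),
    ("pull-ci-openshift-cluster-ingress-operator-master-e2e-aws-ovn-serial", 44, 14, 6),
    ("openshift-openshift-apiserver-371-nightly-4.14-e2e-aws-ovn-single-node-serial", 100, 100, 100),
    ("pull-ci-openshift-origin-master-e2e-aws-ovn-serial", 34, 8, 3),
    ("openshift-cluster-config-operator-307-nightly-4.14-e2e-aws-sdn-serial", 100, 100, 100),
    ("pull-ci-openshift-cluster-network-operator-master-e2e-aws-ovn-serial", 24, 25, 6),
]


def mismatches(rows):
    """Return the job names whose reported impact differs from the estimate."""
    return [
        name
        for name, failed, match, reported in rows
        if impact_pct(failed, match) != reported
    ]


if __name__ == "__main__":
    for name, failed, match, reported in ROWS:
        print(f"{name}: estimated {impact_pct(failed, match)}%, reported {reported}%")
```

Because the inputs are already rounded, the estimate could drift by a point for other jobs; it happens to agree for all seven rows here. Note that the two 100% impact rows each rest on a single run.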
Steps to Reproduce
1. Post a PR and have bad luck.
2. Check search.ci: https://search.ci.openshift.org/?search=fail+%5C%5Btest%2Fe2e%2Fstorage%2Fpersistent_volumes-local%5C.go&maxAge=168h&context=1&type=bug%2Bissue%2Bjunit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job
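The search.ci link in step 2 can be rebuilt programmatically, for example to rerun the query with a different time window. This is a minimal sketch: the parameter names and values are copied from the link above, while the `build_search_url` helper and its `max_age` argument are hypothetical. The search term is a regular expression, so the `[` and `.` are backslash-escaped before URL encoding.

```python
from urllib.parse import urlencode

BASE_URL = "https://search.ci.openshift.org/"


# Hypothetical helper; the query parameters mirror the link in step 2.
def build_search_url(max_age: str = "168h") -> str:
    params = {
        # Regex matching the failure location reported by the test.
        "search": r"fail \[test/e2e/storage/persistent_volumes-local\.go",
        "maxAge": max_age,  # 168h is the 7-day window used in this report
        "context": "1",
        "type": "bug+issue+junit",
        "name": "",
        "excludeName": "",
        "maxMatches": "5",
        "maxBytes": "20971520",
        "groupBy": "job",
    }
    # urlencode percent-encodes each value and turns spaces into '+'.
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url())
```

With the default argument, the function reproduces the link in step 2 exactly.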
Actual results
CI fails.
Expected results
CI passes, or fails on some other test failure.
Additional info
In the search.ci results, the failing jobs all appear to run on the AWS platform. Single-node OpenShift (SNO) jobs seem to be especially impacted.
- links to
- RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update