- Bug
- Resolution: Unresolved
- Normal
- 4.13, 4.14
- Quality / Stability / Reliability
- False
- 3
- Moderate
- No
- Rejected
- Sprint 244, Sprint 245, Sprint 246, Sprint 247, Sprint 248, Sprint 249, Sprint 250, Sprint 251, Sprint 252, Sprint 253, Sprint 254, NE Sprint 255, NE Sprint 256, NE Sprint 257, NE Sprint 258, NE Sprint 259, NE Sprint 260, NE Sprint 261, NE Sprint 262, NE Sprint 263, NE Sprint 264
- 21
Description of problem
CI is flaky because of test failures such as the following:
TestAll/parallel/TestManagedDNSToUnmanagedDNSIngressController
=== RUN   TestAll/parallel/TestManagedDNSToUnmanagedDNSIngressController
    util_test.go:106: retrying client call due to: Get "http://168.61.75.99": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    util_test.go:106: retrying client call due to: Get "http://168.61.75.99": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    util_test.go:106: retrying client call due to: Get "http://168.61.75.99": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    util_test.go:106: retrying client call due to: Get "http://168.61.75.99": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    util_test.go:106: retrying client call due to: Get "http://168.61.75.99": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
    util_test.go:551: verified connectivity with workload with req http://168.61.75.99 and response 200
    unmanaged_dns_test.go:148: Updating ingresscontroller managed-migrated to dnsManagementPolicy=Unmanaged
    unmanaged_dns_test.go:161: Waiting for stable conditions on ingresscontroller managed-migrated after dnsManagementPolicy=Unmanaged
    unmanaged_dns_test.go:177: verifying conditions on DNSRecord zone {ID:/subscriptions/d38f1e38-4bed-438e-b227-833f997adf6a/resourceGroups/ci-op-k8s8zfit-04a70-rdnbw-rg/providers/Microsoft.Network/privateDnsZones/ci-op-k8s8zfit-04a70.ci.azure.devcluster.openshift.com Tags:map[]}
    unmanaged_dns_test.go:177: DNSRecord zone expected to have status=Unknown but got status=True
    panic.go:522: deleted ingresscontroller managed-migrated
This particular failure comes from https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/970/pull-ci-openshift-cluster-ingress-operator-master-e2e-azure-operator/1690101593501863936. Search.ci has other similar failures.
Version-Release number of selected component (if applicable)
I have seen this in recent 4.14 CI job runs. I also found a failure from February 2023, before the 4.13 branch cut in March 2023, so these failures go back at least to 4.13: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_cluster-ingress-operator/874/pull-ci-openshift-cluster-ingress-operator-master-e2e-azure-operator/1626100610514292736
How reproducible
Presently, search.ci shows the following stats for the past 14 days:
Found in 6.98% of runs (14.29% of failures) across 43 total runs and 1 jobs (48.84% failed)
pull-ci-openshift-cluster-ingress-operator-master-e2e-azure-operator (all) - 43 runs, 49% failed, 14% of failures match = 7% impact
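To make the relationship between these percentages concrete, they can be recomputed from raw counts. This is only a sketch: search.ci reports the 43 total runs, but the counts of 21 failed runs and 3 matching failures are assumptions, inferred because they are the integers that reproduce the reported percentages. The function name flake_stats is made up for illustration.

```python
def flake_stats(total_runs, failed_runs, matching_failures):
    """Return (failure rate, share of failures that match, overall impact) in percent."""
    failure_rate = 100 * failed_runs / total_runs       # share of all runs that failed
    match_share = 100 * matching_failures / failed_runs  # share of failures showing this flake
    impact = 100 * matching_failures / total_runs        # share of all runs hit by this flake
    return round(failure_rate, 2), round(match_share, 2), round(impact, 2)


# 43 runs is reported by search.ci; 21 and 3 are inferred, not reported.
print(flake_stats(43, 21, 3))  # (48.84, 14.29, 6.98)
```

Under those assumed counts, roughly one failed run in seven is caused by this test, which is about 3 of the 43 runs in the 14-day window.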
Steps to Reproduce
1. Post a PR and have bad luck.
2. Check search.ci using the link above.
Actual results
CI fails.
Expected results
CI passes, or fails on some other test failure.