-
Bug
-
Resolution: Unresolved
-
Normal
-
None
-
4.16, 4.17
-
Critical
-
Yes
-
2
-
NE Sprint 258
-
1
-
Rejected
-
False
-
Bug Fix
-
Done
Description of problem:
co/ingress always reports as healthy even though the ingress operator pod logs canary check errors:
2024-07-24T06:42:09.580Z ERROR operator.canary_controller wait/backoff.go:226 error performing canary route check {"error": "error sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hongli-aws.qe.devcluster.openshift.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
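For triage, it can help to pull canary failures out of the operator log mechanically. The following is a minimal sketch, assuming the whitespace-separated layout of the line quoted above (timestamp, level, logger, caller, message, optional JSON payload). The function names `parse_operator_log_line` and `is_canary_failure` are hypothetical helpers for illustration and are not part of the operator.

```python
import json

# Sample line copied from this report.
SAMPLE = r'''2024-07-24T06:42:09.580Z ERROR operator.canary_controller wait/backoff.go:226 error performing canary route check {"error": "error sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hongli-aws.qe.devcluster.openshift.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}'''


def parse_operator_log_line(line: str) -> dict:
    """Split one log line into fields (hypothetical helper).

    Assumes: timestamp, level, logger, caller, then a free-text message
    optionally followed by a JSON object that starts at the first '{'.
    """
    timestamp, level, logger, caller, rest = line.strip().split(None, 4)
    brace = rest.find("{")
    if brace == -1:
        message, payload = rest, {}
    else:
        message = rest[:brace].strip()
        payload = json.loads(rest[brace:])
    return {
        "timestamp": timestamp,
        "level": level,
        "logger": logger,
        "caller": caller,
        "message": message,
        "payload": payload,
    }


def is_canary_failure(entry: dict) -> bool:
    """True for ERROR entries from the canary controller about the route check."""
    return (
        entry["level"] == "ERROR"
        and entry["logger"] == "operator.canary_controller"
        and "canary route check" in entry["message"]
    )


if __name__ == "__main__":
    entry = parse_operator_log_line(SAMPLE)
    print(entry["timestamp"], is_canary_failure(entry))
```

Counting such entries over a time window gives a rough measure of how long the canary has been failing while co/ingress still reports Available.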
Version-Release number of selected component (if applicable):
4.17.0-0.nightly-2024-07-20-191204
How reproducible:
100%
Steps to Reproduce:
1. Install an AWS cluster.
2. Update ingresscontroller/default by adding "endpointPublishingStrategy.loadBalancer.allowedSourceRanges", e.g.:
   spec:
     endpointPublishingStrategy:
       loadBalancer:
         allowedSourceRanges:
         - 1.1.1.2/32
3. The setting above drops most traffic to the LB, so some operators become degraded.
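Step 2 can be scripted. The sketch below only builds a merge patch containing the fields shown in the report and prints an `oc patch` command line; it does not run anything against a cluster. It assumes the default IngressController lives in the `openshift-ingress-operator` namespace and that a merge patch with only these fields is accepted on the cluster in question (the API may require additional fields under `endpointPublishingStrategy`). The helper names are hypothetical.

```python
import ipaddress
import json
import shlex


def build_allowed_source_ranges_patch(cidrs):
    """Return a merge-patch dict limited to the fields shown in the report."""
    for cidr in cidrs:
        # Raises ValueError for malformed CIDRs or CIDRs with host bits set.
        ipaddress.ip_network(cidr, strict=True)
    return {
        "spec": {
            "endpointPublishingStrategy": {
                "loadBalancer": {"allowedSourceRanges": list(cidrs)}
            }
        }
    }


def oc_patch_command(cidrs):
    """Build the argv for `oc patch` (assumed namespace, see lead-in)."""
    patch = json.dumps(build_allowed_source_ranges_patch(cidrs))
    return [
        "oc", "patch", "ingresscontroller/default",
        "-n", "openshift-ingress-operator",
        "--type=merge", "-p", patch,
    ]


if __name__ == "__main__":
    # Prints a shell-quoted command that can be reviewed before running it.
    print(shlex.join(oc_patch_command(["1.1.1.2/32"])))
```

Printing the command instead of executing it keeps the sketch safe to run anywhere, since applying this range on a real cluster cuts off most traffic to the load balancer.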
Actual results:
co/authentication and co/console are degraded, but co/ingress still reports healthy:
$ oc get co
NAME             VERSION                              AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
authentication   4.17.0-0.nightly-2024-07-20-191204   False       False         True       22m     OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.hongli-aws.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
console          4.17.0-0.nightly-2024-07-20-191204   False       False         True       22m     RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.hongli-aws.qe.devcluster.openshift.com): Get "https://console-openshift-console.apps.hongli-aws.qe.devcluster.openshift.com": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
ingress          4.17.0-0.nightly-2024-07-20-191204   True        False         False      3h58m
Checking the ingress operator log shows:
2024-07-24T06:59:09.588Z ERROR operator.canary_controller wait/backoff.go:226 error performing canary route check {"error": "error sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hongli-aws.qe.devcluster.openshift.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
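The mismatch in the status output above can be checked mechanically. This is a rough sketch, assuming the default column layout of `oc get co` (NAME, VERSION, AVAILABLE, PROGRESSING, DEGRADED, SINCE, then an optional free-text MESSAGE). The sample rows are copied from this report; the function names are hypothetical.

```python
SAMPLE = """NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
authentication 4.17.0-0.nightly-2024-07-20-191204 False False True 22m OAuthServerRouteEndpointAccessibleControllerAvailable: Get "https://oauth-openshift.apps.hongli-aws.qe.devcluster.openshift.com/healthz": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
console 4.17.0-0.nightly-2024-07-20-191204 False False True 22m RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.hongli-aws.qe.devcluster.openshift.com): Get "https://console-openshift-console.apps.hongli-aws.qe.devcluster.openshift.com": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
ingress 4.17.0-0.nightly-2024-07-20-191204 True False False 3h58m
"""


def parse_cluster_operators(text: str) -> list:
    """Parse `oc get co` text output into one dict per operator."""
    rows = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("NAME"):
            continue  # skip blank lines and the header row
        # The first six columns are single tokens; MESSAGE may contain spaces.
        fields = line.split(None, 6)
        name, version, available, progressing, degraded, since = fields[:6]
        rows.append({
            "name": name,
            "version": version,
            "available": available,
            "progressing": progressing,
            "degraded": degraded,
            "since": since,
            "message": fields[6] if len(fields) > 6 else "",
        })
    return rows


def unhealthy(rows: list) -> list:
    """Names of operators that are not Available or are Degraded."""
    return [
        r["name"] for r in rows
        if r["available"] != "True" or r["degraded"] == "True"
    ]


if __name__ == "__main__":
    print(unhealthy(parse_cluster_operators(SAMPLE)))
```

On the sample from this report, authentication and console are flagged while ingress is not, which is the discrepancy this bug describes.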
Expected results:
co/ingress status should reflect the actual condition in a timely manner.
Additional info:
Even though co/ingress status can be updated in some scenarios, it is always less sensitive than authentication and console. As long as we have to rely on authentication/console to know whether routes are healthy, the ingress canary route serves no purpose.
- blocks
-
OCPBUGS-39220 [Backport-4.17] co/ingress status cannot reflect the real condition
- Closed
- is caused by
-
OCPBUGS-3522 Improve CanaryChecksRepetitiveFailures actionability
- Closed
- is cloned by
-
OCPBUGS-39220 [Backport-4.17] co/ingress status cannot reflect the real condition
- Closed
- is duplicated by
-
OCPBUGS-35071 Canary route checks for the default ingress controller are failing but co/ingress is still available
- Closed
- links to
-
RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update