Bug
Resolution: Unresolved
4.18.z
Quality / Stability / Reliability
Critical
Description of problem:
Unable to access the web console for the HCP cluster that is used as the ACM hub. The following error messages are displayed:

2025-01-22T11:07:49.870Z ERROR operator.ingress_controller controller/controller.go:116 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:\nerror sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (x83 over 1h22m0s))"}
2025-01-22T11:07:49.870Z INFO operator.ingress_controller controller/controller.go:116 reconciling {"request": {"name":"default","namespace":"openshift-ingress-operator"}}
2025-01-22T11:07:49.889Z INFO operator.status_controller controller/controller.go:116 Reconciling {"request": {"name":"default","namespace":"openshift-ingress-operator"}}
2025-01-22T11:07:49.942Z ERROR operator.ingress_controller controller/controller.go:116 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:\nerror sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (x83 over 1h22m0s))"}
2025-01-22T11:08:45.290Z INFO operator.ingress_controller controller/controller.go:116 reconciling {"request": {"name":"default","namespace":"openshift-ingress-operator"}}
2025-01-22T11:08:45.357Z ERROR operator.ingress_controller controller/controller.go:116 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:\nerror sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (x83 over 1h22m0s))"}
2025-01-22T11:08:49.767Z ERROR operator.canary_controller wait/backoff.go:226 error performing canary route check {"error": "error sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}

All the nodes of the HCP cluster are up:

$ oc get nodes
NAME                            STATUS   ROLES    AGE     VERSION
hcp418-rack03-hub-zblvr-95g4m   Ready    worker   5d20h   v1.31.3
hcp418-rack03-hub-zblvr-btnsl   Ready    worker   5d20h   v1.31.3
hcp418-rack03-hub-zblvr-k2lt2   Ready    worker   5d20h   v1.31.3

$ oc get co
NAME                                       VERSION       AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
console                                    4.18.0-rc.1   False       False         True       25h     RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com): Get "https://console-openshift-console.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com": EOF
csi-snapshot-controller                    4.18.0-rc.1   True        False         False      94m
dns                                        4.18.0-rc.1   True        False         False      86m
image-registry                             4.18.0-rc.1   True        False         False      86m
ingress                                    4.18.0-rc.1   True        False         True       86m     The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:...
insights                                   4.18.0-rc.1   True        False         False      3h24m
kube-apiserver                             4.18.0-rc.1   True        False         False      12d
kube-controller-manager                    4.18.0-rc.1   True        False         False      12d
kube-scheduler                             4.18.0-rc.1   True        False         False      12d
kube-storage-version-migrator              4.18.0-rc.1   True        False         False      86m
monitoring                                 4.18.0-rc.1   False       True          True       70m     UpdatingMetricsServer: reconciling MetricsServer Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/metrics-server: context deadline exceeded: got 2 unavailable replicas
network                                    4.18.0-rc.1   True        False         False      12d
node-tuning                                4.18.0-rc.1   True        False         False      88m
openshift-apiserver                        4.18.0-rc.1   True        False         False      12d
openshift-controller-manager               4.18.0-rc.1   True        False         False      12d
openshift-samples                          4.18.0-rc.1   True        False         False      12d
operator-lifecycle-manager                 4.18.0-rc.1   True        False         False      12d
operator-lifecycle-manager-catalog         4.18.0-rc.1   True        False         False      12d
operator-lifecycle-manager-packageserver   4.18.0-rc.1   True        False         False      12d
service-ca                                 4.18.0-rc.1   True        False         False      12d
storage                                    4.18.0-rc.1   True        False         False      12d

The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc. Please clarify if it is a platform-agnostic deployment), (IPI/UPI): HCI rack with Fusion operator provider cluster
The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc): Provider
The version of all relevant components (OCP, ODF, RHCS, ACM, whichever is applicable):
  OCP: 4.18.0-rc.1
  FDF: 4.18.0-100
  rack: rackm03
  hcp cluster: hcp418-rack03-hub
Does this issue impact your ability to continue to work with the product? Yes
Is there any workaround available to the best of your knowledge? Yes, creating a new HCP cluster and bringing it up as the ACM hub.
Can this issue be reproduced? If so, please provide the hit rate: Yes, it has appeared earlier, but it cannot be replicated reliably.
Can this issue be reproduced from the UI? Yes
If this is a regression, please provide more details to justify this:
The exact date and time when the issue was observed, including timezone details: Observed from Tuesday, 21st Jan 2025, morning
Logs collected and log location:
Additional info:
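The `oc get co` output above can be narrowed to the operators actually reporting Degraded=True. A minimal sketch (it assumes the output was saved with `oc get co > /tmp/co.txt`; a trimmed sample from this report is inlined so the snippet is self-contained):

```shell
# Minimal sketch: list clusteroperators reporting DEGRADED=True.
# The sample below is a trimmed copy of the `oc get co` output in this report;
# against a live cluster, redirect the real output to /tmp/co.txt instead.
cat > /tmp/co.txt <<'EOF'
NAME         VERSION       AVAILABLE   PROGRESSING   DEGRADED   SINCE
console      4.18.0-rc.1   False       False         True       25h
dns          4.18.0-rc.1   True        False         False      86m
ingress      4.18.0-rc.1   True        False         True       86m
monitoring   4.18.0-rc.1   False       True          True       70m
EOF
# DEGRADED is column 5; NR > 1 skips the header row.
DEGRADED=$(awk 'NR > 1 && $5 == "True" {print $1}' /tmp/co.txt)
echo "$DEGRADED"
```

Run against the live output, this immediately narrows the investigation to console, ingress, and monitoring, which matches the degraded operators in the report.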
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. Create a Provider cluster on the HCI Fusion rack.
2. Create an HCP cluster.
Actual results:
Unable to access the web console for the HCP cluster that is used as the ACM hub. The error messages shown in the description are displayed.
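The failing canary check can be probed manually from any host with DNS access to the cluster's apps domain. A sketch, with the hostname taken from the operator logs above (the 10-second `--max-time` is an assumption that mirrors the client timeout seen in the error, not a value from the report):

```shell
# Hypothetical manual probe of the ingress canary route from this report.
# --max-time approximates the "Client.Timeout exceeded while awaiting headers"
# behaviour in the operator logs; the exact timeout value is an assumption.
CANARY_URL="https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com"
if curl -sk --max-time 10 -o /dev/null "$CANARY_URL"; then
  RESULT="reachable"
else
  RESULT="unreachable"   # matches the reported timeout symptom
fi
echo "canary route: $RESULT"
```

If this times out while `oc get nodes` shows Ready workers (as in the report), the failure sits between the external client and the router pods rather than in node health.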
Expected results:
The web console for the HCP cluster should be accessible.
Additional info:
Attached must-gather: http://rhsqe-repo.lab.eng.blr.redhat.com/ocs4qe/OCPBUGS-48733/