OCPBUGS-48733

Unable to access webconsole for hcp cluster.

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Undefined
    • Affects Version: 4.18.z
    • Component: HyperShift
    • Bug type: Quality / Stability / Reliability
    • Severity: Critical

      Description of problem:

      Unable to access the web console for the HCP cluster that is used as the ACM hub. The following error messages are displayed:
      
      2025-01-22T11:07:49.870Z ERROR operator.ingress_controller controller/controller.go:116 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:\nerror sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (x83 over 1h22m0s))"}
      2025-01-22T11:07:49.870Z INFO operator.ingress_controller controller/controller.go:116 reconciling {"request": {"name":"default","namespace":"openshift-ingress-operator"}}
      2025-01-22T11:07:49.889Z INFO operator.status_controller controller/controller.go:116 Reconciling {"request": {"name":"default","namespace":"openshift-ingress-operator"}}
      2025-01-22T11:07:49.942Z ERROR operator.ingress_controller controller/controller.go:116 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:\nerror sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (x83 over 1h22m0s))"}
      2025-01-22T11:08:45.290Z INFO operator.ingress_controller controller/controller.go:116 reconciling {"request": {"name":"default","namespace":"openshift-ingress-operator"}}
      2025-01-22T11:08:45.357Z ERROR operator.ingress_controller controller/controller.go:116 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:\nerror sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (x83 over 1h22m0s))"}
      2025-01-22T11:08:49.767Z ERROR operator.canary_controller wait/backoff.go:226 error performing canary route check {"error": "error sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
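      For triage, the failing canary URL and the failure repeat count can be pulled out of operator log lines like the ones above. A minimal sketch (the sample line is abbreviated from the excerpt; the field layout of the patterns is an assumption based on how the ingress operator prints its JSON payload):

```python
import re

# One ERROR line from the ingress operator log, abbreviated from the
# excerpt in this report (escaped quotes kept as the operator prints them).
line = (
    '2025-01-22T11:08:45.357Z ERROR operator.ingress_controller '
    'controller/controller.go:116 got retryable error; requeueing '
    '{"after": "1m0s", "error": "IngressController is degraded: '
    'CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: '
    'Canary route checks for the default ingress controller are failing. '
    'Last 1 error messages:\\nerror sending canary HTTP Request: Timeout: '
    'Get \\"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub'
    '.apps.rackm03.mydomain.com\\": context deadline exceeded '
    '(Client.Timeout exceeded while awaiting headers) (x83 over 1h22m0s))"}'
)

# The canary URL is wrapped in escaped quotes inside the JSON payload.
url_match = re.search(r'Get \\"(https://[^"\\]+)\\"', line)
# "(x83 over 1h22m0s)" means the same error repeated 83 times in 1h22m.
rep_match = re.search(r'\(x(\d+) over ([^)]+)\)', line)

print(url_match.group(1))
print(rep_match.group(1), rep_match.group(2))
```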
      
      All nodes of the HCP cluster are up:
      oc get nodes
      NAME                            STATUS   ROLES    AGE     VERSION
      hcp418-rack03-hub-zblvr-95g4m   Ready    worker   5d20h   v1.31.3
      hcp418-rack03-hub-zblvr-btnsl   Ready    worker   5d20h   v1.31.3
      hcp418-rack03-hub-zblvr-k2lt2   Ready    worker   5d20h   v1.31.3
      
      oc get co
      NAME                                       VERSION       AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
      console                                    4.18.0-rc.1   False       False         True       25h     RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com): Get "https://console-openshift-console.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com": EOF
      csi-snapshot-controller                    4.18.0-rc.1   True        False         False      94m
      dns                                        4.18.0-rc.1   True        False         False      86m
      image-registry                             4.18.0-rc.1   True        False         False      86m
      ingress                                    4.18.0-rc.1   True        False         True       86m     The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:...
      insights                                   4.18.0-rc.1   True        False         False      3h24m
      kube-apiserver                             4.18.0-rc.1   True        False         False      12d
      kube-controller-manager                    4.18.0-rc.1   True        False         False      12d
      kube-scheduler                             4.18.0-rc.1   True        False         False      12d
      kube-storage-version-migrator              4.18.0-rc.1   True        False         False      86m
      monitoring                                 4.18.0-rc.1   False       True          True       70m     UpdatingMetricsServer: reconciling MetricsServer Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/metrics-server: context deadline exceeded: got 2 unavailable replicas
      network                                    4.18.0-rc.1   True        False         False      12d
      node-tuning                                4.18.0-rc.1   True        False         False      88m
      openshift-apiserver                        4.18.0-rc.1   True        False         False      12d
      openshift-controller-manager               4.18.0-rc.1   True        False         False      12d
      openshift-samples                          4.18.0-rc.1   True        False         False      12d
      operator-lifecycle-manager                 4.18.0-rc.1   True        False         False      12d
      operator-lifecycle-manager-catalog         4.18.0-rc.1   True        False         False      12d
      operator-lifecycle-manager-packageserver   4.18.0-rc.1   True        False         False      12d
      service-ca                                 4.18.0-rc.1   True        False         False      12d
      storage                                    4.18.0-rc.1   True        False         False      12d
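      When triaging output like the table above, it can help to filter for the unhealthy operators programmatically. A minimal sketch (the sample table is a subset of the report's `oc get co` output; the column ordering NAME/VERSION/AVAILABLE/PROGRESSING/DEGRADED is the standard one):

```python
# Minimal sketch: flag unhealthy cluster operators from `oc get co` output.
# The sample below is a subset of the table in this report, without the
# MESSAGE column (only the first five columns are used in any case).
CO_OUTPUT = """\
NAME                      VERSION       AVAILABLE   PROGRESSING   DEGRADED   SINCE
console                   4.18.0-rc.1   False       False         True       25h
dns                       4.18.0-rc.1   True        False         False      86m
ingress                   4.18.0-rc.1   True        False         True       86m
monitoring                4.18.0-rc.1   False       True          True       70m
storage                   4.18.0-rc.1   True        False         False      12d
"""

def unhealthy(table: str) -> list[str]:
    """Return operator names that are unavailable or degraded."""
    rows = table.splitlines()[1:]  # skip the header row
    bad = []
    for row in rows:
        fields = row.split()
        name, available, degraded = fields[0], fields[2], fields[4]
        if available == "False" or degraded == "True":
            bad.append(name)
    return bad

# Consistent with the report: console, ingress, and monitoring are unhealthy.
print(unhealthy(CO_OUTPUT))
```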
      
      The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc. Please clarify if it is platform agnostic deployment), (IPI/UPI):
      HCI rack with fusion operator provider cluster
      The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc): Provider
       
       
      The version of all relevant components (OCP, ODF, RHCS, ACM whichever is applicable):
      OCP: 4.18.0-rc.1
      FDF: 4.18.0-100
      rack: rackm03
      hcp cluster: hcp418-rack03-hub
      
      Does this issue impact your ability to continue to work with the product? yes
       
       
      Is there any workaround available to the best of your knowledge?
      Yes: creating a new HCP cluster and bringing it up as the ACM hub.
       
       
      Can this issue be reproduced? If so, please provide the hit rate
      Yes, it has appeared earlier, but it cannot be replicated on demand.
       
      Can this issue be reproduced from the UI? Yes
      If this is a regression, please provide more details to justify this:
      
      The exact date and time when the issue was observed, including timezone details:
      Observed from the morning of Tuesday, 21 Jan 2025.
      
      
      Logs collected and log location:
       
      Additional info:
       

      Version-Release number of selected component (if applicable):

          

      How reproducible:

          

      Steps to Reproduce:

          1. Create a provider cluster on an HCI Fusion rack.
          2. Create an HCP cluster.

      Actual results:

      Unable to access the web console for the HCP cluster that is used as the ACM hub; the error messages shown in the description are displayed.

      Expected results:

          The web console for the HCP cluster should be accessible.

      Additional info:

          Attached must gather: http://rhsqe-repo.lab.eng.blr.redhat.com/ocs4qe/OCPBUGS-48733/

              Assignee: Unassigned
              Reporter: Amrita Mahapatra (ammahapa@redhat.com)
              QA Contact: Liangquan Li