Type: Bug
Resolution: Unresolved
Priority: Undefined
Affects Version: 4.18.z
Impact: Quality / Stability / Reliability
Severity: Critical
Description of problem:
Unable to access the web console for the HCP cluster that is used as the ACM hub. The following error messages are displayed:
2025-01-22T11:07:49.870Z ERROR operator.ingress_controller controller/controller.go:116 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:\nerror sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (x83 over 1h22m0s))"}
2025-01-22T11:07:49.870Z INFO operator.ingress_controller controller/controller.go:116 reconciling {"request": {"name":"default","namespace":"openshift-ingress-operator"}}
2025-01-22T11:07:49.889Z INFO operator.status_controller controller/controller.go:116 Reconciling {"request": {"name":"default","namespace":"openshift-ingress-operator"}}
2025-01-22T11:07:49.942Z ERROR operator.ingress_controller controller/controller.go:116 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:\nerror sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (x83 over 1h22m0s))"}
2025-01-22T11:08:45.290Z INFO operator.ingress_controller controller/controller.go:116 reconciling {"request": {"name":"default","namespace":"openshift-ingress-operator"}}
2025-01-22T11:08:45.357Z ERROR operator.ingress_controller controller/controller.go:116 got retryable error; requeueing {"after": "1m0s", "error": "IngressController is degraded: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:\nerror sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers) (x83 over 1h22m0s))"}
2025-01-22T11:08:49.767Z ERROR operator.canary_controller wait/backoff.go:226 error performing canary route check {"error": "error sending canary HTTP Request: Timeout: Get \"https://canary-openshift-ingress-canary.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"}
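For reference, the "Client.Timeout exceeded while awaiting headers" failure above is a plain client-side HTTP deadline: the canary probe gives up before the route returns any response headers. The sketch below is a minimal, self-contained stand-in for that failure mode, not the ingress operator's actual probe code; the local slow server plays the role of the unreachable canary route, and the 1-second deadline is illustrative.

```python
# Sketch of the canary failure mode: an HTTP GET that hits the client
# timeout before response headers arrive. Assumptions: the slow local
# server and the 1-second deadline are illustrative stand-ins only.
import http.server
import threading
import time
import urllib.request


class SlowHandler(http.server.BaseHTTPRequestHandler):
    """Never answers within the client's deadline, like a wedged route."""

    def do_GET(self):
        time.sleep(3)  # longer than the client timeout below
        self.send_response(200)
        self.end_headers()

    def log_message(self, *args):  # keep output quiet
        pass


# Bind to an ephemeral port and serve in the background.
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_address[1]}/"

try:
    urllib.request.urlopen(url, timeout=1)  # analogous probe deadline
    outcome = "response received"
except OSError:
    # socket.timeout/URLError both derive from OSError; this is the
    # same situation Go reports as "Client.Timeout exceeded while
    # awaiting headers".
    outcome = "timeout"
finally:
    server.shutdown()

print(outcome)  # → timeout
```

Against the live cluster, the equivalent manual check is simply a GET of the canary route URL from the log messages with a short client timeout.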
All the nodes of the HCP cluster are up:

oc get nodes
NAME                            STATUS   ROLES    AGE     VERSION
hcp418-rack03-hub-zblvr-95g4m   Ready    worker   5d20h   v1.31.3
hcp418-rack03-hub-zblvr-btnsl   Ready    worker   5d20h   v1.31.3
hcp418-rack03-hub-zblvr-k2lt2   Ready    worker   5d20h   v1.31.3
oc get co
NAME                                       VERSION       AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
console                                    4.18.0-rc.1   False       False         True       25h     RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com): Get "https://console-openshift-console.apps.hcp418-rack03-hub.apps.rackm03.mydomain.com": EOF
csi-snapshot-controller                    4.18.0-rc.1   True        False         False      94m
dns                                        4.18.0-rc.1   True        False         False      86m
image-registry                             4.18.0-rc.1   True        False         False      86m
ingress                                    4.18.0-rc.1   True        False         True       86m     The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 1 error messages:...
insights                                   4.18.0-rc.1   True        False         False      3h24m
kube-apiserver                             4.18.0-rc.1   True        False         False      12d
kube-controller-manager                    4.18.0-rc.1   True        False         False      12d
kube-scheduler                             4.18.0-rc.1   True        False         False      12d
kube-storage-version-migrator              4.18.0-rc.1   True        False         False      86m
monitoring                                 4.18.0-rc.1   False       True          True       70m     UpdatingMetricsServer: reconciling MetricsServer Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/metrics-server: context deadline exceeded: got 2 unavailable replicas
network                                    4.18.0-rc.1   True        False         False      12d
node-tuning                                4.18.0-rc.1   True        False         False      88m
openshift-apiserver                        4.18.0-rc.1   True        False         False      12d
openshift-controller-manager               4.18.0-rc.1   True        False         False      12d
openshift-samples                          4.18.0-rc.1   True        False         False      12d
operator-lifecycle-manager                 4.18.0-rc.1   True        False         False      12d
operator-lifecycle-manager-catalog         4.18.0-rc.1   True        False         False      12d
operator-lifecycle-manager-packageserver   4.18.0-rc.1   True        False         False      12d
service-ca                                 4.18.0-rc.1   True        False         False      12d
storage                                    4.18.0-rc.1   True        False         False      12d
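When several ClusterOperators report problems at once, as above, it can help to reduce the `oc get co` output to just the unhealthy entries. The helper below is a hypothetical triage sketch, not an `oc` feature; it assumes the default column order NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE.

```python
# Sketch: pick out unavailable or degraded ClusterOperators from plain
# `oc get co` text. `unhealthy_operators` is a hypothetical helper;
# column order is assumed from default `oc get co` output.
def unhealthy_operators(oc_get_co_output: str) -> list[str]:
    """Return names of operators that are unavailable or degraded."""
    bad = []
    for line in oc_get_co_output.strip().splitlines()[1:]:  # skip header
        fields = line.split(None, 6)  # MESSAGE may itself contain spaces
        if len(fields) < 6:
            continue  # malformed or truncated row
        name, _version, available, _progressing, degraded = fields[:5]
        if available != "True" or degraded == "True":
            bad.append(name)
    return bad


# Trimmed sample based on the output in this report.
sample = """\
NAME      VERSION      AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
console   4.18.0-rc.1  False       False         True       25h     RouteHealthAvailable: failed to GET route
ingress   4.18.0-rc.1  True        False         True       86m     canary route checks are failing
dns       4.18.0-rc.1  True        False         False      86m"""

print(unhealthy_operators(sample))  # → ['console', 'ingress']
```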
The OCP platform infrastructure and deployment type (AWS, Bare Metal, VMware, etc. Please clarify if it is platform agnostic deployment), (IPI/UPI):
HCI rack with Fusion operator provider cluster
The ODF deployment type (Internal, External, Internal-Attached (LSO), Multicluster, DR, Provider, etc): Provider
The version of all relevant components (OCP, ODF, RHCS, ACM whichever is applicable):
OCP: 4.18.0-rc.1
FDF: 4.18.0-100
rack: rackm03
hcp cluster: hcp418-rack03-hub
Does this issue impact your ability to continue to work with the product? Yes.
Is there any workaround available to the best of your knowledge?
Yes: creating a new HCP cluster and bringing it up as the ACM hub.
Can this issue be reproduced? If so, please provide the hit rate:
Yes, it has appeared before, but it cannot be replicated on demand.
Can this issue be reproduced from the UI? Yes
If this is a regression, please provide more details to justify this:
The exact date and time when the issue was observed, including timezone details:
Observed from the morning of Tuesday, 21 January 2025.
Logs collected and log location:
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. Create Provider cluster of HCI fusion rack
2. Create an HCP cluster
Actual results:
Unable to access the web console for the HCP cluster that is used as the ACM hub. The error messages quoted in the description are displayed.
Expected results:
The web console for the HCP cluster should be accessible.
Additional info:
Attached must gather: http://rhsqe-repo.lab.eng.blr.redhat.com/ocs4qe/OCPBUGS-48733/