Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: None
Affects Version/s: 4.18
Component/s: Unknown
Labels:

Activity Type:
None
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

CI Failure Analysis Report

Job URL: ocp-spoke-assisted-operator-cim-ui-deploy/4427
Build Number: 4427
Jenkins Agent: ocp-edge09.lab.eng.tlv2.redhat.com

1. Failure Classification

Infrastructure

2. Confidence Level

High (95%)

3. Root Cause Analysis

The CI failure was caused by infrastructure instability and environment setup issues, which prevented the Cypress tests from executing successfully.

Registry/Image Pull Failure: During environment setup, the `glauth` pod in the `qe-ldap` namespace failed with an `ImagePullBackOff` error. This prevented the LDAP identity provider from starting, so subsequent authentication failed (`oc login` returned `500 Internal Server Error`).

Backend Instability: The console logs show multiple `500 Internal Server Error` responses for the MCE proxy endpoint:
GET .../api/proxy/plugin/mce/console/multicloud/multiclusterhub/components Status: 500
This suggests that the Multi-Cluster Engine (MCE) backend or the console proxy was unhealthy, which prevented the UI from rendering correctly.

Cypress Test Failure: The test "OCP-51999: [cim UI] create hypershift cluster with a single node pool + x workers" failed with an `AssertionError` because it could not find the element `#form-input-clusterNetworks-0-cidr-field`. The field did not appear because the unhealthy MCE backend was not returning the data the UI needs to render it.

Reporting Failure: Polarion reporting failed with a `PermissionDeniedException` because the build attempted to import results into a Test Run (`OS-20251204-1055`) that was already marked as finished.
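
The image pull failure described above could be detected automatically before the UI tests start. The following is a minimal sketch, not the CI job's actual logic. It assumes the pod list has been captured as JSON (for example from `oc get pods -n qe-ldap -o json`) and relies on the standard Kubernetes pod status layout. The function name `find_image_pull_failures` and the sample pod name `glauth-example` are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Kubelet waiting reasons that indicate the container image could not be pulled.
IMAGE_PULL_REASONS = {"ImagePullBackOff", "ErrImagePull"}


def find_image_pull_failures(pod_list: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (namespace, pod, container) for every container stuck on an image pull."""
    failures = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        # Init containers can fail to pull as well, so inspect both lists.
        container_statuses = (
            status.get("initContainerStatuses", [])
            + status.get("containerStatuses", [])
        )
        for cs in container_statuses:
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") in IMAGE_PULL_REASONS:
                failures.append(
                    (meta.get("namespace", ""), meta.get("name", ""), cs.get("name", ""))
                )
    return failures


# Hypothetical sample, shaped like the JSON pod list described above.
sample_pods = {
    "items": [
        {
            "metadata": {"namespace": "qe-ldap", "name": "glauth-example"},
            "status": {
                "containerStatuses": [
                    {"name": "glauth", "state": {"waiting": {"reason": "ImagePullBackOff"}}}
                ]
            },
        },
        {
            "metadata": {"namespace": "qe-ldap", "name": "healthy-example"},
            "status": {
                "containerStatuses": [{"name": "app", "state": {"running": {}}}]
            },
        },
    ]
}

image_pull_failures = find_image_pull_failures(sample_pods)
```

A setup step could fail fast when this list is non-empty, which would report the registry problem directly instead of a later Cypress assertion error.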

4. Breaking PR Identification

No breaking PR was identified. The root cause is attributed to transient infrastructure issues (registry connectivity and backend service health) rather than a specific code change in the product or test suite.

5. Recommended Next Steps

Retry the Build: Given the `ImagePullBackOff` and `500` errors, this is likely a transient environment issue. A retry on a healthy node/cluster should be the first step.

Investigate Registry Connectivity: Check why the `glauth` image could not be pulled and ensure the registry is accessible from the test environment.

Verify MCE Health: Ensure the `multicluster-engine` operator and its components are fully reconciled and healthy on the hub cluster before starting UI tests.

Polarion Test Run Management: If reporting is required, reopen the Test Run `OS-20251204-1055` or configure the CI to use a new, active Test Run.
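
The "Verify MCE Health" step above could be expressed as a pre-flight gate that polls a health check a bounded number of times before the UI tests run. This is a sketch under stated assumptions: it assumes the `MultiClusterEngine` resource has been fetched as JSON and that a healthy resource reports `status.phase` as `Available`, which should be confirmed against the cluster in use. The helper names `mce_is_available` and `wait_until_healthy` are hypothetical.

```python
import time
from typing import Any, Callable, Dict


def mce_is_available(mce: Dict[str, Any], expected_phase: str = "Available") -> bool:
    """Check the resource's status.phase (the expected value is an assumption)."""
    return mce.get("status", {}).get("phase") == expected_phase


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 5,
    delay_s: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` up to `attempts` times, sleeping `delay_s` between failed tries."""
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            sleep(delay_s)
    return False


# Hypothetical sequence of observed states: still reconciling, then healthy.
observed = iter([
    {"status": {"phase": "Progressing"}},
    {"status": {"phase": "Available"}},
])
recorded_sleeps = []  # Record sleeps instead of actually waiting in this example.
became_healthy = wait_until_healthy(
    lambda: mce_is_available(next(observed)),
    attempts=3,
    delay_s=1.0,
    sleep=recorded_sleeps.append,
)
```

In a real job, the `check` callable would fetch the current resource state on each call. Injecting `sleep` keeps the gate testable without real delays.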

Assignee: Unassigned

Reporter: Udi Greener

Need Info From: None

Votes: 0

Watchers: 2

Created: 2026/03/03 3:53 PM

Updated: 2026/03/04 3:43 AM

Resolved: 2026/03/03 4:04 PM
