OCPBUGS-77666 (OpenShift Bugs)

[CI Failure] ocp-spoke-assisted-operator-cim-ui-deploy - Build #4427 - Infrastructure

      CI Failure Analysis Report

      Job URL: ocp-spoke-assisted-operator-cim-ui-deploy/4427
      Build Number: 4427
      Jenkins Agent: ocp-edge09.lab.eng.tlv2.redhat.com

      1. Failure Classification

      Infrastructure

      2. Confidence Level

      High (95%)

      3. Root Cause Analysis

      The CI failure was caused by infrastructure instability during environment setup, not by the tests themselves: an image pull failure and an unhealthy backend left the Cypress run without the services it depends on.

      • Registry/Image Pull Failure: During the environment setup, the glauth pod in the qe-ldap namespace failed with an ImagePullBackOff error. This prevented the LDAP identity provider from starting, leading to subsequent authentication failures (oc login returned 500 Internal Server Error).
      • Backend Instability: The console logs show multiple 500 Internal Server Error responses for the MCE proxy endpoint:
        GET .../api/proxy/plugin/mce/console/multicloud/multiclusterhub/components Status: 500
        This suggests that the Multi-Cluster Engine (MCE) backend or the console proxy was unhealthy, preventing the UI from rendering correctly.
      • Cypress Test Failure: The test "OCP-51999: [cim UI] create hypershift cluster with a single node pool + x workers" failed with an AssertionError because the element #form-input-clusterNetworks-0-cidr-field was not found. The field did not appear because the unhealthy MCE backend was not providing the data the UI needs to render it.
      • Reporting Failure: Polarion reporting failed with a PermissionDeniedException because the build attempted to import results into a Test Run (OS-20251204-1055) that was already marked as finished.
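
The four findings above map to distinct strings in the build output, so a first-pass triage can be approximated by counting those strings. The following is a minimal sketch, not the tooling that produced this report: the signal names, the regular expressions, and the rule in `looks_like_infrastructure` are assumptions chosen for illustration, based only on the error strings quoted in this section.

```python
import re
from collections import Counter

# Hypothetical signal names; the patterns are built from the error
# strings quoted in this report, not from any real triage tool.
SIGNALS = {
    "image_pull": re.compile(r"ImagePullBackOff|ErrImagePull"),
    "http_500": re.compile(r"500 Internal Server Error|Status:\s*500"),
    "cypress_assertion": re.compile(r"AssertionError"),
    "polarion_permission": re.compile(r"PermissionDeniedException"),
}


def count_signals(log_lines):
    """Count how many log lines match each failure signal."""
    counts = Counter()
    for line in log_lines:
        for name, pattern in SIGNALS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


def looks_like_infrastructure(counts):
    """Assumed rule: any image pull failure or HTTP 500 points at the
    environment rather than at the product or the test code."""
    return counts["image_pull"] > 0 or counts["http_500"] > 0
```

A match on `cypress_assertion` alone would not be enough to call the failure a test or product bug; in this build it co-occurs with the two infrastructure signals, which is what supports the classification.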

      4. Breaking PR Identification

      No breaking PR was identified. The root cause is attributed to transient infrastructure issues (registry connectivity and backend service health) rather than a specific code change in the product or test suite.

      5. Recommended Next Steps

      Retry the Build: Given the ImagePullBackOff and 500 errors, this is likely a transient environment issue. A retry on a healthy node/cluster should be the first step.

      Investigate Registry Connectivity: Check why the glauth image could not be pulled and ensure the registry is accessible from the test environment.
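
One way to confirm the pull failure before digging into the registry itself is to inspect the pod list for containers stuck in a waiting state. The sketch below assumes the pod list has been saved as JSON (for example from `oc get pods -n qe-ldap -o json`) and relies on the standard Kubernetes pod status layout (`status.containerStatuses[].state.waiting.reason`). The sample pod name and image reference are hypothetical, and init containers are not covered.

```python
# Waiting reasons that Kubernetes reports for failed image pulls.
PULL_FAILURE_REASONS = {"ImagePullBackOff", "ErrImagePull"}


def find_image_pull_failures(pod_list):
    """Return (pod name, image, reason) for each container that is
    waiting on a failed image pull in a parsed pod list."""
    failures = []
    for pod in pod_list.get("items", []):
        name = pod.get("metadata", {}).get("name", "<unknown>")
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for container in statuses:
            # "waiting" is absent (or None) for running/terminated containers.
            waiting = container.get("state", {}).get("waiting") or {}
            reason = waiting.get("reason")
            if reason in PULL_FAILURE_REASONS:
                failures.append((name, container.get("image", ""), reason))
    return failures


# Hypothetical sample data shaped like the JSON pod list; the pod name
# and image reference are made up for illustration.
SAMPLE_POD_LIST = {
    "items": [
        {
            "metadata": {"name": "glauth-example"},
            "status": {
                "containerStatuses": [
                    {
                        "image": "registry.example.com/glauth:latest",
                        "state": {"waiting": {"reason": "ImagePullBackOff"}},
                    }
                ]
            },
        },
        {
            "metadata": {"name": "healthy-example"},
            "status": {
                "containerStatuses": [
                    {"image": "registry.example.com/ok:1", "state": {"running": {}}}
                ]
            },
        },
    ]
}
```

In practice the input would come from `json.loads` on the saved command output, and the image reference in each result indicates which registry to test for reachability from the test environment.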

      Verify MCE Health: Ensure the multicluster-engine operator and its components are fully reconciled and healthy on the hub cluster before starting UI tests.
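
A health gate of this kind is usually a polling loop that blocks the UI tests until a readiness check passes or a timeout expires. The following is a minimal sketch of such a gate. The `check` callable is a placeholder for whatever readiness probe the pipeline uses (for instance, a wrapper around a query of the multicluster-engine status), and the default timeout and interval are assumptions, not values taken from this job.

```python
import time


def wait_until_healthy(check, timeout_s=600.0, interval_s=15.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll `check()` until it returns True or `timeout_s` elapses.

    `check` is a caller-supplied, hypothetical readiness probe.
    `sleep` and `clock` are injectable so the loop can be tested
    without real waiting. Returns True if the check passed in time,
    False otherwise.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A pipeline step could call `wait_until_healthy(mce_is_ready)`, where `mce_is_ready` is a hypothetical function supplied by the CI code, and skip or abort the Cypress stage with a clear infrastructure message when it returns False, instead of letting the tests fail on a missing UI element.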

      Polarion Test Run Management: If reporting is required, reopen the Test Run OS-20251204-1055 or configure the CI to use a new, active Test Run.

      Assignee: Unassigned
      Reporter: Udi Greener (rh-ee-ugreener)