  1. OpenShift Bugs
  2. OCPBUGS-55398

Component Readiness: [Networking / router] [periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade] test regressed due to machine api issues


    • Quality / Stability / Reliability
    • False
    • 1
    • Critical
    • Rejected
    • Metal Platform 270
    • 1

      Component Readiness has found a potential regression in the following test:

      periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade 

      I analyzed 5 job runs. In all of them nodes failed to come up and the Machine API was not working properly (level=error msg=Cluster operator cluster-autoscaler Degraded is True with MissingDependency: machine-api not ready).

      1. https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade/1915807119978795008
      Two workers were "red" without clear failures, but all three working masters had the message
      "Drain operation currently blocked by: [{Name:EtcdQuorumOperator Owner:clusteroperator/etcd}]", and CSRs failed to get approved.
       
      2. https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade/1915775736111697920

      One machine failed ("red") with errorReason InsufficientResources:
      name: worker-user-data-managed
      status: 
        conditions: 
          - lastTransitionTime: "2025-04-25T15:17:33Z"
            status: "True" 
            type: Drainable
          - lastTransitionTime: "2025-04-25T15:17:33Z"
            message: Instance has not been created
            reason: InstanceNotCreated
            severity: Warning
            status: "False"
            type: InstanceExists 
          - lastTransitionTime: "2025-04-25T15:17:33Z" 
            status: "True" 
            type: Terminable 
        errorMessage: No available BareMetalHost found 
        errorReason: InsufficientResources 
        lastUpdated: "2025-04-25T15:17:33Z" 
        phase: Provisioning

      3. https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade/1915379017452621824
      Had node issues: six CSRs failed to get approved. Two workers were "red" without clear failures, but all three working masters had the same message as in example 1:
      message: "Drain operation currently blocked by: [{Name:EtcdQuorumOperator Owner:clusteroperator/etcd}]"
      reason: HookPresent
      severity: Warning
      status: "False"
      type: Drainable

      4.  https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade/1915292957083176960 

      One of the two workers failed, all masters had the same "Drain operation currently blocked" error as in examples 1 and 3, and 2 CSRs never got approved.

      5. https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-metal-ipi-ovn-bm-upgrade/1915069439116578816

      This was the first job (Apr 23 11:45:12 (UTC-4)) that had the worker failure when masters could not drain. It also had unapproved CSR issues like examples 1, 3, and 4.
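
The symptoms repeated across the five runs (Machine conditions with status "False", and CSRs that never got approved) can be checked mechanically. The following is a minimal sketch, not the tooling used for this analysis. It assumes the Machine status and the CSR list have already been loaded into Python dicts (for example from JSON output of the cluster CLI). The helper names `failing_conditions` and `pending_csrs` and the CSR names are hypothetical. The Machine sample mirrors the status quoted in example 2; the `reason` on `InstanceExists` follows that quoted status.

```python
def failing_conditions(machine_status):
    """Return (type, reason, message) for every condition whose status is "False"."""
    return [
        (c["type"], c.get("reason", ""), c.get("message", ""))
        for c in machine_status.get("conditions", [])
        if c.get("status") == "False"
    ]


def pending_csrs(csr_list):
    """Return names of CSRs that carry neither an Approved nor a Denied condition."""
    pending = []
    for item in csr_list.get("items", []):
        conditions = item.get("status", {}).get("conditions", [])
        types = {c.get("type") for c in conditions}
        if not types & {"Approved", "Denied"}:
            pending.append(item["metadata"]["name"])
    return pending


# Machine status as quoted in example 2 (timestamps omitted for brevity).
machine_status = {
    "conditions": [
        {"type": "Drainable", "status": "True"},
        {
            "type": "InstanceExists",
            "status": "False",
            "reason": "InstanceNotCreated",
            "message": "Instance has not been created",
            "severity": "Warning",
        },
        {"type": "Terminable", "status": "True"},
    ],
    "errorMessage": "No available BareMetalHost found",
    "errorReason": "InsufficientResources",
    "phase": "Provisioning",
}

# Hypothetical CSR list: the names are made up for illustration only.
csr_list = {
    "items": [
        {
            "metadata": {"name": "csr-aaaaa"},
            "status": {"conditions": [{"type": "Approved", "status": "True"}]},
        },
        {"metadata": {"name": "csr-bbbbb"}, "status": {}},
    ]
}

for ctype, reason, message in failing_conditions(machine_status):
    print(f"{ctype}: {reason}: {message}")
print("errorReason:", machine_status.get("errorReason"))
print("pending CSRs:", pending_csrs(csr_list))
```

Applied to the masters in examples 1, 3, and 4, the same filter would surface the Drainable condition with reason HookPresent.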

      Significant regression detected.
      Fisher's Exact probability of a regression: 100.00%.
      Test pass rate dropped from 98.42% to 84.38%.

      Sample (being evaluated) Release: 4.19
      Start Time: 2025-04-18T00:00:00Z
      End Time: 2025-04-25T20:00:00Z
      Success Rate: 84.38%
      Successes: 54
      Failures: 10
      Flakes: 0
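
The sample success rate follows directly from the counts above (54 successes out of 64 runs). As a rough sketch of the arithmetic, the snippet below recomputes it and shows a one-sided Fisher's exact test built on the hypergeometric distribution. The basis counts are not given in this report, so `base_pass=312` and `base_fail=5` are hypothetical values chosen only because they reproduce a 98.42% pass rate. How Component Readiness turns the p-value into its reported "probability of a regression" is also an assumption here (taken as 1 minus the p-value).

```python
from math import comb


def success_rate(successes, failures):
    """Pass rate in percent."""
    return 100.0 * successes / (successes + failures)


def fisher_exact_less(sample_pass, sample_fail, base_pass, base_fail):
    """One-sided Fisher's exact p-value.

    Probability, with the table margins fixed, that the sample contains
    sample_pass or fewer passes (hypergeometric lower tail).
    """
    n_sample = sample_pass + sample_fail
    total_pass = sample_pass + base_pass
    total_fail = sample_fail + base_fail
    denom = comb(total_pass + total_fail, n_sample)
    tail = 0
    for k in range(sample_pass + 1):
        # comb() returns 0 when the draw is impossible, so no bounds check is needed.
        tail += comb(total_pass, k) * comb(total_fail, n_sample - k)
    return tail / denom


# Sample counts from the report.
print(f"sample success rate: {success_rate(54, 10):.2f}%")

# Hypothetical basis counts (NOT from the report); see the note above.
p_value = fisher_exact_less(54, 10, base_pass=312, base_fail=5)
print(f"one-sided p-value: {p_value:.3g}")
print(f"assumed regression probability: {100 * (1 - p_value):.2f}%")
```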

       

              rhn-engineering-dtantsur Dmitry Tantsur
              cholman@redhat.com Candace Holman
              None
              None
              Jad Haj Yahya Jad Haj Yahya
              None