OCPBUGS-36399

metal serial jobs failing tests due to etcd errors

      Component readiness shows a number of red squares for metal. The problem appears to affect many metal jobs, including periodic-ci-openshift-release-master-nightly-4.17-e2e-metal-ipi-serial-ovn-dualstack and periodic-ci-openshift-release-master-nightly-4.17-upgrade-from-stable-4.16-e2e-metal-ipi-ovn-upgrade.

      Looking into these jobs, we noticed etcd errors related to leader election, which historically have been caused by resource contention. Many of the affected runs, although not all of them, come from the cipool-esicihosts-el9 pool. The failures started around June 25th.
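One rough way to compare runs is to count leader-election messages in each run's etcd logs. The sketch below is only an illustration: the function name count_election_events is hypothetical, and the three regular expressions are assumptions about how the raft messages are worded, so they should be adjusted to whatever the job artifacts actually contain.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed message wording; verify against the etcd logs in the job artifacts.
PATTERNS = {
    "election_started": re.compile(r"is starting a new election at term \d+"),
    "leader_elected": re.compile(r"elected leader \S+ at term \d+"),
    "leader_lost": re.compile(r"lost leader \S+ at term \d+"),
}


def count_election_events(lines: Iterable[str]) -> Counter:
    """Count log lines that match each leader-election pattern."""
    counts: Counter = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Feeding it the lines of an etcd pod log from a passing run and from a failing run gives a quick side-by-side of how often elections were triggered. A high count hints at instability but does not by itself prove resource contention.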

      See Sippy for a list of recent runs: https://sippy.dptools.openshift.org/sippy-ng/jobs/4.17/runs?filters=%7B%22items%22%3A%5B%7B%22columnField%22%3A%22job%22%2C%22operatorValue%22%3A%22equals%22%2C%22value%22%3A%22periodic-ci-openshift-release-master-nightly-4.17-e2e-metal-ipi-serial-ovn-dualstack%22%7D%5D%7D
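The filters parameter in the Sippy link above is URL-encoded JSON with a single "job equals" item. A minimal sketch, assuming other jobs can be queried by swapping only the job name, of how to build or decode such a link (the helper names runs_url and decode_filters are made up for this example):

```python
import json
from urllib.parse import parse_qs, quote, urlsplit

# Base path taken from the link in this report.
SIPPY_RUNS = "https://sippy.dptools.openshift.org/sippy-ng/jobs/4.17/runs"


def runs_url(job_name: str) -> str:
    """Build a runs link filtered to one job, mirroring the link's filter shape."""
    filters = {
        "items": [
            {"columnField": "job", "operatorValue": "equals", "value": job_name}
        ]
    }
    # Compact separators and safe="" reproduce the percent-encoding seen in the link.
    encoded = quote(json.dumps(filters, separators=(",", ":")), safe="")
    return f"{SIPPY_RUNS}?filters={encoded}"


def decode_filters(url: str) -> dict:
    """Return the JSON filter object embedded in a Sippy runs link."""
    query = parse_qs(urlsplit(url).query)
    return json.loads(query["filters"][0])
```

Calling runs_url with the upgrade job name from the description should give the equivalent list for that job, provided Sippy accepts the same filter shape for it.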

       

      A large number of tests are affected; we initially investigated these:

      [sig-arch][Early] Managed cluster should [apigroup:config.openshift.io] start all core operators [Skipped:Disconnected] [Suite:openshift/conformance/parallel].
       
      [sig-arch][Early] CRDs for openshift.io should have subresource.status [Suite:openshift/conformance/parallel] 
       
      [sig-arch][Early] APIs for openshift.io must have stable versions [Suite:openshift/conformance/parallel] 

       

            dhiggins@redhat.com Derek Higgins
            stbenjam Stephen Benjamin
            Jad Haj Yahya