  1. OpenShift Bugs
  2. OCPBUGS-30044

The "extra worker" on the virtualmedia job is not working correctly.


    Description

      The failures on this page share a common problem with the "extra worker" that the virtualmedia tests require.  One way to see this is to click the camgi link in the prowjob (see the attached screenshot if you are not familiar with this tool).

      When the MCO team investigated, they could tell that the machine had successfully joined the cluster for some period of time.  Clicking on the extraworker node in camgi shows "message: Kubelet stopped posting node status".  As best they can tell, this is an infrastructure problem.

      This problem is creating considerable toil in interpreting the CI signal for autoscaling on the metal platform.  We need help from the metal team to improve this, or to identify other teams to involve in the debugging process.

      Everything below this line is the details from Component Readiness:
      -----------------------
      Component Readiness has found a potential regression in [sig-cluster-lifecycle][Feature:Machines][Serial] Managed cluster should grow and decrease when scaling different machineSets simultaneously [Timeout:30m][apigroup:machine.openshift.io] [Suite:openshift/conformance/serial].

      Probability of significant regression: 99.62%

      Sample (being evaluated) Release: 4.15
      Start Time: 2024-02-22T00:00:00Z
      End Time: 2024-02-28T23:59:59Z
      Success Rate: 80.00%
      Successes: 16
      Failures: 4
      Flakes: 0

      Base (historical) Release: 4.14
      Start Time: 2023-10-04T00:00:00Z
      End Time: 2023-10-31T23:59:59Z
      Success Rate: 98.35%
      Successes: 119
      Failures: 2
      Flakes: 0
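
As a sanity check on the figures above, the sketch below recomputes the success rates and a regression probability from the reported pass/fail counts. It assumes the probability is 1 minus the p-value of a one-sided Fisher's exact test on the 2x2 table of sample versus base results. That assumption is not stated in this report, and the function name is made up for illustration.

```python
from math import comb


def regression_probability(sample_pass, sample_fail, base_pass, base_fail):
    """Return 1 - p for a one-sided Fisher's exact test.

    p is the chance of seeing at least `sample_fail` failures in the sample
    if sample and base runs shared the same underlying failure rate.
    Assumption: this mirrors how the report derives its probability.
    """
    total_runs = sample_pass + sample_fail + base_pass + base_fail
    total_fail = sample_fail + base_fail
    sample_runs = sample_pass + sample_fail

    # Hypergeometric tail: ways to place k of the failures in the sample.
    max_fail_in_sample = min(total_fail, sample_runs)
    tail = sum(
        comb(total_fail, k) * comb(total_runs - total_fail, sample_runs - k)
        for k in range(sample_fail, max_fail_in_sample + 1)
    )
    p_value = tail / comb(total_runs, sample_runs)
    return 1.0 - p_value


# Counts taken from the report: 4.15 sample and 4.14 base.
sample_rate = 16 / (16 + 4)
base_rate = 119 / (119 + 2)
probability = regression_probability(16, 4, 119, 2)

print(f"sample success rate: {sample_rate:.2%}")   # 80.00%
print(f"base success rate:   {base_rate:.2%}")     # 98.35%
print(f"regression probability: {probability:.2%}")  # 99.62%
```

Under that assumption the counts reproduce the reported 99.62%, so the flagged regression follows from only 4 failures in 20 runs measured against 2 failures in 121.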

      View the test details report at https://sippy.dptools.openshift.org/sippy-ng/component_readiness/test_details?arch=amd64&arch=amd64&baseEndTime=2023-10-31%2023%3A59%3A59&baseRelease=4.14&baseStartTime=2023-10-04%2000%3A00%3A00&capability=Machines&component=Cloud%20Compute%20%2F%20Cluster%20Autoscaler&confidence=95&environment=ovn%20no-upgrade%20amd64%20metal-ipi%20serial&excludeArches=arm64%2Cheterogeneous%2Cppc64le%2Cs390x&excludeClouds=openstack%2Cibmcloud%2Clibvirt%2Covirt%2Cunknown&excludeVariants=hypershift%2Cosd%2Cmicroshift%2Ctechpreview%2Csingle-node%2Cassisted%2Ccompact&groupBy=cloud%2Carch%2Cnetwork&ignoreDisruption=true&ignoreMissing=false&minFail=3&network=ovn&network=ovn&pity=5&platform=metal-ipi&platform=metal-ipi&sampleEndTime=2024-02-28%2023%3A59%3A59&sampleRelease=4.15&sampleStartTime=2024-02-22%2000%3A00%3A00&testId=openshift-tests%3A9f3fb60052539c29ab66564689f616ce&testName=%5Bsig-cluster-lifecycle%5D%5BFeature%3AMachines%5D%5BSerial%5D%20Managed%20cluster%20should%20grow%20and%20decrease%20when%20scaling%20different%20machineSets%20simultaneously%20%5BTimeout%3A30m%5D%5Bapigroup%3Amachine.openshift.io%5D%20%5BSuite%3Aopenshift%2Fconformance%2Fserial%5D&upgrade=no-upgrade&upgrade=no-upgrade&variant=serial&variant=serial

            People

              trking W. Trevor King
              rh-ee-bleanhar Brenton Leanhardt
              Jad Haj Yahya Jad Haj Yahya