OCPBUGS-51272: [kube-apiserver] [operator-conditions] test regressed due to control plane machine set operator jobs

    • Release Note Not Required
    • In Progress

      (Feel free to update this bug's summary to be more specific.)
      Component Readiness has found a potential regression in the following test:

      operator conditions kube-apiserver

      Significant regression detected.
      Fisher's exact test probability of a regression: 99.96%.
      Test pass rate dropped from 97.09% to 91.78%.

      Sample (being evaluated) Release: 4.19
      Start Time: 2025-02-18T00:00:00Z
      End Time: 2025-02-25T12:00:00Z
      Success Rate: 91.78%
      Successes: 67
      Failures: 6
      Flakes: 0

      Base (historical) Release: 4.18
      Start Time: 2025-01-26T00:00:00Z
      End Time: 2025-02-25T12:00:00Z
      Success Rate: 97.09%
      Successes: 334
      Failures: 10
      Flakes: 0

      View the test details report for additional context.
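The pass rates above follow directly from the success and failure counts, and the comparison can be sketched with a one-sided Fisher's exact test. This is a minimal illustration using only the Python standard library; the function names are hypothetical, and this report does not describe how Component Readiness derives its probability, so the output is not expected to reproduce the 99.96% figure.

```python
from math import comb


def pass_rate(successes, failures):
    """Pass rate as a percentage of all runs (flakes are 0 in this report)."""
    return 100.0 * successes / (successes + failures)


def fisher_one_sided(sample_fail, sample_pass, base_fail, base_pass):
    """One-sided Fisher's exact p-value.

    Probability of seeing at least `sample_fail` failures in the sample
    if sample and base runs shared a single underlying failure rate
    (hypergeometric null hypothesis).
    """
    n_sample = sample_fail + sample_pass
    total_fail = sample_fail + base_fail
    total = n_sample + base_fail + base_pass
    upper = min(total_fail, n_sample)
    tail = sum(
        comb(total_fail, k) * comb(total - total_fail, n_sample - k)
        for k in range(sample_fail, upper + 1)
    )
    return tail / comb(total, n_sample)


if __name__ == "__main__":
    # Counts copied from the report: 4.19 sample vs. 4.18 base.
    print(f"sample pass rate: {pass_rate(67, 6):.2f}%")
    print(f"base pass rate:   {pass_rate(334, 10):.2f}%")
    p = fisher_one_sided(sample_fail=6, sample_pass=67, base_fail=10, base_pass=334)
    print(f"one-sided p-value: {p:.4f}")
```

A small p-value means the sample's failure count would be unlikely if both releases shared the same failure rate.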

      The underlying problem may also exist in 4.18 and only be surfacing in 4.19 because machine set operator jobs are lumped into a larger set of jobs, and that set has shrunk in 4.19. However, a common test failure appears to be causing this regression. It looks important to the functionality of the job and should be fixed, in addition to the need to clear this red cell.

      The job always seems to fail on this test:

      E2E Suite: [It] ControlPlaneMachineSet Operator With an active ControlPlaneMachineSet and the instance type is changed should perform a rolling update [Periodic]


            Bryce Palmer added a comment - Manually moving to VERIFIED based on https://issues.redhat.com/browse/OCPBUGS-50587

            OpenShift Jira Bot added a comment - Hi rh-ee-bpalmer, bugs should not be moved to Verified without first providing a Release Note Type ("Bug Fix" or "No Doc Update"), and for type "Bug Fix" the Release Note Text must also be provided. Please populate the necessary fields before moving the bug to Verified.

            Ben Luddy added a comment -

            The issue joelspeed mentions (https://issues.redhat.com/browse/OCPBUGS-50587) affects all static pod operators in the same way. I've handed that off now but I would expect it to be fixed in all static pod operators as part of a single effort.

            rh-ee-bpalmer could you wrangle these Jira issues together please?


            Joel Speed added a comment -

            I believe this is related to recent changes in the static pod operator library:

            StaticPodsAvailable: 4 nodes are active; 4 nodes are at revision 10; 0 nodes have achieved new revision 11
            

            This is what the kube-apiserver ClusterOperator (CO) is reporting, but there are actually only three nodes and three machines.

            It's blocked on moving to revision 11 because it created the installer pod on a node that no longer exists.

            CC bluddy who I believe is working on fixing this issue
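The mismatch described in this comment (four nodes reported, three in the cluster) can be checked mechanically. The sketch below parses the quoted StaticPodsAvailable message and compares its node count with an externally supplied count. The helper names are hypothetical, and the regular expression assumes the exact wording of the single message quoted here; it is not taken from the static pod operator library.

```python
import re

# Pattern derived only from the one message quoted in this issue (assumption).
STATUS_RE = re.compile(
    r"(?P<active>\d+) nodes are active; "
    r"(?P<current>\d+) nodes are at revision (?P<current_rev>\d+); "
    r"(?P<updated>\d+) nodes have achieved new revision (?P<target_rev>\d+)"
)


def parse_static_pods_status(message):
    """Extract node counts and revisions from a StaticPodsAvailable message."""
    match = STATUS_RE.search(message)
    if match is None:
        raise ValueError(f"unrecognized status message: {message!r}")
    return {name: int(value) for name, value in match.groupdict().items()}


def stale_node_count(message, actual_nodes):
    """Number of reported nodes beyond those that actually exist."""
    status = parse_static_pods_status(message)
    return max(0, status["active"] - actual_nodes)


if __name__ == "__main__":
    msg = (
        "StaticPodsAvailable: 4 nodes are active; 4 nodes are at revision 10; "
        "0 nodes have achieved new revision 11"
    )
    # Three nodes actually exist, per the comment above.
    print(parse_static_pods_status(msg))
    print("stale node entries:", stale_node_count(msg, actual_nodes=3))
```

A nonzero result here is consistent with the operator still tracking a node that has been removed, which is why the rollout to revision 11 cannot complete.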



            Devan Goodwin added a comment - The linked test details report above has all the job links. Thanks for looking so quickly!

              rh-ee-bpalmer Bryce Palmer
              rhn-engineering-dgoodwin Devan Goodwin
              Ke Wang Ke Wang