OpenShift Bugs / OCPBUGS-65575

Kube apiserver minor upgrades stuck Progressing on multi-arch



      (Feel free to update this bug's summary to be more specific.)
      Component Readiness has found a potential regression in the following test:

      [sig-cluster-lifecycle] Cluster completes upgrade

      Extreme regression detected.
      Fisher's Exact probability of a regression: 100.00%.
      Test pass rate dropped from 100.00% to 75.00%.

      Sample (being evaluated) Release: 4.21
      Start Time: 2025-11-06T00:00:00Z
      End Time: 2025-11-13T12:00:00Z
      Success Rate: 75.00%
      Successes: 9
      Failures: 3
      Flakes: 0
      Base (historical) Release: 4.20
      Start Time: 2025-10-14T00:00:00Z
      End Time: 2025-11-13T12:00:00Z
      Success Rate: 100.00%
      Successes: 68
      Failures: 0
      Flakes: 0
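
      For context on how those counts turn into the regression probability above, here is a minimal sketch using SciPy's Fisher's Exact implementation (scipy is an assumption; Component Readiness' own calculation isn't shown here and may round differently):

      # Sketch, not the Component Readiness implementation: derive a regression
      # probability from the pass/fail counts listed above.
      from scipy.stats import fisher_exact

      # Rows: sample (4.21) vs. base (4.20); columns: successes, failures.
      table = [[9, 3],    # 4.21: 9 passes, 3 failures -> 75.00% pass rate
               [68, 0]]   # 4.20: 68 passes, 0 failures -> 100.00% pass rate

      # One-sided test: is the sample pass rate significantly lower than the base?
      _, p_value = fisher_exact(table, alternative="less")
      print(f"p-value: {p_value:.4f}")                      # ~0.0027
      print(f"regression probability: {1 - p_value:.2%}")   # ~99.7%, close to the figure above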

      View the test details report for additional context.

      Also showing on AWS for a similar job.

      Seems to have started around Nov 7th and failed every time this week, though these jobs are rarely run so it could have been a few days prior.

      kube-apiserver appears to be the operator stuck Progressing, taken from this job:

      source/OperatorProgressing display/true condition/Progressing reason/NodeInstaller status/True NodeInstallerProgressing: 3 nodes are at revision 7; 0 nodes have achieved new revision 8 NodeInstallerProgressing: 3 nodes are at revision 7; 0 nodes have achieved new revision 8 [149m29s]
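
      If anyone wants to check the same condition on a live cluster rather than from the job artifacts, here is a minimal sketch with the Kubernetes Python client (the kubeconfig and client usage are assumptions; the interval above was taken from the CI artifacts, not this method):

      # Sketch: read the kube-apiserver ClusterOperator's Progressing condition
      # from a live cluster (assumes a kubeconfig is available).
      from kubernetes import client, config

      config.load_kube_config()
      co = client.CustomObjectsApi().get_cluster_custom_object(
          group="config.openshift.io", version="v1",
          plural="clusteroperators", name="kube-apiserver")

      for cond in co["status"]["conditions"]:
          if cond["type"] == "Progressing":
              # While stuck, expect status=True, reason=NodeInstaller, and the
              # "NodeInstallerProgressing: 3 nodes are at revision 7..." message.
              print(cond["status"], cond["reason"], cond["message"])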
      

      I don't know if it's related, but if you use the link above and scroll down to the KubeEvent section, you'll see events like this being spammed:

      source/KubeEvent display/true count/187 firstTimestamp/2025-11-13T03:16:26Z interesting/true lastTimestamp/2025-11-13T03:56:39Z pathological/true reason/FailedCreatePodSandBox Failed to create pod sandbox: the feature gate "UserNamespacesSupport" is disabled: can't set spec.HostUsers [1s]
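
      To confirm the event spam on a live cluster (rather than in the intervals view linked above), a minimal sketch using a standard event field selector (again assuming a kubeconfig):

      # Sketch: list FailedCreatePodSandBox events that mention the
      # UserNamespacesSupport feature gate (assumes a kubeconfig).
      from kubernetes import client, config

      config.load_kube_config()
      events = client.CoreV1Api().list_event_for_all_namespaces(
          field_selector="reason=FailedCreatePodSandBox")

      for ev in events.items:
          if "UserNamespacesSupport" in (ev.message or ""):
              # The count field mirrors the 187 repeats flagged as pathological above.
              print(ev.involved_object.namespace, ev.involved_object.name, ev.count)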
      

      This looks new to me. I don't see the same thing in a comparable amd64 minor upgrade job like this one.

      Appears to only be happening in multi jobs over the past two days: https://search.dptools.openshift.org/?search=Failed+to+create+pod+sandbox%3A+the+feature+gate+%22UserNamespacesSupport%22&maxAge=48h&context=1&type=bug%2Bissue%2Bjunit&name=&excludeName=&maxMatches=5&maxBytes=20971520&groupBy=job

      Filed by: dgoodwin@redhat.com

              Assignee: Trevor Vardeman (tvardema)
              Reporter: OpenShift Technical Release Team (openshift-trt)
              Votes: 0
              Watchers: 5
