OpenShift Bugs / OCPBUGS-76353

kube-apiserver pods fail to terminate gracefully on all control plane nodes during upgrades in 4.22

      Description of problem

      The test [sig-api-machinery][Feature:APIServer][Late] kubelet terminates kube-apiserver gracefully extended is failing at a 100% rate starting 2026-02-06 across multiple platforms (AWS, GCP, ROSA), upgrade types (micro, minor, none), and architectures (amd64, arm64).

      All 3 control plane kube-apiserver pods fail to terminate gracefully during upgrades. The error originates from graceful_termination.go:89 in the openshift/origin test suite.
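
      For context, the check appears to be event-driven: a clean kube-apiserver shutdown should leave a termination lifecycle event behind, and the test fails when the previous pod has none. The sketch below is illustrative only and is not the openshift/origin test code; the namespace (openshift-kube-apiserver) and the event reason (TerminationGracefulTerminationFinished) are assumptions.

      // Illustrative sketch only; NOT the actual openshift/origin test code.
      // Assumes a clean shutdown records a TerminationGracefulTerminationFinished
      // event in the openshift-kube-apiserver namespace (assumption).
      package main

      import (
          "context"
          "fmt"

          metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
          "k8s.io/client-go/kubernetes"
          "k8s.io/client-go/tools/clientcmd"
      )

      func main() {
          cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
          if err != nil {
              panic(err)
          }
          client := kubernetes.NewForConfigOrDie(cfg)

          events, err := client.CoreV1().Events("openshift-kube-apiserver").
              List(context.TODO(), metav1.ListOptions{})
          if err != nil {
              panic(err)
          }

          // A kube-apiserver pod that shut down cleanly should have left a
          // "graceful termination finished" style event behind; the absence
          // of such an event per node is what the failing test reports.
          for _, ev := range events.Items {
              if ev.Reason == "TerminationGracefulTerminationFinished" {
                  fmt.Printf("%s: graceful termination recorded at %s\n",
                      ev.InvolvedObject.Name, ev.LastTimestamp.Time)
              }
          }
      }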

      Note that this was a block of 10 jobs all running at the same time, which may indicate a problem with the load balancer in AWS at that time. However, we have two other regressions for this test that just aren't happening in 4.21, so something is definitely going on here. I will link the related bugs in Jira shortly.

      Version-Release number of selected component

      4.22

      Actual results

      All 3 control plane kube-apiserver pods fail to terminate gracefully. Error message:

      fail [github.com/openshift/origin/test/extended/apiserver/graceful_termination.go:89]:
      The following API Servers weren't gracefully terminated:
        kube-apiserver on node <node> wasn't gracefully terminated, reason:
        Previous pod kube-apiserver-<node> started at <time> did not terminate gracefully
      

      This occurs on all 3 master nodes in every run (10/10 runs failed).
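
      For live-cluster triage, the previous container's termination state can also be inspected directly with client-go. This is a rough sketch only, under the assumption that the static pods live in the openshift-kube-apiserver namespace and are named kube-apiserver-<node>; it is not part of the failing test.

      // Triage helper sketch; not part of the origin test suite.
      // Assumes static pods named kube-apiserver-<node> in openshift-kube-apiserver.
      package main

      import (
          "context"
          "fmt"
          "strings"

          metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
          "k8s.io/client-go/kubernetes"
          "k8s.io/client-go/tools/clientcmd"
      )

      func main() {
          cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
          if err != nil {
              panic(err)
          }
          client := kubernetes.NewForConfigOrDie(cfg)

          pods, err := client.CoreV1().Pods("openshift-kube-apiserver").
              List(context.TODO(), metav1.ListOptions{})
          if err != nil {
              panic(err)
          }

          for _, pod := range pods.Items {
              if !strings.HasPrefix(pod.Name, "kube-apiserver-") {
                  continue
              }
              for _, cs := range pod.Status.ContainerStatuses {
                  // A non-zero exit code or an Error reason on the previous
                  // termination would line up with the failure reported above.
                  if t := cs.LastTerminationState.Terminated; t != nil {
                      fmt.Printf("%s/%s: last exit code %d, reason %q\n",
                          pod.Name, cs.Name, t.ExitCode, t.Reason)
                  }
              }
          }
      }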

      Expected results

      All kube-apiserver pods should terminate gracefully during upgrades, as they did consistently in all runs prior to 2026-02-06.

      Additional info

      Regression Start Analysis:

      • Last passing batch: 2026-02-05 20:08 UTC
      • First failing batch: 2026-02-06 09:04 UTC
      • The regression started within this ~13-hour window

      Failure Output Analysis:

      • 10/10 outputs analyzed: Highly Consistent (100%)
      • All failures show the identical error: kube-apiserver pods did not terminate gracefully
      • Affected AWS regions: us-west-1, us-west-2, us-east-1, us-east-2 (not region-specific)

      Related Component Readiness Regressions (same test, different variants):

      • 35198: aws/micro/amd64
      • 35199: gcp/micro/amd64
      • 35197: gcp/minor/amd64
      • 34194: aws/micro/amd64
      • 34168: aws/micro/arm64
      • 33834: aws/none/amd64
      • 34775: aws/none/amd64
      • 33639: rosa/none/amd64
      • 32860: aws/micro/amd64

      Test Details: Sippy Test Details
