Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: 4.18
Component/s: kube-apiserver
Labels:
- rits-work

Activity Type: Quality / Stability / Reliability
Blocked: False
Blocked Reason: None
Story Points: None
Severity: Important
Regression: None
Architecture: x86_64
Target Backport Versions: None
Target Version: None
Release Blocker: None
Sprint: None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status: None
Release Note Type: None
Release Note Text: None

Escape Reason: None
Escape Impact: None
Corrective Measures: None
SDLC stage when should've been found: None

Description of problem:

Based on the issue described here: https://github.com/kubernetes/kubernetes/issues/133442

Setting the priorityClassName field on a static pod definition has no effect on the shutdown order, because the kubelet ignores the field for static pods. As a result, static pods are terminated in the first round of terminations rather than in the round that their priorityClassName would select.

This makes the gracefulShutdown ordering almost useless on Single Node OpenShift (SNO), because we lose kube-apiserver and etcd right away.
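
To illustrate the ordering problem, here is a minimal Python sketch. It is not kubelet code. It assumes that a static pod's numeric priority stays unset and is treated as 0 when only priorityClassName is given, and that pods are grouped into shutdown phases by the highest configured priority threshold that does not exceed their priority. The Pod class, the shutdown_phases helper, and the two-phase threshold list are hypothetical names chosen for this example; 2000001000 is the value of the built-in system-node-critical priority class.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Value of the built-in system-node-critical PriorityClass.
SYSTEM_NODE_CRITICAL = 2000001000


@dataclass
class Pod:
    name: str
    # None models a static pod whose priorityClassName was never
    # resolved to a numeric priority (assumption of this sketch).
    priority: Optional[int] = None


def shutdown_phases(pods: List[Pod], thresholds: List[int]) -> List[Tuple[int, List[str]]]:
    """Group pods into shutdown phases, lowest priority threshold first."""
    ordered = sorted(thresholds)
    phases = {t: [] for t in ordered}
    for pod in pods:
        value = pod.priority if pod.priority is not None else 0
        # Pick the highest threshold that does not exceed the pod's priority.
        # Pods below every threshold fall into the first phase.
        eligible = [t for t in ordered if t <= value]
        phases[eligible[-1] if eligible else ordered[0]].append(pod.name)
    return [(t, phases[t]) for t in ordered]


pods = [
    Pod("workload", priority=0),
    Pod("kube-apiserver"),                       # class name set, priority unresolved
    Pod("etcd", priority=SYSTEM_NODE_CRITICAL),  # hypothetical manifest with explicit priority
]

for threshold, names in shutdown_phases(pods, [0, SYSTEM_NODE_CRITICAL]):
    print(threshold, names)
```

Under these assumptions, the pod with an unresolved priority lands in the first phase alongside ordinary workloads, while the pod carrying an explicit numeric priority is held until the last phase.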

I have opened the PRs below to fix this issue. Could we please merge them and backport the fix as soon as possible? Our customers are seeing very long shutdown times, shutdown hangs, and forced shutdowns that impact the storage layer in SNO environments because of this issue.

https://github.com/openshift/cluster-kube-apiserver-operator/pull/1915
https://github.com/openshift/cluster-etcd-operator/pull/1476
https://github.com/openshift/cluster-kube-controller-manager-operator/pull/865
https://github.com/openshift/cluster-kube-scheduler-operator/pull/572

Describe the impact to you or the business
Long shutdown times and storage layer problems caused by forceful termination, because graceful termination is not being respected.

In what environment are you experiencing this behavior?
All SNO environments

How frequently does this behavior occur? Does it occur repeatedly or at certain times?
Every shutdown

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

    1.
    2.
    3.

Actual results:

Expected results:

Additional info:

depends on

OCPBUGS-64686 [4.21] Backport priorityClassName fix to fix a gracefulShutdown bug

Verified

Assignee: Vu Dinh

Reporter: Novonil Choudhuri

Need Info From: None

Contributors: None

QA Contact: Ke Wang

Doc Contact: None

Votes: 0

Watchers: 6

Created: 2025/10/16 3:39 PM

Updated: 2025/11/20 11:13 PM
