Type: Bug
Resolution: Unresolved
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.19.z
Component/s: HyperShift
Labels:
- triaged

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

    Unreachable nodes whose pods are covered by restrictive PodDisruptionBudgets (PDBs) cannot be drained and remain stuck in deletion.

Version-Release number of selected component (if applicable):

    4.19.z

How reproducible:

    100%

Steps to Reproduce:

    1. Create a ROSA HCP cluster with its default NodePool (2 workers)
    2. Stop both workers in the AWS console
    3. Observe that MachineHealthCheck (MHC) marks both nodes as unhealthy, and that they then remain stuck in deletion indefinitely because of PDBs.
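
To illustrate why a PDB can block a drain indefinitely in this scenario, here is a minimal sketch of the budget arithmetic, assuming the usual Kubernetes rule that a PDB permits `currentHealthy - desiredHealthy` disruptions, floored at zero. The workload (2 replicas with `minAvailable: 1`) and the function names are hypothetical and were not taken from the must-gather; the sketch models only the budget, not CAPI's drain code.

```python
def disruptions_allowed(current_healthy: int, desired_healthy: int) -> int:
    """Evictions a PDB currently permits; never negative."""
    return max(0, current_healthy - desired_healthy)


def can_evict(current_healthy: int, desired_healthy: int) -> bool:
    """An eviction request is only admitted while the budget is above zero."""
    return disruptions_allowed(current_healthy, desired_healthy) > 0


# Hypothetical workload: 2 replicas, one per worker, with minAvailable: 1.
MIN_AVAILABLE = 1

# Both workers running: 2 healthy pods, so one eviction is allowed.
print(can_evict(current_healthy=2, desired_healthy=MIN_AVAILABLE))

# Both workers stopped: 0 healthy pods, so every eviction is refused.
print(can_evict(current_healthy=0, desired_healthy=MIN_AVAILABLE))
```

Under these assumptions the budget cannot recover while both workers are down, because no replacement pod can become healthy, so a drain that waits for evictions to succeed would not finish on its own.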

Actual results:

    MHC cannot replace the worker nodes because they remain stuck in deletion indefinitely.

Expected results:

    Worker nodes should be successfully replaced by MHC.

Additional info:

    This appears to be a CAPI bug. I believe it was introduced by the drain refactor in CAPI commit 3232abcf3, in which CAPI moved from kubectl's drain logic to its own implementation.

Must-gather attached in comments (contains HCP dump, dataplane dump).

Assignee: Salvatore Dario Minonne

Reporter: Claudio Busse

Need Info From: None

Contributors: None

QA Contact: Ying Zhou

Doc Contact: None

Votes: 0

Watchers: 4

Created: 2025/09/24 1:05 PM

Updated: 2025/09/29 2:22 PM