Details
- Bug
- Resolution: Can't Do
- Normal
- None
- 4.11
- Moderate
- OCPNODE Sprint 230 (Blue), OCPNODE Sprint 233 (Blue)
- 2
- False
Description
Description of problem:
A 4.11.20 to 4.11.21 update was progressing happily until the machine-config operator began rolling the control-plane nodes. Draining master-0 was stalled by an installer-9-...-master-0 pod in openshift-etcd that was nominally still Terminating despite being 15 days old; library-go installer-... pods usually make quick work of installing static-pod assets and then exit. Manually deleting the pod unstuck the drain, and the update proceeded to complete successfully without further excitement.
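For reference, the stuck state described above is recognizable from the pod object itself (e.g. from `oc get pod -o json` output): a `deletionTimestamp` set far longer ago than any normal grace period, combined with no running backing container. A minimal sketch of that heuristic in Python; the helper name and the one-hour threshold are illustrative only, not part of any OpenShift tooling:

```python
from datetime import datetime, timedelta, timezone

def is_stuck_terminating(pod, max_age=timedelta(hours=1)):
    """Flag a pod that is nominally Terminating (deletionTimestamp set)
    far past a normal grace window and has no running backing container."""
    ts = pod.get("metadata", {}).get("deletionTimestamp")
    if ts is None:
        return False  # pod is not terminating at all
    deleted_at = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    if datetime.now(timezone.utc) - deleted_at < max_age:
        return False  # still within a plausible grace period
    statuses = pod.get("status", {}).get("containerStatuses", [])
    # Stuck: Terminating for a long time with no container actually running.
    return not any("running" in (s.get("state") or {}) for s in statuses)
```

A pod flagged this way matches the symptom seen here (Terminating for 15 days with no backing container) and was cleared manually with an ordinary pod deletion.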
Version-Release number of selected component (if applicable):
The cluster was transitioning from 4.11.20 to 4.11.21. At the time of the issue, the outgoing node was still running the 4.11.20 kubelet and CRI-O, although 4.11.20 and 4.11.21 ship the same RHCOS build 411.86.202212072103-0 anyway.
How reproducible:
Unknown. The other two control-plane nodes in this cluster drained without incident, so expected reproducibility is low.
Steps to Reproduce:
Unknown.
Actual results:
The node failed to drain, with a pod stuck in Terminating despite having no backing container.
Expected results:
A successful drain, with pods reported as Terminating only while they have associated containers that are also terminating.