Type: Bug
Resolution: Cannot Reproduce
Priority: Normal
Fix Version/s: None
Affects Version/s: 4.17.z, 4.16.z
Component/s: Node / CRI-O
Labels:
None

Activity Type:
None
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
Moderate
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Priority Data:
PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Pod termination failed due to container storage unmount error (device or resource busy).
The upgrade is stuck because pods from the lower revision remain in the Terminating state, which blocks the rollout of the new revision:

~~~
Dec 15 14:00:27 xyz-master-0 kubenswrapper[1963730]: E1215 14:00:27.327483 1963730 pod_workers.go:1298] "Error syncing pod, skipping" err="failed to \"KillPodSandbox\" for \"45abd18873ece2fa0c1a9b927a3b679b\" with KillPodSandboxError: \"rpc error: code = Unknown desc = failed to stop infra container for pod sandbox 851172f8507471421c218857d6e42508b8a8bcc09bae68940f8c275da2befa1f: failed to unmount container 851172f8507471421c218857d6e42508b8a8bcc09bae68940f8c275da2befa1f: removing mount point \\\"/var/lib/containers/storage/overlay/7d826cca019cced93bc33b97a5dc7a46240f0f3ffc75631c88df7508bfcabf3b/merged\\\": device or resource busy\"" pod="openshift-kube-scheduler/openshift-kube-scheduler-xyz-master-0" podUID="45abd18873ece2fa0c1a9b927a3b679b"
~~~
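When the same error shows up on several nodes, it can help to pull the busy overlay path and the pod name out of each journal line before comparing them. The following is a minimal sketch, assuming the message keeps the shape shown in the excerpt above; `parse_unmount_failure` and both regular expressions are illustrative names, not part of kubelet or CRI-O.

```python
import re
from typing import Dict, Optional

# Assumption: the overlay layer ID is a 64-character hex string, as in the log above.
# The quote before the path may be preceded by any number of escaping backslashes.
MERGED_PATH_RE = re.compile(
    r'removing mount point \\*"(?P<path>/var/lib/containers/storage/overlay/[0-9a-f]{64}/merged)'
)
POD_RE = re.compile(r'pod="(?P<pod>[^"]+)"')


def parse_unmount_failure(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the busy mount path and pod name from one journal line, or None."""
    if "device or resource busy" not in line:
        return None
    path = MERGED_PATH_RE.search(line)
    if path is None:
        return None
    pod = POD_RE.search(line)
    return {"path": path.group("path"), "pod": pod.group("pod") if pod else None}
```

Feeding it the journal output line by line yields one record per failed teardown, which makes it easy to see whether the same overlay layer is involved each time.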

The container storage filesystem was mounted successfully; however, the unmount system call failed during container teardown.
As a result, the underlying overlay filesystem resources could not be released, which prevented container cleanup and left the affected pods in the Terminating state. The reason the unmount failed is not yet known.
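One way to narrow down a "device or resource busy" failure is to check which processes still see the path as a mount point in their mount namespace. The sketch below scans the `mountinfo` file of every process under `/proc` for the path. It assumes a lingering mount reference is involved, which this report does not confirm, and `mount_points` and `find_holders` are hypothetical helper names.

```python
import os
from typing import List, Set, Tuple


def mount_points(mountinfo_text: str) -> Set[str]:
    """Return the mount-point column (fifth field) of a mountinfo file's contents."""
    points = set()
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) >= 5:
            points.add(fields[4])
    return points


def find_holders(target: str, proc_root: str = "/proc") -> List[Tuple[int, str]]:
    """List (pid, command) pairs whose mount namespace still contains `target`."""
    holders = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        try:
            with open(os.path.join(proc_root, entry, "mountinfo")) as f:
                if target not in mount_points(f.read()):
                    continue
            with open(os.path.join(proc_root, entry, "comm")) as f:
                comm = f.read().strip()
        except OSError:
            continue  # the process exited or is not readable
        holders.append((int(entry), comm))
    return sorted(holders)
```

Run as root on the affected node, `find_holders("/var/lib/containers/storage/overlay/<layer-id>/merged")` would list the processes to look at next, where `<layer-id>` is the ID from the log line. An empty result suggests the path is no longer mounted in any readable namespace and the cause lies elsewhere.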

Workaround: Since the underlying CRI-O storage is affected, the only workaround found so far is to clean the CRI-O storage.

I have a few customers who faced this issue, and one of them is expecting an RCA because they hit it on every cluster they upgrade from 4.16.x to 4.17.x.

is related to

OCPBUGS-74694 Pods stuck terminating on cri-o unmounting overlayfs /merged path

links to

Pod termination failed due to container storage unmount error (device or resource busy).

Assignee: Peter Hunt

Reporter: Pratik Uplenchwar

Need Info From: None

Contributors: None

QA Contact: None

Doc Contact: None

Votes: 1

Watchers: 6

Created: 2026/01/13 9:55 AM

Updated: 2026/02/19 2:43 PM

Resolved: 2026/01/26 5:10 PM