Type: Bug
Resolution: Can't Do
Priority: Critical
Fix Version/s: None
Affects Version/s: 4.15.z
Component/s: Node / Kubelet
Labels:
- inodes
- node
- tmpfs
- triaged
- worker

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:

None

Story Points:
None
Severity:
Critical
Regression:
None

Target Backport Versions:
None
Target Version:
None
Release Blocker:
None
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
PX Impact Score:

Release Note Status:
None
Release Note Type:
None
Release Note Text:
None

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

A customer is getting a NodeFilesystemAlmostOutOfFiles alert due to inode exhaustion on the /tmp mount.
~~~
NodeFilesystemAlmostOutOfFiles
Aug 4, 2025, 7:06 AM
Filesystem on tmpfs, mounted on /tmp, at ocp-wrk2.ocp.int.teb.com.tr has only 0.56% available inodes left.
~~~

Although files under /tmp (e.g., exec-process-*) are deleted during deployment cleanup, their inodes are not being released. This suggests that processes are still holding open file handles to the deleted files, which would keep the inodes allocated and lead to inode exhaustion even though no large or visible files remain.

~~~
$ df -i
Filesystem        Inodes   IUsed     IFree IUse% Mounted on
devtmpfs        98991429    1026  98990403    1% /dev
tmpfs           99003452       2  99003450    1% /dev/shm
tmpfs             819200  393448    425752   49% /run
tmpfs               1024      18      1006    2% /sys/fs/cgroup
/dev/sda4      166953480 7619948 159333532    5% /sysroot
tmpfs            1048576 1043249      5327  100% /tmp
/dev/sda3          98304     324     97980    1% /boot
tmpfs           19800690      14  19800676    1% /run/user/1000
~~~
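
For reference, the free-inode percentage can be recomputed from the `df -i` columns above. This is only a sketch: the helper name `inode_free_pct` is made up for illustration, and the commented-out live query assumes GNU coreutils `df` with `--output` support.

```shell
# Percentage of free inodes, given IFree and Inodes as printed by `df -i`.
inode_free_pct() {
    awk -v f="$1" -v t="$2" \
        'BEGIN { if (t > 0) printf "%.2f\n", f * 100 / t; else print "n/a" }'
}

# Values copied from the /tmp row above (IFree=5327, Inodes=1048576).
inode_free_pct 5327 1048576   # prints 0.51

# On a live node the same figures could be read directly (GNU df assumed):
#   set -- $(df --output=iavail,itotal /tmp | tail -n 1)
#   inode_free_pct "$1" "$2"
```

The result (about 0.51% free) is in the same range as the 0.56% quoted by the alert, which was sampled at a different time.
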
~~~
# du -sk /tmp/* | sort -n
0       /tmp/systemd-private-517ea969a1204213bc60ff53f560b62b-chronyd.service-ILAovQ
0       /tmp/systemd-private-517ea969a1204213bc60ff53f560b62b-dbus-broker.service-RntUTD
0       /tmp/systemd-private-517ea969a1204213bc60ff53f560b62b-systemd-logind.service-wnJ3Rs
4       /tmp/exec-process-115098596
4       /tmp/exec-process-1335433519
4       /tmp/exec-process-1456886553
4       /tmp/exec-process-1539030326
4       /tmp/exec-process-1919139384
4       /tmp/exec-process-1927989415
4       /tmp/exec-process-2114726635
4       /tmp/exec-process-2398372778
4       /tmp/exec-process-2695446883
4       /tmp/exec-process-3143928529
4       /tmp/exec-process-3492991084
4       /tmp/exec-process-568241004
4       /tmp/exec-process-670442358
~~~
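
Note that `du -sk /tmp/*` reports size rather than entry count, and the `/tmp/*` glob does not match dotfiles. Since the alert is about inodes, counting entries per top-level item may be more telling. The sketch below is one way to do that; `count_entries_per_top` is a hypothetical helper name, and file names containing newlines would be over-counted.

```shell
# Count filesystem entries under each top-level item of a directory
# (dotfiles included), largest first. Output: "<count><TAB><path>".
count_entries_per_top() {
    for entry in "$1"/* "$1"/.[!.]* "$1"/..?*; do
        # Skip glob patterns that matched nothing.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        printf '%s\t%s\n' "$(find "$entry" -xdev | wc -l | tr -d ' ')" "$entry"
    done | sort -rn
}

# Example usage on the affected node (run as root):
#   count_entries_per_top /tmp | head -n 20
#   find /tmp -xdev | wc -l      # total entries visible on the mount
```

If the total number of visible entries is far below the IUsed value from `df -i`, the remaining inodes are held by something that is not visible in the directory tree.
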
~~~
# lsof +L1 /tmp
COMMAND       PID USER   FD   TYPE DEVICE SIZE/OFF NLINK   NODE NAME
dbus-brok 3293159 core   12u   REG    0,1  2097152     0 150428 /memfd:dbus-broker-log (deleted)
bash      3457170 core  cwd    DIR   0,45      100     5      1 /tmp
lsof      3483980 core  cwd    DIR   0,45      100     5      1 /tmp
lsof      3483981 core  cwd    DIR   0,45      100     5      1 /tmp
~~~
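
As a cross-check of the `lsof +L1` output, the open descriptors of all processes can also be scanned through /proc for targets that are unlinked files under /tmp. This is a minimal sketch, not a supported tool: the function names are hypothetical, and it must run as root on the host to see every process.

```shell
# Succeed if a /proc/<pid>/fd link target ($1) is an unlinked file
# under the mount point ($2).
is_deleted_under() {
    case $1 in
        "$2"/*" (deleted)") return 0 ;;
        *) return 1 ;;
    esac
}

# Print "PID FD TARGET" for each open descriptor that still references
# an unlinked file under the given mount point (default: /tmp).
list_deleted_open() {
    mnt=${1:-/tmp}
    for fd in /proc/[0-9]*/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        if is_deleted_under "$target" "$mnt"; then
            pid=${fd#/proc/}
            pid=${pid%%/*}
            printf '%s %s %s\n' "$pid" "${fd##*/}" "$target"
        fi
    done
}

# Example usage: list_deleted_open /tmp | sort -n | uniq -c -w 8
```

An empty result would mean no process on the host currently holds a deleted file under /tmp open.
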

Version-Release number of selected component (if applicable):

4.15.39

Actual results:

Inode usage on /tmp remains at 100% as rounded by df -i (about 0.5% of inodes free).
As a result, the NodeFilesystemAlmostOutOfFiles alert keeps firing.

Expected results:

The inodes should be freed and the NodeFilesystemAlmostOutOfFiles alert should clear.

Additional info:

Kindly investigate and confirm:
- Whether this is a known issue with inode handling in tmpfs under OpenShift/CRI-O.
- Whether a patch or cleanup mechanism can be introduced to prevent inode leaks from deleted files held by orphaned processes.
- Whether tmpfs size/inode settings should be adjusted at mount time in the base OS or cluster configuration.
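
Regarding the last question: tmpfs accepts an `nr_inodes` mount option, and it can be changed on remount. The sketch below only prints the command rather than running it. The helper name `tmpfs_remount_cmd` and the value 2097152 (twice the current limit shown by `df -i`) are illustrative assumptions, not a recommendation.

```shell
# Build (but do not run) a remount command that changes the inode
# limit of a tmpfs mount. $1 = mount point, $2 = new nr_inodes value.
tmpfs_remount_cmd() {
    printf 'mount -o remount,nr_inodes=%s %s\n' "$2" "$1"
}

# Hypothetical example: double the current limit of 1048576 on /tmp.
tmpfs_remount_cmd /tmp 2097152
# prints: mount -o remount,nr_inodes=2097152 /tmp
```

A remount like this would not persist across a reboot, and raising the limit only postpones exhaustion if inodes keep accumulating.
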

Note:
The customer is unable to collect an sos report; starting toolbox fails with the following error:
~~~
# toolbox
Checking if there is a newer version of registry.redhat.io/rhel9/support-tools available...
Spawning a container 'toolbox-root' with image 'registry.redhat.io/rhel9/support-tools'
Detected RUN label in the container image. Using that as the default...
301a31cf1c1941efe5de45d51846e6ddafccd5622605237a24685f63e1cd89fe
Error: unable to start container "301a31cf1c1941efe5de45d51846e6ddafccd5622605237a24685f63e1cd89fe": container create failed (no logs from conmon): conmon bytes "": readObjectStart: expect { or n, but found , error found in #0 byte of ...||..., bigger context ...||...
/bin/toolbox: failed to start container 'toolbox-root'
~~~

Node debug is also failing:
~~~
# oc debug node/ocp-wrk2.ocp.int.teb.com.tr
Temporary namespace openshift-debug-rb985 is created for debugging node...
Starting pod/ocp-wrk2ocpinttebcomtr-debug-n8fcw ...
To use host binaries, run `chroot /host`
warning: Container container-00 is unable to start due to an error: error reading container (probably exited) json message: EOF
^C
Removing debug pod ...
warning: Container container-00 is unable to start due to an error: error reading container (probably exited) json message: EOF
~~~

links to

KCS 7067915: The /tmpfs filesystem has no inode space available on node in RHOCP 4

KCS 7126255: NodeFilesystemAlmostOutOfFiles alert troubleshooting in OpenShift Container Platform 4

Assignee: Kevin Hannon

Reporter: Suruchi Dharma

QA Contact: Min Li

Need Info From: Kevin Hannon

Votes: 0

Watchers: 8

Created: 2025/08/04 6:13 PM

Updated: 2026/01/26 5:01 PM

Resolved: 2026/01/02 5:47 PM
