Bug | Resolution: Unresolved | Normal | Target: 4.19.z | Important
Sprints: Node Green Sprint 280, OCP Node Core Sprint 282
Description of problem:
I am investigating an odd issue with the kubelet that appears to have been introduced in v4.19 between specific z-streams (v4.19.14 --> v4.19.18). The issue affects only bare-metal nodes with very large capacity (120+ CPUs and a lot of RAM). Whenever the customer deploys ~700 pods simultaneously, the kubelet tries to mount 2500+ secrets/configmaps at the same time, which creates such a high CPU load that the node becomes unusable. At first we thought this was a kernel issue, but collaboration with the kernel team shows that some change is probably causing a huge number of processes to stay in D state, saturating the CPUs and leaving the node unresponsive. The kernel team's bottom-line comment is below:
This shows that the pods together account for 2345 mounts (mostly tmpfs secret/projected volumes), which is a primary factor inducing the shrinker_rwsem contention. With hundreds of pods and their thousands of tmpfs mounts, it is quite natural that shrinker_rwsem becomes a hot contention point. The issue is more likely a workload and scaling problem in the OCP environment, rather than a kernel bug.
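For reference, the mount count quoted by the kernel team can be cross-checked on an affected node by scanning /proc/mounts for the kubelet's per-pod tmpfs volumes. The sketch below is a minimal example, assuming the default kubelet root directory /var/lib/kubelet; it only counts mounts, it does not reproduce the contention itself.

```go
// countmounts.go: count tmpfs secret/projected volume mounts per pod by
// parsing /proc/mounts on the node. Assumes the default kubelet root
// directory /var/lib/kubelet (mount points would differ if --root-dir is customized).
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	f, err := os.Open("/proc/mounts")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	total := 0
	perPod := map[string]int{} // pod UID -> number of secret/projected mounts

	sc := bufio.NewScanner(f)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 3 || fields[2] != "tmpfs" {
			continue
		}
		mountPoint := fields[1]
		if !strings.Contains(mountPoint, "/kubernetes.io~secret/") &&
			!strings.Contains(mountPoint, "/kubernetes.io~projected/") {
			continue
		}
		total++
		// Mount points look like:
		// /var/lib/kubelet/pods/<uid>/volumes/kubernetes.io~secret/<name>
		parts := strings.Split(mountPoint, "/")
		for i, p := range parts {
			if p == "pods" && i+1 < len(parts) {
				perPod[parts[i+1]]++
				break
			}
		}
	}
	fmt.Printf("pods with secret/projected tmpfs mounts: %d\n", len(perPod))
	fmt.Printf("total secret/projected tmpfs mounts:     %d\n", total)
}
```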
The openshift-kubelet package version that changed between these two z-streams is:
openshift-kubelet 4.19.0-202509122308.p2.g335be3a.assembly.stream.el9 → 4.19.0-202510101528.p2.gf94ad89.assembly.stream.el9
Important Notes:
- We were able to mitigate the issue by downgrading these workers' CoreOS image to v4.19.14 with a MachineConfig.
- Extensive analysis of the issue from the kernel team is in the attached case, along with vmcores and sosreports from these nodes.
- The same issue is not visible on the VM nodes that are also part of the cluster, but I am not yet sure about their capacity or the deployment volume on them; I can ask if required.
Version-Release number of selected component (if applicable):
4.19.0-202510101528.p2.gf94ad89.assembly.stream.el9
How reproducible:
- Upgrade to v4.19.18
- Deploy ~700 pods simultaneously on the node (see the sketch below)
- Watch the node load rise until the node becomes completely unresponsive
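For completeness, here is a rough client-go sketch of the pod fan-out step. The namespace, image, secret names, and per-pod secret count are placeholders and are not taken from the customer environment; the secrets are assumed to exist beforehand. The point is simply that ~700 pods, each mounting a handful of Secrets, produces a few thousand tmpfs mounts on the target node.

```go
// podflood.go: create N pods, each mounting a few Secret volumes, to
// approximate the customer's simultaneous deployment. All names and counts
// here are illustrative placeholders.
package main

import (
	"context"
	"fmt"
	"os"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

const (
	namespace     = "mount-repro" // placeholder namespace
	podCount      = 700           // ~700 pods, as in the report
	secretsPerPod = 4             // a few secrets each -> thousands of tmpfs mounts
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG"))
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	for i := 0; i < podCount; i++ {
		pod := &corev1.Pod{
			ObjectMeta: metav1.ObjectMeta{
				Name:      fmt.Sprintf("repro-%d", i),
				Namespace: namespace,
			},
			Spec: corev1.PodSpec{
				NodeName: os.Getenv("TARGET_NODE"), // pin to the affected bare-metal node
				Containers: []corev1.Container{{
					Name:    "sleep",
					Image:   "registry.access.redhat.com/ubi9/ubi-minimal",
					Command: []string{"sleep", "infinity"},
				}},
			},
		}
		// Each secret volume becomes one tmpfs mount on the node.
		for s := 0; s < secretsPerPod; s++ {
			name := fmt.Sprintf("repro-secret-%d", s) // pre-created placeholder Secrets
			pod.Spec.Volumes = append(pod.Spec.Volumes, corev1.Volume{
				Name: name,
				VolumeSource: corev1.VolumeSource{
					Secret: &corev1.SecretVolumeSource{SecretName: name},
				},
			})
			pod.Spec.Containers[0].VolumeMounts = append(pod.Spec.Containers[0].VolumeMounts,
				corev1.VolumeMount{Name: name, MountPath: "/etc/repro/" + name})
		}
		if _, err := client.CoreV1().Pods(namespace).Create(context.TODO(), pod, metav1.CreateOptions{}); err != nil {
			fmt.Fprintf(os.Stderr, "pod %d: %v\n", i, err)
		}
	}
}
```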
Actual results:
- The node becomes completely unresponsive
Expected results:
- The node should not become unresponsive
Additional info:
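As a quick way to quantify the "processes stuck in D state" observation while the load climbs, the following minimal sketch (not part of the kernel team's tooling) counts uninterruptible threads by scanning /proc:

```go
// dstate.go: count threads currently in uninterruptible sleep (D state) by
// scanning /proc/<pid>/task/<tid>/stat on the node.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

func main() {
	// One stat file per thread; processes may exit between Glob and ReadFile.
	statFiles, _ := filepath.Glob("/proc/[0-9]*/task/[0-9]*/stat")
	inD := 0
	for _, p := range statFiles {
		data, err := os.ReadFile(p)
		if err != nil {
			continue
		}
		// Field 3 of the stat file is the state; the comm field (2) is in
		// parentheses and may contain spaces, so parse after the closing ')'.
		s := string(data)
		idx := strings.LastIndex(s, ")")
		if idx < 0 || idx+2 >= len(s) {
			continue
		}
		fields := strings.Fields(s[idx+1:])
		if len(fields) > 0 && fields[0] == "D" {
			inD++
		}
	}
	fmt.Printf("threads in D (uninterruptible) state: %d\n", inD)
}
```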