Bug · Resolution: Unresolved · Major · 4.16.z · Quality / Stability / Reliability · Ready to Pick
Description of problem:
We noticed on an OpenShift environment that processes approaching their memory limit would suddenly start consuming most of the CPU available on the host in kernel code paths. This made the host where the process was running unresponsive and caused an outage. When we first investigated, this was the stack trace on CPU most of the time. The same behaviour is described in https://clickhouse.com/blog/a-case-of-the-vanishing-cpu-a-linux-kernel-debugging-story and https://issuetracker.google.com/issues/363324206. It is hit with both cgroup v1 and v2:

native_queued_spin_lock_slowpath
_raw_spin_lock
__remove_mapping
shrink_folio_list
shrink_inactive_list
shrink_lruvec
shrink_node_memcgs
shrink_node
shrink_zones.constprop.0
do_try_to_free_pages
try_to_free_mem_cgroup_pages
try_charge_memcg
charge_memcg
__mem_cgroup_charge
__filemap_add_folio
filemap_add_folio
page_cache_ra_unbounded
do_sync_mmap_readahead
filemap_fault
__do_fault
do_read_fault
do_pte_missing
__handle_mm_fault
handle_mm_fault
do_user_addr_fault
exc_page_fault
asm_exc_page_fault
[Missed User Stack]

Version: OpenShift 4.16+
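For anyone trying to confirm the same condition, a minimal sketch of how to capture the on-CPU kernel stacks is a short perf profile taken on the affected node while %sys is high (this assumes the perf package is installed on the node; the sampling rate and duration below are only example values):

# Sample all CPUs with kernel call graphs for 30 seconds while the symptom is occurring
$ perf record -F 99 -a -g -- sleep 30
# Summarise the captured stacks; the reclaim contention shows up as
# native_queued_spin_lock_slowpath under shrink_* / try_charge_memcg frames
$ perf report --stdio --no-children | head -n 100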
How reproducible:
Always in the customer environment.
Steps to Reproduce:
Run the following as root. With cgroup v1 the issue is hit quickly; with cgroup v2 it can take some time to surface.

1. Download the reproducer from https://github.com/serxa/stress_memcg/releases/download/v1.0.0/stress_memcg_x86-64
2. Create the working directory and start the reproducer under a 1G memory limit:

$ mkdir -p /root/files
$ systemd-run --scope -p MemoryMax=1G ./stress_memcg_x86-64 1000 1000 3000000000 4000000000 /root/files 30000

After a short while this should start using a lot of %sys CPU time, and stack traces will show that it is memory reclaim due to mapped file faults.
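A quick way to check which cgroup version the node is running, and to watch the symptom develop, is sketched below (standard tooling, not part of the reproducer itself; mpstat requires the sysstat package, top works as well):

# cgroup2fs means cgroup v2, tmpfs means cgroup v1
$ stat -fc %T /sys/fs/cgroup
# Watch %sys climb once reclaim starts spinning
$ mpstat 1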
Actual results:
Host CPUs are saturated in %sys time, spinning in native_queued_spin_lock_slowpath during memcg reclaim, and the node becomes unresponsive.
Expected results:
Memory pressure inside a single cgroup should not drive host-wide kernel CPU usage or make the node unresponsive.
Additional info: