RHEL / RHEL-8802

unnecessary kernel IPIs break through cpu isolation


      Description of problem:

When CPU isolation is used, such as via the cpu-partitioning tuned profile, it is possible for isolated CPUs to be interrupted by kernel IPIs initiated on non-isolated CPUs. This can happen in many different ways; a few have been diagnosed using the rt-trace-bpf tool:
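Independent of rt-trace-bpf, the stock ftrace tracepoints in the irq_vectors event group can confirm that function-call IPIs are landing on the isolated CPUs. A minimal sketch, assuming x86, root, tracefs mounted at /sys/kernel/debug/tracing (the RHEL 8 default), and an example isolated-CPU set of 2-23:

```shell
# Sketch: watch function-call IPIs arriving on the isolated CPUs via ftrace.
cd /sys/kernel/debug/tracing

# Restrict tracing to the isolated CPUs; tracing_cpumask takes a hex mask
# (example: CPUs 2-23 on a 24-CPU machine -> 0xfffffc).
echo fffffc > tracing_cpumask

# These events fire on the target CPU when the IPI is serviced.
echo 1 > events/irq_vectors/call_function_entry/enable
echo 1 > events/irq_vectors/call_function_single_entry/enable

# Stream events; any output here is an IPI breaking through the isolation.
cat trace_pipe
```

This only shows that the IPIs arrive; identifying the sender requires a tool like rt-trace-bpf, which captures the originating stack, as in the traces below.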

      caused by NetworkManager:
      64359.052209596    NetworkManager       0    1405     smp_call_function_many_cond (cpu=0, func=do_kernel_range_flush)
              smp_call_function_many_cond+0x1
              smp_call_function+0x39
              on_each_cpu+0x2a
              flush_tlb_kernel_range+0x7b
              __purge_vmap_area_lazy+0x70
              _vm_unmap_aliases.part.42+0xdf
              change_page_attr_set_clr+0x16a
              set_memory_ro+0x26
              bpf_int_jit_compile+0x2f9
              bpf_prog_select_runtime+0xc6
              bpf_prepare_filter+0x523
              sk_attach_filter+0x13
              sock_setsockopt+0x92c
              __sys_setsockopt+0x16a
              __x64_sys_setsockopt+0x20
              do_syscall_64+0x87
              entry_SYSCALL_64_after_hwframe+0x65

      caused by the mgag200 kernel module:
      238903.096535737   kworker/0:1          0    88579    smp_call_function_many_cond (cpu=0, func=do_flush_tlb_all)
              smp_call_function_many_cond+0x1
              smp_call_function+0x39
              on_each_cpu+0x2a
              flush_tlb_kernel_range+0x48
              __purge_vmap_area_lazy+0x70
              free_vmap_area_noflush+0xf2
              remove_vm_area+0x93
              __vunmap+0x59
              drm_gem_shmem_vunmap+0x6d
              mgag200_handle_damage+0x62
              mgag200_simple_display_pipe_update+0x69
              drm_atomic_helper_commit_planes+0xb3
              drm_atomic_helper_commit_tail+0x26
              commit_tail+0xc6
              drm_atomic_helper_commit+0x103
              drm_atomic_helper_dirtyfb+0x20e
              drm_fb_helper_damage_work+0x228
              process_one_work+0x18f
              worker_thread+0x30
              kthread+0x15d
              ret_from_fork+0x1f

      Tracing on the isolated CPUs shows preemptions such as this:

      58118.769286 | 18) <...>-128143 |             | smp_call_function_interrupt() {
      58118.769286 | 18) <...>-128143 |             | irq_enter() {
      58118.769287 | 18) <...>-128143 |   0.101 us  | preempt_count_add();
      58118.769288 | 18) <...>-128143 |   0.968 us  | }
      58118.769288 | 18) <...>-128143 |             | generic_smp_call_function_single_interrupt() {
      58118.769289 | 18) <...>-128143 |             | flush_smp_call_function_queue() {
      58118.769289 | 18) <...>-128143 |             | do_flush_tlb_all() {
      58118.769290 | 18) <...>-128143 |   0.453 us  | native_flush_tlb_global();
      58118.769291 | 18) <...>-128143 |   1.439 us  | }
      58118.769292 | 18) <...>-128143 |   2.402 us  | }
      58118.769292 | 18) <...>-128143 |   3.223 us  | }
      58118.769292 | 18) <...>-128143 |             | irq_exit() {
      58118.769293 | 18) <...>-128143 |   0.077 us  | preempt_count_sub();
      58118.769294 | 18) <...>-128143 |   0.201 us  | idle_cpu();
      58118.769295 | 18) <...>-128143 |             | tick_nohz_irq_exit() {
      58118.769295 | 18) <...>-128143 |   0.164 us  | ktime_get();
      58118.769296 | 18) <...>-128143 |             | __tick_nohz_full_update_tick() {
      58118.769296 | 18) <...>-128143 |   0.079 us  | check_tick_dependency();
      58118.769297 | 18) <...>-128143 |   0.074 us  | check_tick_dependency();
      58118.769298 | 18) <...>-128143 |   0.070 us  | check_tick_dependency();
      58118.769299 | 18) <...>-128143 |   0.101 us  | check_tick_dependency();
      58118.769300 | 18) <...>-128143 |   1.458 us  | tick_nohz_next_event();
      58118.769302 | 18) <...>-128143 |   0.082 us  | tick_nohz_stop_tick();
      58118.769303 | 18) <...>-128143 |   6.229 us  | }
      58118.769303 | 18) <...>-128143 |   8.124 us  | }
      58118.769303 | 18) <...>-128143 | + 10.872 us | }
      58118.769304 | 18) <...>-128143 | + 17.471 us | }
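The excerpt above is ftrace function_graph output. A sketch of one way to collect a similar trace, assuming root, tracefs at /sys/kernel/debug/tracing, and isolated CPU 18 as in the excerpt:

```shell
# Sketch: capture a function_graph trace of IPI servicing on one isolated CPU.
cd /sys/kernel/debug/tracing

echo 0 > tracing_on
echo function_graph > current_tracer

# Trace only CPU 18 (tracing_cpumask is a hex mask: bit 18 -> 0x40000).
echo 40000 > tracing_cpumask

# Start graphing from the IPI entry point so the output matches the excerpt.
echo smp_call_function_interrupt > set_graph_function

echo 1 > tracing_on
sleep 10          # let the workload run while tracing
echo 0 > tracing_on
cat trace
```

The graph makes the cost visible: beyond the flush itself, irq_exit()/tick_nohz_irq_exit() housekeeping accounts for most of the 17 us spent in the interrupt.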

      Version-Release number of selected component (if applicable):

      4.18.0-348.12.2.rt7.143.el8_5.x86_64

      How reproducible:

      Easily

      Steps to Reproduce:
      1. Boot the system using an RT kernel and the cpu-partitioning tuned profile
      2. Run a workload that measures latency, such as oslat, on the isolated CPUs
      3. Trace the kernel activity on the isolated CPUs while the workload is running
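The steps above can be sketched as shell commands. The CPU list is an example, the variables file path is the tuned default on RHEL 8, and the oslat flag names should be checked against `oslat --help` for the installed rt-tests version:

```shell
# 1. Isolate CPUs 2-23 and apply the cpu-partitioning profile, then reboot
#    into the RT kernel so the kernel command-line changes take effect.
echo 'isolated_cores=2-23' >> /etc/tuned/cpu-partitioning-variables.conf
tuned-adm profile cpu-partitioning

# 2. After reboot, run a latency-measuring workload pinned to the isolated
#    CPUs (oslat ships in the rt-tests package).
oslat --cpu-list 2-23 --duration 600 --rtprio 1

# 3. In parallel, trace kernel activity on those CPUs, e.g. with ftrace
#    or the rt-trace-bpf tool.
```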

      Actual results:

      Latency spikes caused by IPI processing will be observed on the isolated CPUs when there is no need to handle the IPI at that moment.

      Expected results:

No needless IPI processing should occur on the isolated CPUs. For a 100% userspace workload such as oslat, there is no need to enter the kernel and service the IPI until a necessary kernel entry occurs anyway (i.e., a system call, timer interrupt, etc.), at which point the deferred work could be performed.

      Additional info:

              rh-ee-vschneid Valentin Schneider
              krister@redhat.com Karl Rister (Inactive)
              Qiao Zhao