- Bug
- Resolution: Unresolved
- Major
- rhel-9.2.0
- kernel-5.14.0-512.el9
- None
- Important
- 2
- rhel-sst-networking-core
- ssg_networking
- 11
- 17
- 5
- False
- None
- CK-May-2024, NST Kernel 2024/w41 - 2024/w44
- If docs needed, set a value
- x86_64
- None
Description of problem:
I am seeing OSLAT spikes above 20 usec. This is quite reproducible.
The spikes are due to a timer waking up on isolated cores:
oslat-79159 [060] d...2.. 2480.504049: sched_switch: prev_comm=oslat prev_pid=79159 prev_prio=98 prev_state=R ==> next_comm=ktimers/60 next_pid=573 next_prio=88
<...>-573 [060] ...s.12 2480.504050: softirq_entry: vec=1 [action=TIMER]
<...>-573 [060] d..s113 2480.504051: timer_cancel: timer=00000000276551ad
<...>-573 [060] ...s.13 2480.504051: timer_expire_entry: timer=00000000276551ad function=tw_timer_handler now=4297146368 baseclk=4297146368
<...>-573 [060] ...s.13 2480.504054: timer_expire_exit: timer=00000000276551ad
<...>-573 [060] ...s.12 2480.504054: softirq_exit: vec=1 [action=TIMER]
<...>-573 [060] ...s.12 2480.504054: softirq_entry: vec=7 [action=SCHED]
<...>-573 [060] ...s.12 2480.504056: softirq_exit: vec=7 [action=SCHED]
<...>-573 [060] d...2.. 2480.504057: sched_switch: prev_comm=ktimers/60 prev_pid=573 prev_prio=88 prev_state=S ==> next_comm=ksoftirqd/60 next_pid=574 next_prio=88
<...>-574 [060] d...2.. 2480.504059: sched_switch: prev_comm=ksoftirqd/60 prev_pid=574 prev_prio=88 prev_state=S ==> next_comm=oslat next_pid=79159 next_prio=98
oslat-79159 [060] ....... 2480.504075: tracing_mark_write: oslat: Trace threshold (20 us) triggered with 26 us!
oslat-79159 [060] d...4.. 2480.504078: sched_wakeup: comm=oslat pid=79155 prio=98 target_cpu=010
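To make the excerpt above easier to read, here is a minimal Python sketch that measures how long oslat was switched out on a CPU and which timer callbacks ran during that window. It is not part of the original report. The helper names (`parse`, `preemption_windows`) are hypothetical, the regular expression assumes the ftrace text layout shown in the excerpt, and the sample holds only four of the quoted lines.

```python
import re

# Four lines copied from the trace excerpt above (CPU 60 only).
SAMPLE = """\
oslat-79159 [060] d...2.. 2480.504049: sched_switch: prev_comm=oslat prev_pid=79159 prev_prio=98 prev_state=R ==> next_comm=ktimers/60 next_pid=573 next_prio=88
<...>-573 [060] ...s.13 2480.504051: timer_expire_entry: timer=00000000276551ad function=tw_timer_handler now=4297146368 baseclk=4297146368
<...>-573 [060] d...2.. 2480.504057: sched_switch: prev_comm=ktimers/60 prev_pid=573 prev_prio=88 prev_state=S ==> next_comm=ksoftirqd/60 next_pid=574 next_prio=88
<...>-574 [060] d...2.. 2480.504059: sched_switch: prev_comm=ksoftirqd/60 prev_pid=574 prev_prio=88 prev_state=S ==> next_comm=oslat next_pid=79159 next_prio=98
"""

# Assumed layout: task-pid [cpu] flags timestamp: event: key=value ...
LINE_RE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+(?P<flags>\S+)\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<event>\w+):\s*(?P<rest>.*)$"
)


def parse(lines):
    """Yield one dict per trace line that matches the assumed layout."""
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue
        sec, frac = m["ts"].split(".")
        yield {
            "cpu": int(m["cpu"]),
            # Integer microseconds avoid float rounding on the timestamps.
            "ts_us": int(sec) * 1_000_000 + int(frac.ljust(6, "0")[:6]),
            "event": m["event"],
            "fields": dict(re.findall(r"(\w+)=(\S+)", m["rest"])),
        }


def preemption_windows(events, comm="oslat"):
    """Yield (cpu, off_cpu_us, timer_functions) each time comm is switched
    out and later switched back in on the same CPU."""
    start = {}   # cpu -> timestamp at which comm was switched out
    timers = {}  # cpu -> timer callbacks seen while comm was off-CPU
    for ev in events:
        cpu, fields = ev["cpu"], ev["fields"]
        if ev["event"] == "sched_switch":
            if fields.get("prev_comm") == comm:
                start[cpu] = ev["ts_us"]
                timers[cpu] = []
            elif fields.get("next_comm") == comm and cpu in start:
                yield cpu, ev["ts_us"] - start.pop(cpu), timers.pop(cpu)
        elif ev["event"] == "timer_expire_entry" and cpu in start:
            timers[cpu].append(fields.get("function", "?"))


for cpu, off_us, funcs in preemption_windows(parse(SAMPLE.splitlines())):
    print(f"cpu {cpu}: oslat off-CPU for {off_us} us, timers: {', '.join(funcs)}")
# prints: cpu 60: oslat off-CPU for 10 us, timers: tw_timer_handler
```

Run against the full attached trace, the same loop would list every timer callback that preempted oslat, which may help identify which subsystem armed the timer.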
BOOT_IMAGE=(hd0,gpt3)/ostree/rhcos-a9e8439864eb99b1974346c32fcd39c1b98563f7bc525ad6a13d4751aefc09fc/vmlinuz-5.14.0-284.23.1.rt14.308.el9_2.x86_64 ignition.platform.id=metal ostree=/ostree/boot.0/rhcos/a9e8439864eb99b1974346c32fcd39c1b98563f7bc525ad6a13d4751aefc09fc/0 root=UUID=cfbc2768-6425-4d00-9d48-21af39937a31 rw rootflags=prjquota boot=UUID=cc388c72-9224-4f92-979c-50e59f30384c systemd.unified_cgroup_hierarchy=0 systemd.legacy_systemd_cgroup_controller=1 crashkernel=512M skew_tick=1 nohz=on rcu_nocbs=2-55,58-111 tuned.non_isolcpus=03000000,00000003 systemd.cpu_affinity=0,1,56,57 intel_iommu=on iommu=pt isolcpus=managed_irq,2-55,58-111 nohz_full=2-55,58-111 nosoftlockup nmi_watchdog=0 mce=off rcutree.kthread_prio=11 default_hugepagesz=1G hugepagesz=1G hugepages=32 rcupdate.rcu_normal_after_boot=0 efi=runtime module_blacklist=irdma intel_pstate=disable tsc=reliable
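As a cross-check that the CPU in the trace (60) really is one of the isolated cores, the following Python sketch expands the CPU lists from the kernel command line above. It is an illustration only. The helper names are hypothetical, `CMDLINE` is a shortened excerpt of the full command line, and the parser handles only plain `N` and `N-M` tokens (not the stride syntax the kernel also accepts).

```python
# Shortened excerpt of the kernel command line quoted above.
CMDLINE = (
    "skew_tick=1 nohz=on rcu_nocbs=2-55,58-111 "
    "isolcpus=managed_irq,2-55,58-111 nohz_full=2-55,58-111 nosoftlockup"
)


def cmdline_params(cmdline):
    """Return the key=value parameters of a kernel command line as a dict."""
    params = {}
    for word in cmdline.split():
        key, sep, value = word.partition("=")
        if sep:
            params[key] = value
    return params


def parse_cpu_list(spec):
    """Expand a list such as 'managed_irq,2-55,58-111' into a set of CPU ids.

    Tokens that do not start with a digit (isolcpus flags such as
    'managed_irq') are skipped.
    """
    cpus = set()
    for token in spec.split(","):
        if not token or not token[0].isdigit():
            continue
        lo, _, hi = token.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


params = cmdline_params(CMDLINE)
for key in ("isolcpus", "nohz_full", "rcu_nocbs"):
    cpus = parse_cpu_list(params[key])
    print(f"{key}: {len(cpus)} CPUs, CPU 60 included: {60 in cpus}")
```

With the values from this report, CPU 60 falls inside the 58-111 range of all three parameters, so the timer softirq in the trace ran on a core that is meant to be isolated.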
Full trace attached
Version-Release number of selected component (if applicable):
OCP: 4.13.5
kernel: 5.14.0-284.23.1.rt14.308.el9_2.x86_64
How reproducible:
Cannot get through a 1hr run without hitting this
Steps to Reproduce:
1. Run OSLAT on a SNO (single-node OpenShift) cluster.
Actual results:
Expected results:
Additional info:
- external trackers
- links to: RHSA-2024:138410 kernel bug fix and enhancement update