Bug
Resolution: Unresolved
Major
rhel-9.2.0
kernel-5.14.0-430.el9
None
Medium
sst_kernel_rts
ssg_core_kernel
2
6
5
False
None
None
Pass
RegressionOnly
If docs needed, set a value
x86_64
None
Description of problem:
On preempt-rt, after updating the SMP affinity of a network interrupt, the corresponding IRQ thread is not re-affined immediately. The update happens only when the next interrupt is received, at which point the thread is migrated to the proper core.
<idle>-0 [064] d...2.. 504437.282162: sched_switch: prev_comm=swapper/64 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=oslat next_pid=2563104 next_prio=98
<...>-2563104 [064] dn.h... 504439.963379: reschedule_entry: vector=253
<...>-2563104 [064] dn.h... 504439.963380: reschedule_exit: vector=253
<...>-2563104 [064] d...2.. 504439.963382: sched_switch: prev_comm=oslat prev_pid=2563104 prev_prio=98 prev_state=R ==> next_comm=irq/316-ice-eno next_pid=5596 next_prio=49
<...>-5596 [064] dn..4.. 504439.963388: sched_wakeup: comm=migration/64 pid=607 prio=0 target_cpu=064
<...>-5596 [064] d...2.. 504439.963389: sched_switch: prev_comm=irq/316-ice-eno prev_pid=5596 prev_prio=49 prev_state=R+ ==> next_comm=migration/64 next_pid=607 next_prio=0
<...>-607 [064] d...2.. 504439.963395: sched_switch: prev_comm=migration/64 prev_pid=607 prev_prio=0 prev_state=S ==> next_comm=oslat next_pid=2563104 next_prio=98
<...>-2563104 [064] ....... 504439.963410: tracing_mark_write: oslat: Trace threshold (20 us) triggered with 20 us!
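To make the cost visible in a trace like the one above, the following is a minimal sketch that measures how long a task was switched out between consecutive sched_switch events. The function name `off_cpu_intervals_us` and the regular expression are illustrative assumptions, not part of any tracing tool, and the two sample lines are copied from the trace in this report.

```python
import re

# Captures the timestamp (seconds, microseconds) and the prev/next task names
# of a sched_switch event in the ftrace text format shown in this report.
SWITCH_RE = re.compile(
    r"(\d+)\.(\d{6}): sched_switch: prev_comm=(\S+) .*?==> next_comm=(\S+)"
)


def off_cpu_intervals_us(lines, comm):
    """Return, in microseconds, each interval during which `comm` was switched out."""
    intervals = []
    switched_out_at = None
    for line in lines:
        match = SWITCH_RE.search(line)
        if not match:
            continue  # not a sched_switch event
        # Integer microseconds avoid floating-point rounding on large timestamps.
        now_us = int(match.group(1)) * 1_000_000 + int(match.group(2))
        prev_comm, next_comm = match.group(3), match.group(4)
        if prev_comm == comm:
            switched_out_at = now_us
        elif next_comm == comm and switched_out_at is not None:
            intervals.append(now_us - switched_out_at)
            switched_out_at = None
    return intervals


# Two sched_switch lines taken from the trace in this report.
SAMPLE = [
    "<...>-2563104 [064] d...2.. 504439.963382: sched_switch: prev_comm=oslat "
    "prev_pid=2563104 prev_prio=98 prev_state=R ==> next_comm=irq/316-ice-eno "
    "next_pid=5596 next_prio=49",
    "<...>-607 [064] d...2.. 504439.963395: sched_switch: prev_comm=migration/64 "
    "prev_pid=607 prev_prio=0 prev_state=S ==> next_comm=oslat "
    "next_pid=2563104 next_prio=98",
]

if __name__ == "__main__":
    print(off_cpu_intervals_us(SAMPLE, "oslat"))  # [13]
```

The 13 us covers only the gap between the two sched_switch events; the 20 us reported by oslat also includes time outside that window, such as the reschedule interrupt handling.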
This delayed migration is problematic if the interrupt was originally affined to an isolated core, as the migration will cause a latency spike there. In a real deployment this is less of an issue, as the situation heals itself after 1 interrupt. It is problematic for benchmark testing with oslat etc., as this can cause spikes that potentially exceed the threshold.
Ideally, the thread would be re-affined when the SMP affinity for the IRQ is updated.
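One way to observe the stale state described above is to compare the IRQ's configured affinity with the CPUs its thread is still allowed to run on. The following is a minimal sketch, assuming the standard procfs files `/proc/irq/<irq>/smp_affinity_list` and the `Cpus_allowed_list` field of `/proc/<pid>/status`; the helper names are hypothetical, and the parser handles only plain ranges and single CPUs.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0-3,64' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty list or trailing comma
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def irq_affinity(irq):
    """CPUs the IRQ is currently affined to, as reported by procfs."""
    return parse_cpu_list(Path(f"/proc/irq/{irq}/smp_affinity_list").read_text())


def thread_affinity(pid):
    """CPUs the given thread is currently allowed to run on."""
    for line in Path(f"/proc/{pid}/status").read_text().splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return parse_cpu_list(line.split(":", 1)[1])
    raise ValueError(f"no Cpus_allowed_list entry for pid {pid}")


def stale_cpus(irq_cpus, thread_cpus):
    """CPUs the thread may still use although the IRQ is no longer affined to them."""
    return thread_cpus - irq_cpus
```

With the IRQ and thread from the trace in this report, the check would be `stale_cpus(irq_affinity(316), thread_affinity(5596))`; a non-empty result right after changing the SMP affinity would suggest the thread has not been migrated yet.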
Version-Release number of selected component (if applicable):
9.2, but this will exist in earlier versions as well.
How reproducible:
100% if an interrupt thread was originally pinned to an isolated core and an interrupt is received.
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
- causes
OCPBUGS-32031 oslat 45us spike 1h run on 4.14.20
- ASSIGNED
- external trackers
- links to
RHSA-2024:128795 kernel update