RHEL-9148

Interrupt thread not affined after interrupt reaffined

    • kernel-5.14.0-430.el9
    • None
    • Medium
    • sst_kernel_rts
    • ssg_core_kernel
    • 2
    • 6
    • 5
    • 10/30: Yellow. No fix identified at this point, unable to forecast.

      10/02: Still investigating. Red for Oct 5th.
    • False
    • None
    • None
    • None
    • If docs needed, set a value
    • x86_64
    • None

      Description of problem:

      On preempt-rt, after updating the SMP affinity of a network interrupt, the affinity of the corresponding IRQ thread is not updated immediately. The update is deferred until the next interrupt is received, at which point the thread is migrated to the proper core.

      <idle>-0 [064] d...2.. 504437.282162: sched_switch: prev_comm=swapper/64 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=oslat next_pid=2563104 next_prio=98
      <...>-2563104 [064] dn.h... 504439.963379: reschedule_entry: vector=253
      <...>-2563104 [064] dn.h... 504439.963380: reschedule_exit: vector=253
      <...>-2563104 [064] d...2.. 504439.963382: sched_switch: prev_comm=oslat prev_pid=2563104 prev_prio=98 prev_state=R ==> next_comm=irq/316-ice-eno next_pid=5596 next_prio=49
      <...>-5596 [064] dn..4.. 504439.963388: sched_wakeup: comm=migration/64 pid=607 prio=0 target_cpu=064
      <...>-5596 [064] d...2.. 504439.963389: sched_switch: prev_comm=irq/316-ice-eno prev_pid=5596 prev_prio=49 prev_state=R+ ==> next_comm=migration/64 next_pid=607 next_prio=0
      <...>-607 [064] d...2.. 504439.963395: sched_switch: prev_comm=migration/64 prev_pid=607 prev_prio=0 prev_state=S ==> next_comm=oslat next_pid=2563104 next_prio=98
      <...>-2563104 [064] ....... 504439.963410: tracing_mark_write: oslat: Trace threshold (20 us) triggered with 20 us!
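
To read the disturbance off a trace like the one above, the sched_switch events can be paired up per task. The following is a minimal sketch, not part of the original report: the function name off_cpu_windows and the regular expression are illustrative, and it assumes the default ftrace text format with six-digit microsecond timestamps and comm values that contain no spaces.

```python
import re

# Matches "<secs>.<usecs>: sched_switch: prev_comm=X ... ==> next_comm=Y".
# Assumption: comm values contain no whitespace, as in the trace above.
SWITCH = re.compile(
    r"(\d+)\.(\d{6}): sched_switch: prev_comm=(\S+) .*==> next_comm=(\S+)"
)


def off_cpu_windows(trace_lines, comm):
    """Yield (out_us, in_us, duration_us) for each time `comm` was switched out
    and later switched back in. Timestamps are kept as integer microseconds
    to avoid floating-point rounding on large uptime values."""
    out_at = None
    for line in trace_lines:
        m = SWITCH.search(line)
        if not m:
            continue  # not a sched_switch event
        now = int(m.group(1)) * 1_000_000 + int(m.group(2))
        prev_comm, next_comm = m.group(3), m.group(4)
        if prev_comm == comm:
            out_at = now  # task was switched out
        elif next_comm == comm and out_at is not None:
            yield out_at, now, now - out_at  # task got the CPU back
            out_at = None
```

Called as list(off_cpu_windows(lines, "oslat")) on the trace above, this yields one window covering the time spent in irq/316-ice-eno and migration/64 before oslat resumes.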

      This is problematic if the interrupt was originally affined to an isolated core, as the migration will cause a latency spike on that core. In a real deployment this is less of an issue, as the situation heals itself after one interrupt. It is a problem for benchmark testing with oslat and similar tools, because the resulting spikes can exceed the test threshold.

      Ideally the thread would get reaffined as soon as the SMP affinity for the IRQ is updated.
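
One way to observe the stale state from user space is to compare the IRQ's affinity list with the allowed-CPU mask of its handler thread. The sketch below is an illustration under stated assumptions, not a tool from this report: the helper names are hypothetical, it assumes the thread's comm starts with "irq/<number>-" (as irq/316-ice-eno does in the trace), and it reads /proc/irq/<number>/smp_affinity_list, so it needs to run on the affected Linux host.

```python
import os
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0-3,64' into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def find_irq_threads(irq, proc=Path("/proc")):
    """Return PIDs of tasks whose comm starts with 'irq/<irq>-'."""
    pids = []
    for entry in proc.iterdir():
        if not entry.name.isdigit():
            continue
        try:
            comm = (entry / "comm").read_text().strip()
        except OSError:
            continue  # task exited while we were scanning
        if comm.startswith(f"irq/{irq}-"):
            pids.append(int(entry.name))
    return pids


def stale_irq_threads(irq):
    """List (pid, thread_cpus, irq_cpus) for handler threads that are still
    allowed to run on CPUs outside the IRQ's configured affinity."""
    irq_cpus = parse_cpu_list(
        Path(f"/proc/irq/{irq}/smp_affinity_list").read_text()
    )
    stale = []
    for pid in find_irq_threads(irq):
        thread_cpus = os.sched_getaffinity(pid)
        if not thread_cpus <= irq_cpus:
            stale.append((pid, thread_cpus, irq_cpus))
    return stale
```

Running stale_irq_threads(316) right after changing the IRQ affinity, and again after the next interrupt has fired, should show whether the thread mask lags behind the IRQ mask in the way described here.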

      Version-Release number of selected component (if applicable):
      9.2 but this will exist in earlier versions as well

      How reproducible:
      100% if an interrupt thread was originally pinned to an isolated core and an interrupt is received.

      Steps to Reproduce:
      1.
      2.
      3.

      Actual results:

      Expected results:

      Additional info:

            crystalwood Crystal Wood
            browsell@redhat.com Brent Rowsell
            Brent Rowsell
            Crystal Wood Crystal Wood
            Qiao Zhao Qiao Zhao