OCPBUGS-26400

tuned: tuned breaks dynamic IRQ affinity


    • Important
    • No
    • Rejected
    • False
    • Previously, the `tuned` and `irqbalance` daemons both modified the Interrupt Request (IRQ) CPU affinity configuration. As a consequence, conflicts in the IRQ CPU affinity configuration might cause unexpected behaviour after a {sno} node restart. With this release, only the `irqbalance` daemon determines the IRQ CPU affinity configuration. (link:https://issues.redhat.com/browse/OCPBUGS-26400[*OCPBUGS-26400*])
    • Bug Fix
    • Done
    • 2024-03-12: NTO 4.16 PR under review; the 4.15 PR is also ready to be merged once 4.16 passes validation.
      2024-03-05: https://issues.redhat.com/browse/RHEL-21923 is in , upstream NTO PR is ready and on track (once merged, it will be backported to 4.15 and 4.14).

      Description of problem:

      If GloballyDisableIrqLoadBalancing is disabled in the performance profile, IRQs should be balanced across all CPUs except those that CRI-O explicitly removes via the pod annotation irq-load-balancing.crio.io: "disable".
      
      However, the scheduler plugin in tuned attempts to affine all IRQs to the non-isolated cores. ("Isolated" here means non-reserved, not truly isolated cores.) This is directly at odds with the user's intent, so tuned ends up fighting CRI-O/irqbalance, with each side trying to apply a different affinity.
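
      The conflict can be illustrated with a small set computation. This is only a sketch of the behaviour described above, not tuned or NTO code: the function names `expected_irq_cpus` and `tuned_irq_cpus` and the 8-CPU layout are hypothetical, chosen purely for illustration.

      ```python
      def expected_irq_cpus(all_cpus, annotated_pod_cpus):
          """CPUs that should receive IRQs: every CPU except those pinned to
          pods annotated with irq-load-balancing.crio.io: "disable"."""
          return set(all_cpus) - set(annotated_pod_cpus)


      def tuned_irq_cpus(reserved_cpus):
          """CPUs the tuned scheduler plugin targets according to the report:
          only the non-isolated (that is, reserved) cores."""
          return set(reserved_cpus)


      # Hypothetical node: 8 CPUs, CPUs 0-1 reserved, one guaranteed pod on CPUs 4-5.
      all_cpus = range(8)
      reserved = {0, 1}
      pod_cpus = {4, 5}

      wanted = expected_irq_cpus(all_cpus, pod_cpus)  # what the user intends
      forced = tuned_irq_cpus(reserved)               # what tuned tries to apply
      lost = wanted - forced                          # CPUs that wrongly stop taking IRQs

      print("wanted:", sorted(wanted))
      print("forced:", sorted(forced))
      print("lost:  ", sorted(lost))
      ```

      In this made-up layout the two daemons disagree on CPUs 2, 3, 6 and 7, so whichever one writes the affinity last wins.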
      
      Scenarios
      - If a pod is launched with the annotation after tuned has started, either at runtime or after a reboot - ok
      - After a reboot, if tuned recovers after the guaranteed pod has been launched - broken
      - If tuned restarts at runtime for any reason - broken
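
      One way to observe which of these scenarios a node is in is to snapshot the per-IRQ affinity before and after a tuned restart and compare the results. The sketch below assumes the standard Linux `/proc/irq/<n>/smp_affinity_list` files; the helper names are hypothetical, and the parser handles only plain `a-b,c` lists, not strided ranges.

      ```python
      from pathlib import Path


      def parse_cpu_list(text):
          """Parse a kernel cpulist string such as "0-3,8" into a set of ints."""
          cpus = set()
          for part in text.strip().split(","):
              if not part:
                  continue
              lo, _, hi = part.partition("-")
              cpus.update(range(int(lo), int(hi or lo) + 1))
          return cpus


      def irq_affinities(proc_irq="/proc/irq"):
          """Map each IRQ number to the set of CPUs in its smp_affinity_list."""
          result = {}
          for entry in Path(proc_irq).glob("[0-9]*/smp_affinity_list"):
              try:
                  result[int(entry.parent.name)] = parse_cpu_list(entry.read_text())
              except (OSError, ValueError):
                  continue  # IRQ vanished, unreadable, or in an unhandled format
          return result


      def cpus_receiving_irqs(affinities):
          """Union of all CPUs that at least one IRQ may be delivered to."""
          cpus = set()
          for irq_cpus in affinities.values():
              cpus |= irq_cpus
          return cpus


      if __name__ == "__main__":
          # Run once before and once after restarting tuned, then compare.
          print(sorted(cpus_receiving_irqs(irq_affinities())))
      ```

      If the printed set collapses to the reserved CPUs after tuned restarts, that would be consistent with the broken scenarios listed above.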

      Version-Release number of selected component (if applicable):

         4.14 and likely earlier

      How reproducible:

          See description

      Steps to Reproduce:

          1. See description


            yquinn@redhat.com Yanir Quinn
            browsell@redhat.com Brent Rowsell
            Shereen Haj