OpenShift Bugs / OCPBUGS-15102

All burstable pods run with the reserved cpu affinity mask when PerformanceProfile is applied


    • Critical
    • No
    • CNF Compute Sprint 238, CNF Compute Sprint 239, CNF Compute Sprint 240, CNF Compute Sprint 241, CNF Compute Sprint 242, CNF Compute Sprint 243
    • 6
    • Approved
    • False
      9/26: crun/runc patches actively worked on by node team - new PR posted
      9/5: 9.3 kernel API change merged, waiting on QE to allow the 9.2 backport process to start
      8/21: pending on fix for RHELPLAN-161539, which has RHEL 9 merge requests up; Green
      7/25: workaround being used successfully, QE is unblocked & TCFP; full solution pending RHELPLAN-161539

      Description of problem:

      All burstable pods run on the CPUs specified as reserved in the PerformanceProfile. This is caused by the *systemd.cpu_affinity=<reserved>* kernel argument, which is applied via the PerformanceProfile.
      
      This is problematic because the reserved set can be as small as 1 to 4 CPUs, while the node capacity allows dozens of pods to be crammed onto them, not to mention the infrastructure components.
      
      All pods should really run in the "isolated" space.
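
The mechanism described above relies on CPU affinity being inherited by child processes on Linux: a mask set on a parent stays in effect for everything it spawns unless something resets it. The following is a minimal sketch of that inheritance only, not of systemd or the container runtime. The function name is hypothetical, and it assumes a Linux host with a Python interpreter.

```python
import os
import subprocess
import sys


def child_affinity_with_parent_pinned(cpu):
    """Pin this process to one CPU, spawn a child, return the child's affinity."""
    original = os.sched_getaffinity(0)  # remember the mask so it can be restored
    try:
        os.sched_setaffinity(0, {cpu})
        # The child is a fresh interpreter that only prints its own affinity.
        out = subprocess.run(
            [sys.executable, "-c",
             "import os; print(sorted(os.sched_getaffinity(0)))"],
            capture_output=True, text=True, check=True,
        ).stdout
    finally:
        os.sched_setaffinity(0, original)  # undo the pinning in the parent
    return out.strip()


if __name__ == "__main__":
    first_cpu = min(os.sched_getaffinity(0))
    print(child_affinity_with_parent_pinned(first_cpu))
```

The child never calls `sched_setaffinity` itself, yet it reports only the CPU the parent was pinned to. This is analogous to how processes started under a restricted systemd affinity would keep that mask.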

      Version-Release number of selected component (if applicable):

      4.14 CI builds of OCP as of today and yesterday (2023-06-15+) at least.

      How reproducible:

      Always

      Steps to Reproduce:

      1. oc debug node/<worker>
      2. Find any burstable container process and run `taskset -pc <pid>`
      3. Observe that the CPU affinity contains all CPUs on the node
      
      4. Add a kernel argument systemd.cpu_affinity=0 (example MachineConfig is attached)
      5. oc debug node/<worker>
      6. Find any burstable container process again and run `taskset -pc <pid>`
      7. Observe the cpu affinity again
      

      Actual results:

      The affinity in step 7 matches the kernel argument value from step 4.

      Expected results:

      The affinity in step 7 matches the affinity from steps 2-3.

      Additional info:

      The cgroup of the burstable pod is set up properly and contains all CPUs (/sys/fs/cgroup/cpuset/kubepods.slice/.../cpuset.cpus) as expected.
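
The symptom here is a mismatch: the cpuset allows all CPUs, but the process affinity is narrower. A minimal sketch of that comparison follows, assuming the caller supplies the contents of `/proc/<pid>/status` and of the pod's `cpuset.cpus` file. The function names are hypothetical, and the cgroup path is left to the caller because it is elided above.

```python
def parse_cpu_list(text):
    """Parse a kernel CPU list such as '0-3,8' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue                      # tolerate an empty list
        lo, _, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def allowed_cpus_from_status(status_text):
    """Extract Cpus_allowed_list from the contents of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return parse_cpu_list(line.split(":", 1)[1])
    raise ValueError("Cpus_allowed_list not found")


def affinity_narrower_than_cpuset(status_text, cpuset_text):
    """True when the process may use strictly fewer CPUs than its cpuset allows."""
    return allowed_cpus_from_status(status_text) < parse_cpu_list(cpuset_text)
```

A `True` result for a burstable container would correspond to the behavior reported in this issue. A `False` result means the affinity is not a strict subset of the cpuset.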

              msivak@redhat.com Martin Sivak
              msivak@redhat.com Martin Sivak
              Mallapadi Niranjan
              Peter Hunt