- Bug
- Resolution: Done-Errata
- Critical
- 4.13.0, 4.14.0
Description of problem:
All burstable pods run on the CPUs specified as reserved in the PerformanceProfile. The cause is the *systemd.cpu_affinity=<reserved>* kernel argument, which is applied via the PerformanceProfile. This is problematic because the reserved set can be as small as 1-4 CPUs, while the node capacity allows many dozens of pods to be crammed onto them, not to mention the infrastructure components. All pods should really run in the "isolated" space.
Version-Release number of selected component (if applicable):
4.14 CI builds of OCP as of today and yesterday (2023-06-15+) at least.
How reproducible:
Always
Steps to Reproduce:
1. oc debug node/<worker>
2. Find any burstable container process and run `taskset -pc <pid>`
3. Observe that the CPU affinity contains all CPUs on the node
4. Add a kernel argument systemd.cpu_affinity=0 (example MachineConfig is attached)
5. oc debug node/<worker>
6. Find any burstable container process again and run `taskset -pc <pid>`
7. Observe the CPU affinity again
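The comparison between steps 3 and 7 can be made mechanical. The following is a minimal sketch, not part of the original report: it assumes `taskset -pc <pid>` prints a line of the form `pid N's current affinity list: <cpus>`, and the helper names (`parse_cpu_list`, `affinity_from_taskset`) and the sample output lines for an 8-CPU node are made up for illustration.

```python
import re


def parse_cpu_list(text):
    """Expand a kernel-style CPU list such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def affinity_from_taskset(output):
    """Extract the CPU set from one line of `taskset -pc <pid>` output."""
    match = re.search(r"affinity list:\s*(\S+)", output)
    if match is None:
        raise ValueError("unrecognized taskset output: %r" % output)
    return parse_cpu_list(match.group(1))


# Hypothetical sample lines: step 3 (before) and step 7 (after) on an 8-CPU node.
before = affinity_from_taskset("pid 4242's current affinity list: 0-7")
after = affinity_from_taskset("pid 4242's current affinity list: 0")

# CPUs the process could use before the kernel argument but not after.
lost = sorted(before - after)
print("CPUs lost after adding the kernel argument:", lost)
```

An empty `lost` list would correspond to the expected result; a non-empty one corresponds to the behavior reported here.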
Actual results:
In step 7, the affinity matches the kernel argument value from step 4.
Expected results:
In step 7, the affinity matches the affinity from steps 2-3.
Additional info:
The cgroup of the burstable pod is set up properly and contains all CPUs (/sys/fs/cgroup/cpuset/kubepods.slice/.../cpuset.cpus), as expected.
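The mismatch between the cgroup cpuset and the effective affinity can be checked with a short script. This is a sketch under stated assumptions: it takes the contents of the pod's `cpuset.cpus` file and of `/proc/<pid>/status` as strings (the `Cpus_allowed_list` field holds the process affinity), and the function names and sample strings are hypothetical.

```python
def expand(cpu_list):
    """Turn a CPU list string like "0-2,5" into a set of CPU numbers."""
    cpus = set()
    for chunk in cpu_list.strip().split(","):
        if chunk:
            first, _, last = chunk.partition("-")
            cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def status_affinity(status_text):
    """Read the Cpus_allowed_list field from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("Cpus_allowed_list:"):
            return expand(line.split(":", 1)[1])
    raise ValueError("Cpus_allowed_list not found")


def excluded_by_affinity(cpuset_cpus, status_text):
    """CPUs the cgroup allows but the process affinity mask excludes."""
    return sorted(expand(cpuset_cpus) - status_affinity(status_text))


# Hypothetical sample data: the cgroup allows all 8 CPUs, the mask only CPU 0.
sample_cpuset = "0-7\n"
sample_status = "Name:\tsleep\nCpus_allowed_list:\t0\n"
print(excluded_by_affinity(sample_cpuset, sample_status))
```

On a node, the two inputs would be read from the files named above; a non-empty result means the cgroup is correct but the process is still pinned by its inherited affinity.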
- clones: OCPBUGS-15102 All burstable pods run with the reserved cpu affinity mask when PerformanceProfile is applied (Closed)
- is blocked by: OCPBUGS-15102 All burstable pods run with the reserved cpu affinity mask when PerformanceProfile is applied (Closed)
- links to: RHBA-2023:6846 OpenShift Container Platform 4.13.z bug fix update