- Bug
- Resolution: Unresolved
- Major
- None
- 4.18
- Important
- None
- Rejected
- False
Description of problem:
With runc, the container processes of burstable pods do not have their CPU affinity updated when a guaranteed (gu) pod is deleted. The issue does not occur with crun.
Version-Release number of selected component (if applicable):
4.18.0-0.nightly-2024-10-29-112337
How reproducible:
always
Steps to Reproduce:
1. Set up an OCP cluster.

2. Label one of the nodes as a worker-affinity-tests node:

    % oc label node ip-10-0-48-34.us-east-2.compute.internal node-role.kubernetes.io/worker-affinity-tests="" --overwrite

3. Create an MCP for the worker-affinity-tests node:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: MachineConfigPool
    metadata:
      name: worker-affinity-tests
      labels:
        machineconfiguration.openshift.io/role: worker-affinity-tests
    spec:
      machineConfigSelector:
        matchExpressions:
          - { key: machineconfiguration.openshift.io/role, operator: In, values: [worker-affinity-tests, worker], }
      paused: false
      nodeSelector:
        matchLabels:
          node-role.kubernetes.io/worker-affinity-tests: ''

4. Create a KubeletConfig to enable cpuManager:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: KubeletConfig
    metadata:
      name: set-cpu-manager
    spec:
      machineConfigPoolSelector:
        matchLabels:
          #custom-kubelet: cpumanager-enabled
          machineconfiguration.openshift.io/role: worker-affinity-tests
      kubeletConfig:
        cpuManagerPolicy: static
        cpuManagerReconcilePeriod: 6s

5. Create a ContainerRuntimeConfig to enable runc:

    apiVersion: machineconfiguration.openshift.io/v1
    kind: ContainerRuntimeConfig
    metadata:
      name: runc-ctrcfg
    spec:
      machineConfigPoolSelector:
        matchLabels:
          #pools.operator.machineconfiguration.openshift.io/worker: ""
          machineconfiguration.openshift.io/role: worker-affinity-tests
      containerRuntimeConfig:
        defaultRuntime: runc

6. Check the CPU affinity of a burstable pod (it should include all online CPUs):

    % oc get pod tuned-vv5h7 -o yaml -n openshift-cluster-node-tuning-operator | grep -i qos
      qosClass: Burstable
    % oc get pod tuned-vv5h7 -o yaml -n openshift-cluster-node-tuning-operator | grep -i containerID
      - containerID: cri-o://417864428a3fbaa2e754407d44e4c9b8480ed9cfba3842dff988a7777ec8c477
    sh-5.1# crictl inspect 417864428a3fbaa2e754407d44e4c9b8480ed9cfba3842dff988a7777ec8c477 | grep -i pid
        "pid": 2457,
        "pids": {
    sh-5.1# taskset -pc 2457
    pid 2457's current affinity list: 0-7

7. Create a gu pod and check that the CPU affinity of this gu pod is as expected:

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        app: rhel-ubi
      name: rhel-ubi-gu
    spec:
      restartPolicy: Always
      containers:
      - name: rhel-ubi
        stdin: true
        tty: true
        image: registry.access.redhat.com/ubi7/ubi:latest
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 2
            memory: 200Mi
          requests:
            cpu: 2
            memory: 200Mi
      nodeName: ip-10-0-48-34.us-east-2.compute.internal

    sh-5.1# taskset -pc 10754
    pid 10754's current affinity list: 1,5

8. Check that the CPU affinity of the burstable pod was updated:

    sh-5.1# taskset -pc 2548
    pid 2548's current affinity list: 0,2-4,6,7

9. Delete the gu pod and check the CPU affinity of the burstable pod:

    sh-5.1# taskset -pc 10754 // gu pod is deleted
    taskset: failed to set pid 10754's affinity: No such process
    sh-5.1# taskset -pc 2548
    pid 2548's current affinity list: 0,2-4,6,7
    sh-5.1# cat /proc/2548/cgroup
    0::/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod617f9ce0_83ab_4af0_9454_f89b2b85c4cf.slice/crio-bc177d417010faf65d4c811b174b8d00fe4a7968b377f425ebc765dcb8fe461c.scope
    sh-5.1# cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod617f9ce0_83ab_4af0_9454_f89b2b85c4cf.slice/crio-bc177d417010faf65d4c811b174b8d00fe4a7968b377f425ebc765dcb8fe461c.scope/cpuset.cpus
    0,2-4,6-7
    sh-5.1# cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod617f9ce0_83ab_4af0_9454_f89b2b85c4cf.slice/crio-bc177d417010faf65d4c811b174b8d00fe4a7968b377f425ebc765dcb8fe461c.scope/cpuset.cpus.effective
    0,2-4,6-7
Actual results:
9. The CPU affinity of the burstable pod was not updated after the gu pod was deleted; it stayed at 0,2-4,6,7.
Expected results:
9. The CPU affinity of the burstable pod should be updated back to all online CPUs: 0-7
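The manual `taskset -pc` comparison in steps 6, 8 and 9 could be scripted. The following is only a minimal sketch, assuming Python 3 is available on the Linux node; the helper names `parse_cpu_list` and `affinity_matches` are hypothetical and not part of any OpenShift tooling. It parses a CPU list in the format printed by `taskset -pc` and `cpuset.cpus` (for example `0,2-4,6,7`) and compares it with the affinity returned by the standard-library call `os.sched_getaffinity`.

```python
import os


def parse_cpu_list(text):
    """Parse a CPU list such as "0,2-4,6-7" into a set of CPU numbers."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def affinity_matches(pid, expected):
    """Return True if the CPU affinity of `pid` equals the `expected` CPU list.

    `pid` 0 means the calling process. Hypothetical helper for illustration;
    it reads the same affinity mask that `taskset -pc <pid>` reports.
    """
    return os.sched_getaffinity(pid) == parse_cpu_list(expected)
```

With the PIDs from the steps above, `affinity_matches(2548, "0-7")` would be expected to return `True` after the gu pod is deleted; on the affected runc node it would return `False`, because the affinity stays at `0,2-4,6,7`.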
Additional info:
After step 9, I created a deployment that includes a gu pod; the CPU affinity of the burstable pod was then updated correctly.
- is related to
- RHEL-57709 ovs-vswitchd process affinity doesn't get changed back to it's original affinity when deployment running guaranteed pods is deleted
- New