  1. OpenShift Bugs
  2. OCPBUGS-43995

CPU affinity of Burstable pods is not updated after deleting a Guaranteed pod with runc


    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • 4.18
    • Node / CRI-O
    • Important
    • None
    • Rejected
    • False

      Description of problem:

      With runc, the container processes of Burstable pods do not have their CPU affinity updated when a Guaranteed (gu) pod is deleted.

      This issue does not occur with crun.

      Version-Release number of selected component (if applicable):

      4.18.0-0.nightly-2024-10-29-112337 

      How reproducible:

      always

      Steps to Reproduce:

          1. Set up an OCP cluster.
          
          2. Label one of the nodes as a worker-affinity-tests node:
      % oc label node ip-10-0-48-34.us-east-2.compute.internal node-role.kubernetes.io/worker-affinity-tests="" --overwrite   
        
          3. Create a MachineConfigPool (MCP) for the worker-affinity-tests node:
      apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfigPool
      metadata:
        name: worker-affinity-tests
        labels:
          machineconfiguration.openshift.io/role: worker-affinity-tests
      spec:
        machineConfigSelector:
          matchExpressions:
            - {
                key: machineconfiguration.openshift.io/role,
                operator: In,
                values: [worker-affinity-tests, worker],
              }
        paused: false
        nodeSelector:
          matchLabels:
            node-role.kubernetes.io/worker-affinity-tests: ''
      
          4. Create a KubeletConfig to enable the CPU manager:
      apiVersion: machineconfiguration.openshift.io/v1
      kind: KubeletConfig
      metadata:
        name: set-cpu-manager
      spec:
        machineConfigPoolSelector:
          matchLabels:
            #custom-kubelet: cpumanager-enabled
            machineconfiguration.openshift.io/role: worker-affinity-tests
        kubeletConfig:
          cpuManagerPolicy: static
          cpuManagerReconcilePeriod: 6s
      
          5. Create a ContainerRuntimeConfig to set runc as the default runtime:
      apiVersion: machineconfiguration.openshift.io/v1
      kind: ContainerRuntimeConfig
      metadata:
        name: runc-ctrcfg
      spec:
        machineConfigPoolSelector:
          matchLabels:
            #pools.operator.machineconfiguration.openshift.io/worker: ""
            machineconfiguration.openshift.io/role: worker-affinity-tests
        containerRuntimeConfig:
          defaultRuntime: runc
      
          6. Check the CPU affinity of a Burstable pod (it should include all online CPUs):
      % oc get pod tuned-vv5h7 -o yaml -n openshift-cluster-node-tuning-operator | grep -i qos
        qosClass: Burstable

      % oc get pod tuned-vv5h7 -o yaml -n openshift-cluster-node-tuning-operator | grep -i containerID
        - containerID: cri-o://417864428a3fbaa2e754407d44e4c9b8480ed9cfba3842dff988a7777ec8c477

      sh-5.1# crictl inspect 417864428a3fbaa2e754407d44e4c9b8480ed9cfba3842dff988a7777ec8c477 | grep -i pid
          "pid": 2457,
          "pids": {

      sh-5.1# taskset -pc 2457
      pid 2457's current affinity list: 0-7
      
          7. Create a gu pod and check that its CPU affinity is as expected:
      apiVersion: v1
      kind: Pod
      metadata:
        labels:
          app: rhel-ubi
        name: rhel-ubi-gu
      spec:
        restartPolicy: Always
        containers:
        - name: rhel-ubi
          stdin: true
          tty: true
          image: registry.access.redhat.com/ubi7/ubi:latest
          imagePullPolicy: Always
          resources:
           limits:
             cpu: 2
             memory: 200Mi
           requests:
             cpu: 2
             memory: 200Mi
        nodeName: ip-10-0-48-34.us-east-2.compute.internal
      
      sh-5.1# taskset -pc 10754
      pid 10754's current affinity list: 1,5
      
          8. Check that the CPU affinity of the Burstable pod was updated:
      sh-5.1# taskset -pc 2548
      pid 2548's current affinity list: 0,2-4,6,7
      
          9. Delete the gu pod and check the CPU affinity of the Burstable pod:
      sh-5.1# taskset -pc 10754 // gu pod is deleted
      taskset: failed to set pid 10754's affinity: No such process
      
      sh-5.1# taskset -pc 2548
      pid 2548's current affinity list: 0,2-4,6,7
      
      sh-5.1# cat /proc/2548/cgroup 
      0::/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod617f9ce0_83ab_4af0_9454_f89b2b85c4cf.slice/crio-bc177d417010faf65d4c811b174b8d00fe4a7968b377f425ebc765dcb8fe461c.scope
      
      sh-5.1# cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod617f9ce0_83ab_4af0_9454_f89b2b85c4cf.slice/crio-bc177d417010faf65d4c811b174b8d00fe4a7968b377f425ebc765dcb8fe461c.scope/cpuset.cpus
      0,2-4,6-7
      
      sh-5.1# cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod617f9ce0_83ab_4af0_9454_f89b2b85c4cf.slice/crio-bc177d417010faf65d4c811b174b8d00fe4a7968b377f425ebc765dcb8fe461c.scope/cpuset.cpus.effective
      0,2-4,6-7
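
      The affinity lists in steps 6 to 9 are consistent with the shared pool being the online CPUs minus the CPUs held exclusively by Guaranteed pods. A minimal shell sketch of that arithmetic follows. The function names expand_cpulist and shared_pool are made up for illustration and are not part of any OpenShift or kubelet tooling.

```shell
# Expand a Linux CPU list such as "0,2-4,6-7" to one CPU number per line.
expand_cpulist() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        [ -n "$lo" ] || continue          # skip empty fields (empty list)
        seq "$lo" "${hi:-$lo}"            # a single CPU is a range of length 1
    done
}

# Print the CPUs in the first list that are not in the second list,
# space separated. Illustrative only: this mirrors the values seen in
# the report, it does not query the kubelet.
shared_pool() {
    online=$1
    exclusive=$2
    expand_cpulist "$online" | while read -r cpu; do
        if ! expand_cpulist "$exclusive" | grep -qx "$cpu"; then
            echo "$cpu"
        fi
    done | paste -sd' ' -
}

shared_pool 0-7 1,5    # step 8, gu pod holds CPUs 1 and 5: 0 2 3 4 6 7
shared_pool 0-7 ''     # step 9, gu pod deleted:            0 1 2 3 4 5 6 7
```

      With the values from this report, the first call reproduces the list seen in step 8, and the second call gives the list that step 9 was expected to show.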
      

      Actual results:

          9. The CPU affinity of the Burstable pod was not updated; it stayed at 0,2-4,6,7.

      Expected results:

          9. The CPU affinity of the Burstable pod should be updated to 0-7.
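
      Note that taskset and the cgroup files format the same CPU set differently ("6,7" versus "6-7"), so a plain string comparison can be misleading. The sketch below normalizes both sides before comparing. The helper names are hypothetical; the only external facts assumed are the "pid N's current affinity list: ..." output format shown in this report and the standard /sys/devices/system/cpu/online file.

```shell
# Expand a CPU list to one CPU per line, sorted and de-duplicated.
normalize_cpulist() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r lo hi; do
        [ -n "$lo" ] || continue
        seq "$lo" "${hi:-$lo}"
    done | sort -n -u
}

# Extract the list from a line such as
# "pid 2548's current affinity list: 0,2-4,6,7".
affinity_from_taskset() {
    printf '%s\n' "$1" | sed -n 's/.*affinity list: *//p'
}

# Succeed when the affinity in a taskset output line equals the expected
# CPU list as a set. Returns 2 if the line could not be parsed.
affinity_matches() {
    actual=$(affinity_from_taskset "$1")
    [ -n "$actual" ] || return 2
    [ "$(normalize_cpulist "$actual")" = "$(normalize_cpulist "$2")" ]
}

# On the node, the expected result of step 9 could be checked like this
# (not run here, 2548 is the PID from this report):
#   affinity_matches "$(taskset -pc 2548)" "$(cat /sys/devices/system/cpu/online)"
```

      Applied to the output in step 9, the check fails against 0-7 but succeeds against the cgroup value 0,2-4,6-7, which matches the observation that the cpuset itself was never widened.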

      Additional info:

          After step 9, I created a deployment that includes a gu pod, and the CPU affinity of the Burstable pod was then updated correctly.
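
          Because the KubeletConfig above sets cpuManagerReconcilePeriod to 6s, it can help to poll for a few reconcile periods before concluding that the affinity is stuck. The following is only a sketch of such a polling loop; wait_until and affinity_is are hypothetical helper names, and the attempt count and delay are arbitrary choices.

```shell
# Run a command repeatedly until it succeeds or the attempts run out.
# Usage: wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
wait_until() {
    attempts=$1
    delay=$2
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Succeed when taskset reports exactly the given list for a PID.
# This is a plain string comparison, so pass the list in the form
# taskset prints it (for example "0-7").
affinity_is() {
    [ "$(taskset -pc "$1" 2>/dev/null | sed -n 's/.*affinity list: *//p')" = "$2" ]
}

# On the node, after deleting the gu pod (not run here):
#   wait_until 5 6 affinity_is 2548 0-7 || echo "affinity still not updated"
```

          In the scenario reported here the loop would be expected to time out with runc, since the affinity stayed at 0,2-4,6,7 until another gu pod was created.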

              pehunt@redhat.com Peter Hunt
              rhn-support-minmli Min Li
              Min Li Min Li