Type: Bug
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: 4.18
Component/s: Node / CRI-O
Labels: triaged

Severity: Important
Regression: None
Release Blocker: Rejected
Blocked: False
Blocked Reason: None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Description of problem:

With runc, the container processes of Burstable pods do not have their CPU affinity updated when a Guaranteed (gu) pod is deleted.

This issue does not occur with crun.

Version-Release number of selected component (if applicable):

4.18.0-0.nightly-2024-10-29-112337

How reproducible:

always

Steps to Reproduce:

    1. Set up an OCP cluster.

    2. Label one of the nodes as a worker-affinity-tests node:
% oc label node ip-10-0-48-34.us-east-2.compute.internal node-role.kubernetes.io/worker-affinity-tests="" --overwrite   
  
    3. Create a MachineConfigPool (mcp) for the worker-affinity-tests node:
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: worker-affinity-tests
  labels:
    machineconfiguration.openshift.io/role: worker-affinity-tests
spec:
  machineConfigSelector:
    matchExpressions:
      - key: machineconfiguration.openshift.io/role
        operator: In
        values: [worker-affinity-tests, worker]
  paused: false
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/worker-affinity-tests: ''

    4. Create a KubeletConfig to enable the static CPU Manager policy:
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: set-cpu-manager
spec:
  machineConfigPoolSelector:
    matchLabels:
      #custom-kubelet: cpumanager-enabled
      machineconfiguration.openshift.io/role: worker-affinity-tests
  kubeletConfig:
    cpuManagerPolicy: static
    cpuManagerReconcilePeriod: 6s

    5. Create a ContainerRuntimeConfig to make runc the default runtime:
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: runc-ctrcfg
spec:
  machineConfigPoolSelector:
    matchLabels:
      #pools.operator.machineconfiguration.openshift.io/worker: ""
      machineconfiguration.openshift.io/role: worker-affinity-tests
  containerRuntimeConfig:
    defaultRuntime: runc

    6. Check the CPU affinity of a Burstable pod; it should include all online CPUs:
% oc get pod tuned-vv5h7 -o yaml -n openshift-cluster-node-tuning-operator | grep -i qos
  qosClass: Burstable

% oc get pod tuned-vv5h7 -o yaml -n openshift-cluster-node-tuning-operator | grep -i containerID
  - containerID: cri-o://417864428a3fbaa2e754407d44e4c9b8480ed9cfba3842dff988a7777ec8c477

sh-5.1# crictl inspect 417864428a3fbaa2e754407d44e4c9b8480ed9cfba3842dff988a7777ec8c477 | grep -i pid
    "pid": 2457,
          "pids": {

sh-5.1# taskset -pc 2457
pid 2457's current affinity list: 0-7
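
The manual lookups in step 6 can be scripted. The following is a minimal sketch, not part of the original report: the function names are hypothetical, and it assumes that `taskset -pc` prints its result in the format shown above.

```shell
#!/bin/sh
# Hypothetical helpers for scripting the manual lookups in step 6.

# bare_container_id: strip the runtime prefix from a pod status containerID,
# e.g. "cri-o://4178..." -> "4178...", the form passed to crictl above.
bare_container_id() {
    printf '%s\n' "${1#*://}"
}

# parse_taskset_affinity: read `taskset -pc PID` output on stdin and print
# only the CPU list, e.g. "0-7". Prints nothing if the line does not match.
parse_taskset_affinity() {
    sed -n "s/^pid [0-9]*'s current affinity list: //p"
}

# pid_affinity PID: affinity list of a running process (run on the node).
pid_affinity() {
    taskset -pc "$1" | parse_taskset_affinity
}
```

On the node, `pid_affinity 2457` would then print only the list (`0-7` in the run above), which is easier to compare across steps than the full taskset message.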

    7. Create a Guaranteed (gu) pod and check that its CPU affinity is as expected:
apiVersion: v1
kind: Pod
metadata:
  labels:
    app: rhel-ubi
  name: rhel-ubi-gu
spec:
  restartPolicy: Always
  containers:
  - name: rhel-ubi
    stdin: true
    tty: true
    image: registry.access.redhat.com/ubi7/ubi:latest
    imagePullPolicy: Always
    resources:
     limits:
       cpu: 2
       memory: 200Mi
     requests:
       cpu: 2
       memory: 200Mi
  nodeName: ip-10-0-48-34.us-east-2.compute.internal

sh-5.1# taskset -pc 10754
pid 10754's current affinity list: 1,5

    8. Check that the CPU affinity of the Burstable pod was updated:
sh-5.1# taskset -pc 2548
pid 2548's current affinity list: 0,2-4,6,7
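
The affinity seen in step 8 is the set of online CPUs minus the CPUs pinned to the gu pod. The sketch below computes that set so the expected value can be derived rather than read by eye. The helper names `expand_cpu_list` and `shared_pool` are hypothetical, and the code only handles the comma-and-range list syntax that appears in this report.

```shell
#!/bin/sh
# expand_cpu_list LIST: print one CPU id per line for a list such as
# "0,2-4,6-7". Hypothetical helper; handles only "N" and "N-M" items.
expand_cpu_list() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS=- read -r first last; do
        [ -n "$first" ] || continue
        i=$first
        # A single CPU ("6") has no upper bound; treat it as "6-6".
        while [ "$i" -le "${last:-$first}" ]; do
            printf '%s\n' "$i"
            i=$((i + 1))
        done
    done
}

# shared_pool ONLINE EXCLUSIVE: print the CPUs in ONLINE that are not in
# EXCLUSIVE, space separated. EXCLUSIVE may be empty.
shared_pool() {
    expand_cpu_list "$1" | while read -r cpu; do
        if ! expand_cpu_list "$2" | grep -qx "$cpu"; then
            printf '%s\n' "$cpu"
        fi
    done | tr '\n' ' ' | sed 's/ $//'
}
```

With the values from this report, `shared_pool 0-7 1,5` prints `0 2 3 4 6 7`, which matches the `0,2-4,6,7` reported by taskset. Once the gu pod is gone the exclusive set is empty, so `shared_pool 0-7 ''` yields all eight CPUs again.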

    9. Delete the gu pod and check the CPU affinity of the Burstable pod:
sh-5.1# taskset -pc 10754    # gu pod is deleted
taskset: failed to set pid 10754's affinity: No such process

sh-5.1# taskset -pc 2548
pid 2548's current affinity list: 0,2-4,6,7

sh-5.1# cat /proc/2548/cgroup 
0::/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod617f9ce0_83ab_4af0_9454_f89b2b85c4cf.slice/crio-bc177d417010faf65d4c811b174b8d00fe4a7968b377f425ebc765dcb8fe461c.scope

sh-5.1# cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod617f9ce0_83ab_4af0_9454_f89b2b85c4cf.slice/crio-bc177d417010faf65d4c811b174b8d00fe4a7968b377f425ebc765dcb8fe461c.scope/cpuset.cpus
0,2-4,6-7

sh-5.1# cat /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod617f9ce0_83ab_4af0_9454_f89b2b85c4cf.slice/crio-bc177d417010faf65d4c811b174b8d00fe4a7968b377f425ebc765dcb8fe461c.scope/cpuset.cpus.effective
0,2-4,6-7
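
The cgroup lookup above can be done without copying the long path by hand. This is a rough sketch with hypothetical function names; it assumes cgroup v2 (a single `0::<path>` line in `/proc/PID/cgroup`) mounted at `/sys/fs/cgroup`, as in the output above.

```shell
#!/bin/sh
# cgroup_v2_dir_from_proc: read /proc/PID/cgroup content on stdin and print
# the matching directory under /sys/fs/cgroup.
# Assumes cgroup v2: one "0::<path>" line, hierarchy mounted at /sys/fs/cgroup.
cgroup_v2_dir_from_proc() {
    sed -n 's|^0::|/sys/fs/cgroup|p'
}

# effective_cpus PID: CPUs the kernel currently allows for PID's cgroup
# (run on the node).
effective_cpus() {
    cat "$(cgroup_v2_dir_from_proc < "/proc/$1/cgroup")/cpuset.cpus.effective"
}
```

On the node, `effective_cpus 2548` would print the same value as the last command above. Note that taskset prints this set as `0,2-4,6,7` while the cpuset file prints `0,2-4,6-7`; both denote the same CPUs, so compare them as sets rather than as strings.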

Actual results:

    9. The CPU affinity of the Burstable pod was not updated after the gu pod was deleted; it stayed at 0,2-4,6,7.

Expected results:

    9. The CPU affinity of the Burstable pod should be updated back to 0-7.
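
Because the KubeletConfig in step 4 sets cpuManagerReconcilePeriod to 6s, a single check right after the deletion could run before the kubelet has reconciled. A possible way to avoid a false failure is to poll for a few reconcile periods before concluding the affinity was not restored. The `wait_for_value` helper below is a hypothetical sketch, not something the reporter ran.

```shell
#!/bin/sh
# wait_for_value EXPECTED TIMEOUT_S INTERVAL_S COMMAND [ARGS...]
# Re-run COMMAND until its output equals EXPECTED or TIMEOUT_S seconds have
# elapsed. Returns 0 on match, 1 on timeout. Hypothetical helper; POSIX sh
# has no local variables, so these names are visible to the caller.
wait_for_value() {
    expected=$1
    timeout=$2
    interval=$3
    shift 3
    elapsed=0
    while :; do
        # A failing COMMAND is treated as "no value yet".
        actual=$("$@" 2>/dev/null) || actual=
        if [ "$actual" = "$expected" ]; then
            return 0
        fi
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out: expected '$expected', last saw '$actual'" >&2
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}
```

For example, `wait_for_value 0-7 30 6 sh -c "taskset -pc 2548 | sed -n 's/.*affinity list: //p'"` would poll every 6 seconds for up to 30 seconds; the 30-second budget is an arbitrary choice of five reconcile periods.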

Additional info:

    After step 9, I created a deployment that includes a gu pod; the CPU affinity of the Burstable pod was then updated correctly.

is related to

RHEL-57709 ovs-vswitchd process affinity doesn't get changed back to it's original affinity when deployment running guaranteed pods is deleted

Assignee: Peter Hunt

Reporter: Min Li

QA Contact: Min Li

Votes: 0

Watchers: 8

Created: 2024/10/30 9:05 AM

Updated: 2024/11/20 5:56 PM
