Details
- Type: Bug
- Resolution: Done
- Affects Version: 4.12
Description
Description of problem:
I tried to deploy pods with the Guaranteed QoS class to get dedicated CPU cores for my workload. I was using a PerformanceProfile, but I made a mistake and also created an additional basic KubeletConfig (to enable an unsafe sysctl). As a result I ended up with two KubeletConfigs: one generated by the PerformanceProfile and one I created manually.
When I requested Guaranteed QoS pods, they were reported as Guaranteed, but when looking at their CPU affinity, the pods were able to run on any CPU core. The KubeletConfig I created manually was shadowing the one generated by the PerformanceProfile, so the CPU manager policy was not set to static. I was able to solve this by adding the proper annotation to the PerformanceProfile and getting rid of the extra KubeletConfig, after which my QoS pods deployed as expected.
My concern is about what OCP was showing for my pods: they were displayed as QoS Guaranteed, but because the CPU manager policy was not set to static, they were not actually using dedicated resources. Is this expected? Should OCP instead display an error saying it is not possible to create Guaranteed QoS pods?
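To illustrate why the pods still show as Guaranteed: the QoS class is derived purely from the resource requests and limits in the pod spec, independently of whether the node's CPU manager actually pins CPUs. A minimal sketch of the classification rules (an assumption based on the documented upstream behavior for cpu/memory, not the real kubelet code):

```python
def qos_class(containers):
    """Classify a pod from its containers' resource specs.

    containers: list of dicts like {"requests": {...}, "limits": {...}}.
    Guaranteed: every container has cpu and memory limits, and requests
    (which default to limits when omitted) equal limits.
    BestEffort: no container sets any requests or limits.
    Otherwise: Burstable.
    """
    any_set = False
    guaranteed = True
    for c in containers:
        req = c.get("requests", {}) or {}
        lim = c.get("limits", {}) or {}
        if req or lim:
            any_set = True
        for resource in ("cpu", "memory"):
            # Requests omitted but limits set -> requests default to limits.
            if resource not in lim or req.get(resource, lim.get(resource)) != lim[resource]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"

# The pod below (limits only, requests defaulted) is classified Guaranteed,
# regardless of the node's CPU manager policy:
print(qos_class([{"limits": {"cpu": "2", "memory": "300Mi"}}]))
```

This is why `oc describe pod` reports Guaranteed even when the static CPU manager policy was never applied: nothing in the QoS computation consults the node configuration.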
Version-Release number of selected component (if applicable):
4.12.45
How reproducible:
Create a Performance Profile, then create a dummy KubeletConfig and try to deploy pods with Guaranteed QoS Class.
Steps to Reproduce:
1. oc create -f performance-profile.yml, then wait until your nodes are Ready
2. oc create -f kubelet-config.yml, then wait until your nodes are Ready
3. oc create -f pod-sample.yml
4. oc describe pod
Actual results:
Pods are displayed as QoS Guaranteed even though they are not pinned to dedicated CPUs
Expected results:
Pods should be described as BestEffort
Additional info:
pod-sample.yml:
apiVersion: v1
kind: Pod
metadata:
  name: gutest
  annotations:
    # Disable CFS cpu quota accounting
    cpu-quota.crio.io: "disable"
    # Disable CPU balance with CRIO
    cpu-load-balancing.crio.io: "disable"
    # Opt-out from interrupt handling
    irq-load-balancing.crio.io: "disable"
spec:
  # Map to the correct performance class
  runtimeClassName: performance-blueprint-profile
  containers:
  - name: main
    image: registry.dfwt5g.lab:4443/rh-test/ubi8-micro
    command: [ "/bin/sh", "-c", "--" ]
    args: [ "while true; do sleep 99999999; done;" ]
    resources:
      limits:
        memory: "300Mi"
        cpu: "2"
      requests:
        memory: "300Mi"
        cpu: "2"
...

performance-profile.yml:
---
kind: PerformanceProfile
apiVersion: "performance.openshift.io/v2"
metadata:
  name: blueprint-profile
spec:
  cpu:
    isolated: "1-19,21-39,41-59,61-79"
    reserved: "0,40,20,60"
  additionalKernelArgs:
    - nohz_full=1-19,21-39,41-59,61-79
  hugepages:
    pages:
      - size: "1G"
        count: 32
        node: 0
      - size: "1G"
        count: 32
        node: 1
      - size: "2M"
        count: 12000
        node: 0
      - size: "2M"
        count: 12000
        node: 1
  realTimeKernel:
    enabled: false
  workloadHints:
    realTime: false
    highPowerConsumption: false
    perPodPowerManagement: true
  net:
    userLevelNetworking: false
  numa:
    topologyPolicy: "single-numa-node"
  nodeSelector:
    node-role.kubernetes.io/worker: ""
...

kubelet-config.yml:
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: custom-kubelet
spec:
  machineConfigPoolSelector:
    matchLabels:
      custom-kubelet: sysctl
  kubeletConfig:
    allowedUnsafeSysctls:
      - "net.ipv4.tcp_tw_reuse"
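One way to detect this situation on the node is to check the CPU manager policy the kubelet actually applied, rather than trusting the pod's QoS class. A small sketch that reads the kubelet's state file (the path `/var/lib/kubelet/cpu_manager_state` and the sample JSON are assumptions based on a typical kubelet state file; on OCP you would reach it via `oc debug node/<node>`):

```python
import json

# Sample content mimicking a cpu_manager_state file when the static policy
# was NOT applied (hypothetical values, for illustration only).
SAMPLE_STATE = '{"policyName": "none", "defaultCpuSet": "", "checksum": 0}'

def cpu_manager_policy(state_json):
    """Return the CPU manager policy recorded by the kubelet.

    Expected values are "none" (the default) or "static"; only with
    "static" do Guaranteed pods with integer CPU requests get
    exclusive cores.
    """
    return json.loads(state_json).get("policyName", "none")

policy = cpu_manager_policy(SAMPLE_STATE)
if policy != "static":
    print(f"CPU manager policy is '{policy}': Guaranteed pods will NOT "
          f"get exclusive CPUs even though they report QoS Guaranteed")
```

This kind of check would have revealed that the manually created KubeletConfig reverted the policy to "none" while the pods still reported Guaranteed.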