Type: Bug
Resolution: Unresolved
Priority: Major
Affects Version: 4.20
Severity: Important
Description of problem:
This issue was found while porting OVS-DPDK to OCP Virt, and can be reproduced on a plain OCP 4.20 installation.
Requesting isolated and reserved cores with strict CPU reservation enabled leaves the worker node locked up. The only way to recover is to log on to the node and remove the /var/lib/kubelet/cpu_manager_state file.
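The manual recovery can be sketched as a small script. This is a hedged sketch of the steps described above, not a supported procedure: the checkpoint path is the one named in this report, while the `reset_cpu_manager_state` helper name and the overridable path argument are illustrative (they allow a safe dry run against a scratch file).

```shell
#!/bin/sh
# Sketch of the manual recovery described above; run as root on the affected
# node with the default path. The function name and the path override are
# illustrative additions for this report.
reset_cpu_manager_state() {
  state="${1:-/var/lib/kubelet/cpu_manager_state}"
  rm -f "$state" && echo "removed $state"
  # systemd's kubelet restart loop (see the "restart counter" journal
  # messages in this report) relaunches kubelet, which then regenerates
  # the checkpoint under the new static policy.
}

# Dry run against a scratch file so the sketch is safe to execute anywhere:
reset_cpu_manager_state "$(mktemp)"
```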
Version-Release number of selected component (if applicable):
$ oc version
Client Version: 4.13.7
Kustomize Version: v4.5.7
Server Version: 4.20.0-0.nightly-2026-03-02-014334
Kubernetes Version: v1.33.8
$ oc get deployment cluster-node-tuning-operator -n openshift-cluster-node-tuning-operator -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:e8ff6bfeed2963b7955452fe752dd15f903b7415fdf017f5c1e6d2a290b39f26
How reproducible: 100%
Steps to Reproduce:
1. Create and apply a PerformanceProfile with strict CPU reservation enabled:
$ cat pp.yaml
apiVersion: performance.openshift.io/v2
kind: PerformanceProfile
metadata:
  labels:
    machineconfiguration.openshift.io/role: worker
  name: ovs-dpdk-worker
  annotations:
    kubeletconfig.experimental: |
      {"cpuManagerPolicyOptions": {"full-pcpus-only": "true", "strict-cpu-reservation": "true"}}
spec:
  additionalKernelArgs:
    - "enforcing=0"
    - "br-phys-bind=enp2s0"
  cpu:
    isolated: "2-11,14-23"
    reserved: "0-1,12-13"
  hugepages:
    defaultHugepagesSize: "2M"
    pages:
      - size: "2M"
        count: 8192
  nodeSelector:
    node-role.kubernetes.io/worker: ''
  numa:
    # Ref.: https://kubernetes.io/docs/tasks/administer-cluster/topology-manager/
    topologyPolicy: "restricted"
$ oc apply -f pp.yaml
Actual results:
$ oc get nodes -w
...
w0   NotReady,SchedulingDisabled   worker   106m   v1.33.8
Expected results:
$ oc get nodes -w
...
w0   Ready   worker   106m   v1.33.8
Additional info:
Before applying the PerformanceProfile, some logs were gathered:
[root@w0 ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt3)/boot/ostree/rhcos-8ffde1af9e76874598344c2754660ead6f2f451919c716a81fadf450d2011ef7/vmlinuz-5.14.0-570.95.1.el9_6.x86_64 rw ostree=/ostree/boot.0/rhcos/8ffde1af9e76874598344c2754660ead6f2f451919c716a81fadf450d2011ef7/0 ignition.platform.id=metal ip=dhcp root=UUID=02d81ee3-daa3-43f7-ae88-a22c50a045e3 rw rootflags=prjquota boot=UUID=98e1c771-1117-4433-976a-61c1e58572d7 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=0 irqpoll console=tty0 console=ttyS0,115200 earlyprintk=ttyS0,115200
[root@w0 ~]# taskset -pc 1
pid 1's current affinity list: 0-23
[root@w0 ~]# cat /var/lib/kubelet/cpu_manager_state
{"policyName":"none","defaultCpuSet":"","checksum":1353318690}
After applying the PerformanceProfile, the node rebooted, and then:
$ oc get nodes -w
...
w0   NotReady,SchedulingDisabled   worker   106m   v1.33.8
$ oc get pods -A -o wide | grep machine-config.*w0
openshift-machine-config-operator   kube-rbac-proxy-crio-w0       1/1   Running   3   107m   192.168.158.32   w0   <none>   <none>
openshift-machine-config-operator   machine-config-daemon-hq6ml   2/2   Running   2   107m   192.168.158.32   w0   <none>   <none>
$ oc logs -n openshift-machine-config-operator -f machine-config-daemon-hq6ml
Defaulted container "machine-config-daemon" out of: machine-config-daemon, kube-rbac-proxy
Error from server: Get "https://192.168.158.32:10250/containerLogs/openshift-machine-config-operator/machine-config-daemon-hq6ml/machine-config-daemon?follow=true": dial tcp 192.168.158.32:10250: connect: connection refused
[root@w0 ~]# cat /proc/cmdline
BOOT_IMAGE=(hd0,gpt3)/boot/ostree/rhcos-8ffde1af9e76874598344c2754660ead6f2f451919c716a81fadf450d2011ef7/vmlinuz-5.14.0-570.95.1.el9_6.x86_64 rw ostree=/ostree/boot.0/rhcos/8ffde1af9e76874598344c2754660ead6f2f451919c716a81fadf450d2011ef7/0 ignition.platform.id=metal ip=dhcp root=UUID=02d81ee3-daa3-43f7-ae88-a22c50a045e3 rw rootflags=prjquota boot=UUID=98e1c771-1117-4433-976a-61c1e58572d7 skew_tick=1 tsc=reliable rcupdate.rcu_normal_after_boot=1 rcutree.nohz_full_patience_delay=1000 nohz=on rcu_nocbs=2-11,14-23 tuned.non_isolcpus=00003003 systemd.cpu_affinity=0,1,12,13 intel_iommu=on iommu=pt isolcpus=managed_irq,2-11,14-23 nohz_full=2-11,14-23 tsc=reliable nosoftlockup nmi_watchdog=0 mce=off skew_tick=1 rcutree.kthread_prio=11 default_hugepagesz=2M hugepagesz=2M hugepages=8192 intel_pstate=disable enforcing=0 br-phys-bind=enp2s0 systemd.unified_cgroup_hierarchy=1 cgroup_no_v1=all psi=0 irqpoll console=tty0 console=ttyS0,115200 earlyprintk=ttyS0,115200
[root@w0 ~]# taskset -pc 1
pid 1's current affinity list: 0,1,12,13
[root@w0 ~]# cat /var/lib/kubelet/cpu_manager_state
{"policyName":"static","defaultCpuSet":"2-11,14-23","checksum":221304858}
We can see that the first kubelet start after the reboot ended with an error:
[root@w0 ~]# journalctl -b 0
Mar 03 13:29:51 localhost kernel: Linux version 5.14.0-570.95.1.el9_6.x86_64 (mockbuild@x86-64-04.build.eng.rdu2.redhat.com) (gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5), GNU ld version 2.35.2-63.el9_6.1) #1 SMP PREEMPT_DYNAMIC Thu Feb >
...
Mar 03 13:31:04 w0 crio[3260]: time="2026-03-03T13:31:04.755534694Z" level=info msg="Successfully cleaned up network for pod 781ae9671900921a2bfbe2172b1351712753cab94f1b3a8451ecd4de0ab37a5c"
Mar 03 13:31:04 w0 kubenswrapper[3323]: Flag --container-runtime-endpoint has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
...
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.828497 3323 flags.go:64] FLAG: --cpu-manager-policy="none"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.828503 3323 flags.go:64] FLAG: --cpu-manager-policy-options=""
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.828512 3323 flags.go:64] FLAG: --cpu-manager-reconcile-period="10s"
...
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.910261 3323 container_manager_linux.go:306] "Creating device plugin manager"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.910272 3323 manager.go:141] "Creating Device Plugin manager" path="/var/lib/kubelet/device-plugins/kubelet.sock"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.910302 3323 server.go:72] "Creating device plugin registration server" version="v1beta1" socket="/var/lib/kubelet/device-plugins/kubelet.sock"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.910660 3323 cpu_manager.go:179] "Detected CPU topology" topology={"NumCPUs":24,"NumCores":24,"NumUncoreCache":24,"NumSockets":24,"NumNUMANodes":1,"CPUDetails":{"0":{"NUMANodeID":0,"SocketID":0,"CoreID":0,"UncoreCacheID":0},"1":{"NUMANodeID":0,"SocketID":1,"CoreID":1,"UncoreCacheID":1},"10":{"NUMANodeID":0,"SocketID":10,"CoreID":10,"UncoreCacheID":10},"11":{"NUMANodeID":0,"SocketID":11,"CoreID":11,"UncoreCacheID":11},"12":{"NUMANodeID":0,"SocketID":12,"CoreID":12,"UncoreCacheID":12},"13":{"NUMANodeID":0,"SocketID":13,"CoreID":13,"UncoreCacheID":13},"14":{"NUMANodeID":0,"SocketID":14,"CoreID":14,"UncoreCacheID":14},"15":{"NUMANodeID":0,"SocketID":15,"CoreID":15,"UncoreCacheID":15},"16":{"NUMANodeID":0,"SocketID":16,"CoreID":16,"UncoreCacheID":16},"17":{"NUMANodeID":0,"SocketID":17,"CoreID":17,"UncoreCacheID":17},"18":{"NUMANodeID":0,"SocketID":18,"CoreID":18,"UncoreCacheID":18},"19":{"NUMANodeID":0,"SocketID":19,"CoreID":19,"UncoreCacheID":19},"2":{"NUMANodeID":0,"SocketID":2,"CoreID":2,"UncoreCacheID":2},"20":{"NUMANodeID":0,"SocketID":20,"CoreID":20,"UncoreCacheID":20},"21":{"NUMANodeID":0,"SocketID":21,"CoreID":21,"UncoreCacheID":21},"22":{"NUMANodeID":0,"SocketID":22,"CoreID":22,"UncoreCacheID":22},"23":{"NUMANodeID":0,"SocketID":23,"CoreID":23,"UncoreCacheID":23},"3":{"NUMANodeID":0,"SocketID":3,"CoreID":3,"UncoreCacheID":3},"4":{"NUMANodeID":0,"SocketID":4,"CoreID":4,"UncoreCacheID":4},"5":{"NUMANodeID":0,"SocketID":5,"CoreID":5,"UncoreCacheID":5},"6":{"NUMANodeID":0,"SocketID":6,"CoreID":6,"UncoreCacheID":6},"7":{"NUMANodeID":0,"SocketID":7,"CoreID":7,"UncoreCacheID":7},"8":{"NUMANodeID":0,"SocketID":8,"CoreID":8,"UncoreCacheID":8},"9":{"NUMANodeID":0,"SocketID":9,"CoreID":9,"UncoreCacheID":9}}}
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.910740 3323 policy_static.go:145] "Static policy created with configuration" options={"FullPhysicalCPUsOnly":true,"DistributeCPUsAcrossNUMA":false,"AlignBySocket":false,"DistributeCPUsAcrossCores":false,"StrictCPUReservation":true,"PreferAlignByUncoreCacheOption":false} cpuGroupSize=1
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.910801 3323 policy_static.go:182] "Reserved CPUs not available for exclusive assignment" reservedSize=4 reserved="0-1,12-13" reservedPhysicalCPUs="0-1,12-13"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.910820 3323 state_mem.go:36] "Initialized new in-memory state store"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.910996 3323 server.go:1267] "Using root directory" path="/var/lib/kubelet"
...
Mar 03 13:31:04 w0 systemd[1]: Startup finished in 1.752s (kernel) + 3.093s (initrd) + 1min 10.718s (userspace) = 1min 15.564s.
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.978041 3323 manager.go:324] Recovery completed
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.986759 3323 kubelet_node_status.go:413] "Setting node annotation to enable volume controller attach/detach"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.987607 3323 kubelet_node_status.go:736] "Recording event message for node" node="w0" event="NodeHasSufficientMemory"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.987632 3323 kubelet_node_status.go:736] "Recording event message for node" node="w0" event="NodeHasNoDiskPressure"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.987650 3323 kubelet_node_status.go:736] "Recording event message for node" node="w0" event="NodeHasSufficientPID"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.995779 3323 cpu_manager.go:222] "Starting CPU manager" policy="static"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.995800 3323 cpu_manager.go:223] "Reconciling" reconcilePeriod="5s"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.995835 3323 state_mem.go:36] "Initialized new in-memory state store"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.998085 3323 state_mem.go:88] "Updated default CPUSet" cpuSet="2-11,14-23"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.999675 3323 policy_static.go:218] "Static policy initialized" defaultCPUSet="2-11,14-23"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.999789 3323 memory_manager.go:186] "Starting memorymanager" policy="Static"
Mar 03 13:31:04 w0 kubenswrapper[3323]: I0303 13:31:04.999812 3323 state_mem.go:35] "Initializing new in-memory state store"
Mar 03 13:31:05 w0 kubenswrapper[3323]: I0303 13:31:05.001452 3323 state_mem.go:75] "Updated machine memory state"
Mar 03 13:31:05 w0 systemd[1]: Created slice libcontainer container kubepods.slice.
Mar 03 13:31:05 w0 kernel: Warning: Unmaintained driver is detected: nft_compat
Mar 03 13:31:05 w0 kubenswrapper[3323]: E0303 13:31:05.036524 3323 kubelet_node_status.go:515] "Error getting the current node from lister" err="node \"w0\" not found"
Mar 03 13:31:05 w0 kubenswrapper[3323]: E0303 13:31:05.036698 3323 kubelet.go:1706] "Failed to start ContainerManager" err="failed to initialize top level QOS containers: error validating root container [kubepods] : cgroup [\"kubepods\"] has some missing controllers: cpuset"
Mar 03 13:31:05 w0 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Mar 03 13:31:05 w0 systemd[1]: kubelet.service: Failed with result 'exit-code'.
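The first failure complains that the kubepods cgroup is missing the cpuset controller. Since the node boots with systemd.unified_cgroup_hierarchy=1 (cgroup v2), controller availability can be checked by hand. A hedged sketch, assuming the standard /sys/fs/cgroup mount point; the `check_cpuset` helper name and the overridable root argument are illustrative so the sketch can also be exercised against a fake hierarchy:

```shell
#!/bin/sh
# Check whether the cpuset controller is available at a cgroup v2 root,
# mirroring the validation that failed in the log above. The root path is
# overridable for testing against a fake directory.
check_cpuset() {
  root="${1:-/sys/fs/cgroup}"
  # cgroup.controllers lists the controllers this level can delegate;
  # -w matches "cpuset" as a whole word so "cpu" alone does not count.
  if grep -qw cpuset "$root/cgroup.controllers" 2>/dev/null; then
    echo "cpuset available at $root"
  else
    echo "cpuset missing at $root"
  fi
}

check_cpuset "${CGROUP_ROOT:-/sys/fs/cgroup}"
```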
This was followed by a restart and a different error:
...
Mar 03 13:31:15 w0 systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 1.
Mar 03 13:31:15 w0 systemd[1]: Stopped Kubernetes Kubelet.
Mar 03 13:31:15 w0 systemd[1]: Starting Kubernetes Kubelet...
Mar 03 13:31:15 w0 kubenswrapper[3387]: Flag --container-runtime-endpoint has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
...
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.318200 3387 flags.go:64] FLAG: --cpu-manager-policy="none"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.318204 3387 flags.go:64] FLAG: --cpu-manager-policy-options=""
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.318213 3387 flags.go:64] FLAG: --cpu-manager-reconcile-period="10s"
...
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.349623 3387 container_manager_linux.go:306] "Creating device plugin manager"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.349632 3387 manager.go:141] "Creating Device Plugin manager" path="/var/lib/kubelet/device-plugins/kubelet.sock"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.349654 3387 server.go:72] "Creating device plugin registration server" version="v1beta1" socket="/var/lib/kubelet/device-plugins/kubelet.sock"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.349693 3387 cpu_manager.go:179] "Detected CPU topology" topology={"NumCPUs":24,"NumCores":24,"NumUncoreCache":24,"NumSockets":24,"NumNUMANodes":1,"CPUDetails":{"0":{"NUMANodeID":0,"SocketID":0,"CoreID":0,"UncoreCacheID":0},"1":{"NUMANodeID":0,"SocketID":1,"CoreID":1,"UncoreCacheID":1},"10":{"NUMANodeID":0,"SocketID":10,"CoreID":10,"UncoreCacheID":10},"11":{"NUMANodeID":0,"SocketID":11,"CoreID":11,"UncoreCacheID":11},"12":{"NUMANodeID":0,"SocketID":12,"CoreID":12,"UncoreCacheID":12},"13":{"NUMANodeID":0,"SocketID":13,"CoreID":13,"UncoreCacheID":13},"14":{"NUMANodeID":0,"SocketID":14,"CoreID":14,"UncoreCacheID":14},"15":{"NUMANodeID":0,"SocketID":15,"CoreID":15,"UncoreCacheID":15},"16":{"NUMANodeID":0,"SocketID":16,"CoreID":16,"UncoreCacheID":16},"17":{"NUMANodeID":0,"SocketID":17,"CoreID":17,"UncoreCacheID":17},"18":{"NUMANodeID":0,"SocketID":18,"CoreID":18,"UncoreCacheID":18},"19":{"NUMANodeID":0,"SocketID":19,"CoreID":19,"UncoreCacheID":19},"2":{"NUMANodeID":0,"SocketID":2,"CoreID":2,"UncoreCacheID":2},"20":{"NUMANodeID":0,"SocketID":20,"CoreID":20,"UncoreCacheID":20},"21":{"NUMANodeID":0,"SocketID":21,"CoreID":21,"UncoreCacheID":21},"22":{"NUMANodeID":0,"SocketID":22,"CoreID":22,"UncoreCacheID":22},"23":{"NUMANodeID":0,"SocketID":23,"CoreID":23,"UncoreCacheID":23},"3":{"NUMANodeID":0,"SocketID":3,"CoreID":3,"UncoreCacheID":3},"4":{"NUMANodeID":0,"SocketID":4,"CoreID":4,"UncoreCacheID":4},"5":{"NUMANodeID":0,"SocketID":5,"CoreID":5,"UncoreCacheID":5},"6":{"NUMANodeID":0,"SocketID":6,"CoreID":6,"UncoreCacheID":6},"7":{"NUMANodeID":0,"SocketID":7,"CoreID":7,"UncoreCacheID":7},"8":{"NUMANodeID":0,"SocketID":8,"CoreID":8,"UncoreCacheID":8},"9":{"NUMANodeID":0,"SocketID":9,"CoreID":9,"UncoreCacheID":9}}}
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.349746 3387 policy_static.go:145] "Static policy created with configuration" options={"FullPhysicalCPUsOnly":true,"DistributeCPUsAcrossNUMA":false,"AlignBySocket":false,"DistributeCPUsAcrossCores":false,"StrictCPUReservation":true,"PreferAlignByUncoreCacheOption":false} cpuGroupSize=1
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.349776 3387 policy_static.go:182] "Reserved CPUs not available for exclusive assignment" reservedSize=4 reserved="0-1,12-13" reservedPhysicalCPUs="0-1,12-13"
...
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.390130 3387 manager.go:324] Recovery completed
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.396295 3387 kubelet_network_linux.go:49] "Initialized iptables rules." protocol="IPv4"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.397904 3387 kubelet_node_status.go:413] "Setting node annotation to enable volume controller attach/detach"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.398420 3387 kubelet_node_status.go:736] "Recording event message for node" node="w0" event="NodeHasSufficientMemory"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.398461 3387 kubelet_node_status.go:736] "Recording event message for node" node="w0" event="NodeHasNoDiskPressure"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.398479 3387 kubelet_node_status.go:736] "Recording event message for node" node="w0" event="NodeHasSufficientPID"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.398981 3387 cpu_manager.go:222] "Starting CPU manager" policy="static"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.398992 3387 cpu_manager.go:223] "Reconciling" reconcilePeriod="5s"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.399007 3387 state_mem.go:36] "Initialized new in-memory state store"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.399127 3387 state_mem.go:88] "Updated default CPUSet" cpuSet="2-11,14-23"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.399139 3387 state_mem.go:96] "Updated CPUSet assignments" assignments={}
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.399153 3387 state_checkpoint.go:136] "State checkpoint: restored state from checkpoint"
Mar 03 13:31:15 w0 kubenswrapper[3387]: I0303 13:31:15.399162 3387 state_checkpoint.go:137] "State checkpoint: defaultCPUSet" defaultCpuSet="2-11,14-23"
Mar 03 13:31:15 w0 kubenswrapper[3387]: E0303 13:31:15.399196 3387 policy_static.go:195] "Static policy invalid state, please drain node and remove policy state file" err="current set of available CPUs \"0-23\" doesn't match with CPUs in state \"2-11,14-23\""
Mar 03 13:31:15 w0 kubenswrapper[3387]: E0303 13:31:15.399205 3387 cpu_manager.go:239] "Policy start error" err="current set of available CPUs \"0-23\" doesn't match with CPUs in state \"2-11,14-23\""
Mar 03 13:31:15 w0 kubenswrapper[3387]: E0303 13:31:15.399214 3387 kubelet.go:1706] "Failed to start ContainerManager" err="start cpu manager error: current set of available CPUs \"0-23\" doesn't match with CPUs in state \"2-11,14-23\""
Mar 03 13:31:15 w0 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Mar 03 13:31:15 w0 systemd[1]: kubelet.service: Failed with result 'exit-code'.
Mar 03 13:31:25 w0 systemd[1]: kubelet.service: Scheduled restart job, restart counter is at 2.
Mar 03 13:31:25 w0 systemd[1]: Stopped Kubernetes Kubelet.
Mar 03 13:31:25 w0 systemd[1]: Starting Kubernetes Kubelet...
Mar 03 13:31:25 w0 kubenswrapper[3417]: Flag --container-runtime-endpoint has been deprecated, This parameter should be set via the config file specified by the Kubelet's --config flag. See https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/ for more information.
... etc...
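The "Policy start error" above is kubelet's static policy rejecting its own checkpoint: the set of CPUs it sees at startup ("0-23") no longer equals the defaultCpuSet persisted in /var/lib/kubelet/cpu_manager_state ("2-11,14-23", i.e. everything minus the strictly reserved cores). A toy re-enactment of that comparison, with values copied from the logs; the `validate_state` helper is illustrative and deliberately simplifies kubelet's CPU-set arithmetic to a string comparison:

```shell
#!/bin/sh
# Toy re-enactment of the static-policy startup check seen in the logs:
# the available CPU set must match the checkpointed default set, which it
# cannot once strict-cpu-reservation has carved out the reserved cores.
validate_state() {
  available="$1"   # CPUs kubelet discovers at startup
  checkpoint="$2"  # defaultCpuSet from /var/lib/kubelet/cpu_manager_state
  if [ "$available" != "$checkpoint" ]; then
    echo "Policy start error: current set of available CPUs \"$available\" doesn't match with CPUs in state \"$checkpoint\""
    return 1
  fi
  echo "state OK"
}

# Values from this report: every restart hits the same mismatch.
validate_state "0-23" "2-11,14-23" || true
```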
The node still had not recovered after 30 minutes:
[root@w0 ~]# date
Tue Mar 3 14:03:33 UTC 2026
[root@w0 ~]# journalctl -b 0 | grep -c 'Policy start error" err="current set of available CPUs \\"0-23\\"'
186
[root@w0 ~]# journalctl -f
...
Mar 03 14:04:14 w0 kubenswrapper[9049]: I0303 14:04:14.428085 9049 cpu_manager.go:222] "Starting CPU manager" policy="static"
Mar 03 14:04:14 w0 kubenswrapper[9049]: I0303 14:04:14.428112 9049 cpu_manager.go:223] "Reconciling" reconcilePeriod="5s"
Mar 03 14:04:14 w0 kubenswrapper[9049]: I0303 14:04:14.428144 9049 state_mem.go:36] "Initialized new in-memory state store"
Mar 03 14:04:14 w0 kubenswrapper[9049]: I0303 14:04:14.428419 9049 state_mem.go:88] "Updated default CPUSet" cpuSet="2-11,14-23"
Mar 03 14:04:14 w0 kubenswrapper[9049]: I0303 14:04:14.428447 9049 state_mem.go:96] "Updated CPUSet assignments" assignments={}
Mar 03 14:04:14 w0 kubenswrapper[9049]: I0303 14:04:14.428478 9049 state_checkpoint.go:136] "State checkpoint: restored state from checkpoint"
Mar 03 14:04:14 w0 kubenswrapper[9049]: I0303 14:04:14.428498 9049 state_checkpoint.go:137] "State checkpoint: defaultCPUSet" defaultCpuSet="2-11,14-23"
Mar 03 14:04:14 w0 kubenswrapper[9049]: E0303 14:04:14.428569 9049 policy_static.go:195] "Static policy invalid state, please drain node and remove policy state file" err="current set of available CPUs \"0-23\" doesn't match with CPUs in state \"2-11,14-23\""
Mar 03 14:04:14 w0 kubenswrapper[9049]: E0303 14:04:14.428589 9049 cpu_manager.go:239] "Policy start error" err="current set of available CPUs \"0-23\" doesn't match with CPUs in state \"2-11,14-23\""
Mar 03 14:04:14 w0 kubenswrapper[9049]: E0303 14:04:14.428608 9049 kubelet.go:1706] "Failed to start ContainerManager" err="start cpu manager error: current set of available CPUs \"0-23\" doesn't match with CPUs in state \"2-11,14-23\""
Mar 03 14:04:14 w0 systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Mar 03 14:04:14 w0 systemd[1]: kubelet.service: Failed with result 'exit-code'.