-
Bug
-
Resolution: Duplicate
-
Critical
-
None
-
4.13
-
Important
-
No
-
Sprint 235
-
1
-
Rejected
-
False
-
Description of problem:
Kubelet fails to start on a RHEL node, reporting invalid kernel flag values for vm.overcommit_memory and kernel.panic:

kubelet.go:1413] "Failed to start ContainerManager" err="[invalid kernel flag: vm/overcommit_memory, expected value: 1, actual value: 0, invalid kernel flag: kernel/panic, expected value: 10, actual value: 0]"

% oc get clusterversion
NAME  VERSION  AVAILABLE  PROGRESSING  SINCE  STATUS
version  4.13.0-0.nightly-2023-04-01-062001  True  False  9h  Error while reconciling 4.13.0-0.nightly-2023-04-01-062001: the cluster operator machine-config is not available

sh-4.4# systemctl status kubelet
● kubelet.service - Kubernetes Kubelet
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─01-kubens.conf, 10-mco-default-madv.conf, 20-aws-node-name.conf, 20-aws-providerid.conf, 20-logging.conf
   Active: active (running) since Tue 2023-04-04 16:26:47 UTC; 11h ago
 Main PID: 1728 (kubelet)
    Tasks: 21 (limit: 100788)
   Memory: 262.4M
      CPU: 47min 43.245s
   CGroup: /system.slice/kubelet.service
           └─1728 /usr/bin/kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig --kubeconfig=/var/lib/kubelet/kubeconfig --container-runtime=remote --container-runtime-endpoint=/var/run/crio/crio.sock --runtime-cgroups=/system.slice/crio.service --node->

Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.155111 1728 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"host\" (UniqueName: \"kubernetes.io/host-path/64a24c0a-54a1-416b-966a-b1b03e86fdc8-host\") pod \"ip>
Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.155188 1728 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-pqk8q\" (UniqueName: \"kubernetes.io/projected/64a24c0a-54a1-416b-966a-b1b03e86fdc8>
Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.256159 1728 reconciler.go:269] "operationExecutor.MountVolume started for volume \"host\" (UniqueName: \"kubernetes.io/host-path/64a24c0a-54a1-416b-966a-b1b03e86fdc8-host\") pod \"ip-10-0-52-4us-east-2>
Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.256233 1728 reconciler.go:269] "operationExecutor.MountVolume started for volume \"kube-api-access-pqk8q\" (UniqueName: \"kubernetes.io/projected/64a24c0a-54a1-416b-966a-b1b03e86fdc8-kube-api-access-pq>
Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.256465 1728 operation_generator.go:730] "MountVolume.SetUp succeeded for volume \"host\" (UniqueName: \"kubernetes.io/host-path/64a24c0a-54a1-416b-966a-b1b03e86fdc8-host\") pod \"ip-10-0-52-4us-east-2c>
Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.274530 1728 operation_generator.go:730] "MountVolume.SetUp succeeded for volume \"kube-api-access-pqk8q\" (UniqueName: \"kubernetes.io/projected/64a24c0a-54a1-416b-966a-b1b03e86fdc8-kube-api-access-pqk>
Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.310478 1728 util.go:30] "No sandbox for pod can be found. Need to start a new one" pod="openshift-debug-szsww/ip-10-0-52-4us-east-2computeinternal-debug"
Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.323975 1728 provider.go:102] Refreshing cache for provider: *credentialprovider.defaultDockerConfigProvider
Apr 05 04:01:02 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:02.053668 1728 kubelet.go:2157] "SyncLoop (PLEG): event for pod" pod="openshift-debug-szsww/ip-10-0-52-4us-east-2computeinternal-debug" event=&{ID:64a24c0a-54a1-416b-966a-b1b03e86fdc8 Type:ContainerStarte>
Apr 05 04:01:10 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:10.078456 1728 kubelet.go:2157] "SyncLoop (PLEG): event for pod" pod="openshift-debug-szsww/ip-10-0-52-4us-east-2computeinternal-debug" event=&{ID:64a24c0a-54a1-416b-966a-b1b03e86fdc8 Type:ContainerStarte>

From kubelet journal logs:

[...]
Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.826135 150140 cpu_manager.go:214] "Starting CPU manager" policy="none"
Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.826168 150140 cpu_manager.go:215] "Reconciling" reconcilePeriod="10s"
Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.826195 150140 state_mem.go:36] "Initialized new in-memory state store"
Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.827970 150140 policy_none.go:49] "None policy: Start"
Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.828607 150140 memory_manager.go:168] "Starting memorymanager" policy="None"
Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.828639 150140 state_mem.go:35] "Initializing new in-memory state store"
Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: E0405 06:11:45.830710 150140 kubelet.go:1413] "Failed to start ContainerManager" err="[invalid kernel flag: vm/overcommit_memory, expected value: 1, actual value: 0, invalid kernel flag: kernel/panic, expected value: 10, actual value: 0]"
Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal systemd[1]: kubelet.service: Failed with result 'exit-code'.
Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal systemd[1]: kubelet.service: Consumed 572ms CPU time
[...]

Checking sysctl on the affected node, I see both parameters set to 0:

[root@ip-10-0-53-130 ~]# sysctl -a | grep -e kernel.panic -e vm.overcommit_memory
kernel.panic = 0
vm.overcommit_memory = 0

On another node, they are non-zero:

sh-5.1# sysctl -a | grep -e kernel.panic -e vm.overcommit_memory
kernel.panic = 10
vm.overcommit_memory = 1

% oc get nodes -o wide
NAME  STATUS  ROLES  AGE  VERSION  INTERNAL-IP  EXTERNAL-IP  OS-IMAGE  KERNEL-VERSION  CONTAINER-RUNTIME
ip-10-0-48-126.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.48.126  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-48-213.us-east-2.compute.internal  Ready  control-plane,master  17h  v1.26.2+7195e44  10.0.48.213  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-49-194.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.49.194  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-49-198.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.49.198  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-49-26.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.49.26  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-50-122.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.50.122  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-50-16.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.50.16  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-51-144.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.51.144  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-51-189.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.51.189  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-51-35.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.51.35  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-52-195.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.52.195  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-52-254.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.52.254  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-52-4.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.52.4  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-53-130.us-east-2.compute.internal  NotReady,SchedulingDisabled  worker  16h  v1.25.7+eab9cc9  10.0.53.130  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-53-216.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.53.216  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-53-39.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.53.39  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-54-36.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.54.36  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-54-53.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.54.53  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-55-134.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.55.134  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-55-177.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.55.177  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-55-78.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.55.78  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-55-97.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.55.97  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-56-176.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.56.176  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-56-188.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.56.188  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-57-231.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.57.231  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-57-28.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.57.28  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-57-8.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.57.8  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-58-12.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.58.12  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-58-131.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.58.131  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-59-205.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.59.205  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-59-37.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.59.37  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-60-143.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.60.143  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-60-164.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.60.164  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-60-177.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.60.177  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-60-180.us-east-2.compute.internal  Ready  worker  17h  v1.26.2+7195e44  10.0.60.180  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-60-39.us-east-2.compute.internal  Ready  control-plane,master  17h  v1.26.2+7195e44  10.0.60.39  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-61-144.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.61.144  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-61-192.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.61.192  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-62-119.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.62.119  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-63-244.us-east-2.compute.internal  Ready  worker  16h  v1.25.7+eab9cc9  10.0.63.244  <none>  Red Hat Enterprise Linux 8.6 (Ootpa)  4.18.0-425.19.2.el8_7.x86_64  cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
ip-10-0-64-148.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.64.148  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-64-162.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.64.162  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-65-47.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.65.47  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-66-11.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.66.11  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-66-9.us-east-2.compute.internal  Ready  control-plane,master  17h  v1.26.2+7195e44  10.0.66.9  <none>  Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)  5.14.0-284.4.1.el9_2.x86_64  cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
ip-10-0-69-71.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.69.71  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-70-107.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.70.107  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-72-218.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.72.218  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-74-236.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.74.236  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-75-17.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.75.17  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-76-59.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.76.59  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-79-214.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.79.214  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
ip-10-0-79-67.us-east-2.compute.internal  Ready  worker  17h  v1.25.7+eab9cc9  10.0.79.67  <none>  Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)  4.18.0-372.49.1.el8_6.x86_64  cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8

% oc get mcp
NAME  CONFIG  UPDATED  UPDATING  DEGRADED  MACHINECOUNT  READYMACHINECOUNT  UPDATEDMACHINECOUNT  DEGRADEDMACHINECOUNT  AGE
master  rendered-master-fc0c168d007f0af5e0aa06a0470ca224  True  False  False  3  3  3  0  17h
worker  rendered-worker-a3bf68ff405d229d33ebcd0547efc4dd  False  True  False  50  13  13  0  17h

% oc get mc
NAME  GENERATEDBYCONTROLLER  IGNITIONVERSION  AGE
00-master  428c2e0655c6a5fc876dad8864a5451142b5c3e2  3.2.0  17h
00-worker  428c2e0655c6a5fc876dad8864a5451142b5c3e2  3.2.0  17h
01-master-container-runtime  428c2e0655c6a5fc876dad8864a5451142b5c3e2  3.2.0  17h
01-master-kubelet  428c2e0655c6a5fc876dad8864a5451142b5c3e2  3.2.0  17h
01-worker-container-runtime  428c2e0655c6a5fc876dad8864a5451142b5c3e2  3.2.0  17h
01-worker-kubelet  428c2e0655c6a5fc876dad8864a5451142b5c3e2  3.2.0  17h
99-master-generated-registries  428c2e0655c6a5fc876dad8864a5451142b5c3e2  3.2.0  17h
99-master-ssh    3.2.0  17h
99-worker-generated-registries  428c2e0655c6a5fc876dad8864a5451142b5c3e2  3.2.0  17h
99-worker-ssh    3.2.0  17h
rendered-master-c4997ff8d011dc4224b5276dfbb95595  52fe26136643a946ff1dd1307012cbdef31ebf97  3.2.0  17h
rendered-master-fc0c168d007f0af5e0aa06a0470ca224  428c2e0655c6a5fc876dad8864a5451142b5c3e2  3.2.0  10h
rendered-worker-38306710c3ed4295bfd128b81ec35da1  428c2e0655c6a5fc876dad8864a5451142b5c3e2  3.2.0  10h
rendered-worker-a3bf68ff405d229d33ebcd0547efc4dd  52fe26136643a946ff1dd1307012cbdef31ebf97  3.2.0  17h

% oc describe mc rendered-worker-38306710c3ed4295bfd128b81ec35da1
[...]
Extensions:
Fips: false
Kernel Arguments:
  systemd.unified_cgroup_hierarchy=0
  systemd.legacy_systemd_cgroup_controller=1
Kernel Type: default
Os Image URL: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:17537cfa431a6285bcdf1123a15df58ce36bc6acd9256659b78117ea073a738f
Events: <none>
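The comparison done by hand above (expected values from the kubelet error versus what sysctl reports on each node) can be sketched in a few lines of Python. This is only an illustrative helper, not part of the kubelet or of any OpenShift tooling: the function names are hypothetical, and the regular expression assumes the error format exactly as quoted in this report.

```python
import re

# One "invalid kernel flag" entry, in the format quoted in the kubelet error
# in this report (assumed; other kubelet versions may word it differently).
FLAG_RE = re.compile(
    r"invalid kernel flag: (?P<flag>[\w/]+), "
    r"expected value: (?P<expected>-?\d+), "
    r"actual value: (?P<actual>-?\d+)"
)


def parse_kernel_flag_errors(message):
    """Map each sysctl named in the message to its (expected, actual) pair."""
    flags = {}
    for match in FLAG_RE.finditer(message):
        # The kubelet prints vm/overcommit_memory; sysctl prints vm.overcommit_memory.
        name = match.group("flag").replace("/", ".")
        flags[name] = (int(match.group("expected")), int(match.group("actual")))
    return flags


def parse_sysctl_output(text):
    """Parse 'key = value' lines such as those printed by `sysctl -a`."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return values


def mismatches(expected, sysctl_values):
    """Return {name: (wanted, found)} for every setting that differs."""
    return {
        name: (want, sysctl_values.get(name))
        for name, (want, _) in expected.items()
        if sysctl_values.get(name) != str(want)
    }


# Inputs copied from the report.
ERR = (
    "[invalid kernel flag: vm/overcommit_memory, expected value: 1, actual value: 0, "
    "invalid kernel flag: kernel/panic, expected value: 10, actual value: 0]"
)
expected = parse_kernel_flag_errors(ERR)
bad_node = parse_sysctl_output("kernel.panic = 0\nvm.overcommit_memory = 0")
good_node = parse_sysctl_output("kernel.panic = 10\nvm.overcommit_memory = 1")

print(mismatches(expected, bad_node))   # both settings differ on the failing node
print(mismatches(expected, good_node))  # nothing differs on the healthy node
```

Feeding it the sysctl output from each RHEL worker would show which nodes are missing the settings before the kubelet is restarted on them.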
Version-Release number of selected component (if applicable):
4.13.0-0.nightly-2023-04-01-062001
How reproducible:
Haven't been able to reproduce it yet. I have seen this issue twice on a RHEL node.
Steps to Reproduce:
1. Install a cluster on AWS (OVN, customer VPC) with 3 master nodes, 25 RHCOS worker nodes, and 25 RHEL 8.6 worker nodes. Build is 4.12.10-x86_64.
2. Upgrade the cluster to 4.13 build 4.13.0-0.nightly-2023-04-01-062001.
The upgrade failed: 4 RHEL workers were in the NotReady state and could not start kubelet.
Actual results:
Expected results:
Additional info:
- duplicates: OCPBUGS-7589 Enable default sysctls for kubelet (Closed)
- is blocked by: OCPNODE-1619 Impact: Kubelet failing to start with invalid kernel flag values for vm.overcommit_memory & kernel.panic on RHEL node (Closed)