OpenShift Bugs / OCPBUGS-11409

Kubelet failing to start with invalid kernel flag values for vm.overcommit_memory & kernel.panic on RHEL node


    • Important
    • No
    • Sprint 235
    • 1
    • Rejected
    • False

      Blocking upgrade test from 4.12 to 4.13 with RHEL worker


      Description of problem:

      Kubelet fails to start on a RHEL node because of invalid kernel flag values for vm.overcommit_memory and kernel.panic:
      
      kubelet.go:1413] "Failed to start ContainerManager" err="[invalid kernel flag: vm/overcommit_memory, expected value: 1, actual value: 0, invalid kernel flag: kernel/panic, expected value: 10, actual value: 0]"
      
      % oc get clusterversion                                           
      NAME      VERSION                              AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.13.0-0.nightly-2023-04-01-062001   True        False         9h      Error while reconciling 4.13.0-0.nightly-2023-04-01-062001: the cluster operator machine-config is not available
      
      sh-4.4# systemctl status kubelet
      ● kubelet.service - Kubernetes Kubelet
         Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
        Drop-In: /etc/systemd/system/kubelet.service.d
                 └─01-kubens.conf, 10-mco-default-madv.conf, 20-aws-node-name.conf, 20-aws-providerid.conf, 20-logging.conf
         Active: active (running) since Tue 2023-04-04 16:26:47 UTC; 11h ago
       Main PID: 1728 (kubelet)
          Tasks: 21 (limit: 100788)
         Memory: 262.4M
            CPU: 47min 43.245s
         CGroup: /system.slice/kubelet.service
                 └─1728 /usr/bin/kubelet --config=/etc/kubernetes/kubelet.conf --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig --kubeconfig=/var/lib/kubelet/kubeconfig --container-runtime=remote --container-runtime-endpoint=/var/run/crio/crio.sock --runtime-cgroups=/system.slice/crio.service --node->
      Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.155111    1728 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"host\" (UniqueName: \"kubernetes.io/host-path/64a24c0a-54a1-416b-966a-b1b03e86fdc8-host\") pod \"ip>
      Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.155188    1728 reconciler.go:357] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-api-access-pqk8q\" (UniqueName: \"kubernetes.io/projected/64a24c0a-54a1-416b-966a-b1b03e86fdc8>
      Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.256159    1728 reconciler.go:269] "operationExecutor.MountVolume started for volume \"host\" (UniqueName: \"kubernetes.io/host-path/64a24c0a-54a1-416b-966a-b1b03e86fdc8-host\") pod \"ip-10-0-52-4us-east-2>
      Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.256233    1728 reconciler.go:269] "operationExecutor.MountVolume started for volume \"kube-api-access-pqk8q\" (UniqueName: \"kubernetes.io/projected/64a24c0a-54a1-416b-966a-b1b03e86fdc8-kube-api-access-pq>
      Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.256465    1728 operation_generator.go:730] "MountVolume.SetUp succeeded for volume \"host\" (UniqueName: \"kubernetes.io/host-path/64a24c0a-54a1-416b-966a-b1b03e86fdc8-host\") pod \"ip-10-0-52-4us-east-2c>
      Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.274530    1728 operation_generator.go:730] "MountVolume.SetUp succeeded for volume \"kube-api-access-pqk8q\" (UniqueName: \"kubernetes.io/projected/64a24c0a-54a1-416b-966a-b1b03e86fdc8-kube-api-access-pqk>
      Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.310478    1728 util.go:30] "No sandbox for pod can be found. Need to start a new one" pod="openshift-debug-szsww/ip-10-0-52-4us-east-2computeinternal-debug"
      Apr 05 04:01:01 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:01.323975    1728 provider.go:102] Refreshing cache for provider: *credentialprovider.defaultDockerConfigProvider
      Apr 05 04:01:02 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:02.053668    1728 kubelet.go:2157] "SyncLoop (PLEG): event for pod" pod="openshift-debug-szsww/ip-10-0-52-4us-east-2computeinternal-debug" event=&{ID:64a24c0a-54a1-416b-966a-b1b03e86fdc8 Type:ContainerStarte>
      Apr 05 04:01:10 ip-10-0-52-4.us-east-2.compute.internal kubenswrapper[1728]: I0405 04:01:10.078456    1728 kubelet.go:2157] "SyncLoop (PLEG): event for pod" pod="openshift-debug-szsww/ip-10-0-52-4us-east-2computeinternal-debug" event=&{ID:64a24c0a-54a1-416b-966a-b1b03e86fdc8 Type:ContainerStarte>
      
      
      From kubelet journal logs:
      [...]
      Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.826135  150140 cpu_manager.go:214] "Starting CPU manager" policy="none"
      Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.826168  150140 cpu_manager.go:215] "Reconciling" reconcilePeriod="10s"
      Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.826195  150140 state_mem.go:36] "Initialized new in-memory state store"
      Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.827970  150140 policy_none.go:49] "None policy: Start"
      Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.828607  150140 memory_manager.go:168] "Starting memorymanager" policy="None"
      Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: I0405 06:11:45.828639  150140 state_mem.go:35] "Initializing new in-memory state store"
      Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal kubenswrapper[150140]: E0405 06:11:45.830710  150140 kubelet.go:1413] "Failed to start ContainerManager" err="[invalid kernel flag: vm/overcommit_memory, expected value: 1, actual value: 0, invalid kernel flag: kernel/panic, expected value: 10, actual value: 0]"
      Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
      Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal systemd[1]: kubelet.service: Failed with result 'exit-code'.
      Apr 05 06:11:45 ip-10-0-53-130.us-east-2.compute.internal systemd[1]: kubelet.service: Consumed 572ms CPU time
      [...]
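
The error line above packs every mismatched flag into one bracketed list. When triaging several nodes, it may help to pull the flag names and values out of the journal programmatically. The following is a minimal sketch, not part of the kubelet or of any OpenShift tooling; the function name `parse_invalid_kernel_flags` and the regular expression are assumptions based only on the message format shown in this report.

```python
import re

# Matches one entry of the "Failed to start ContainerManager" error as it
# appears in this report. The format is inferred from the log line only.
FLAG_RE = re.compile(
    r"invalid kernel flag: (?P<flag>[\w/]+), "
    r"expected value: (?P<expected>-?\d+), "
    r"actual value: (?P<actual>-?\d+)"
)


def parse_invalid_kernel_flags(line):
    """Return a list of (sysctl_name, expected, actual) tuples found in a log line."""
    results = []
    for match in FLAG_RE.finditer(line):
        # The log uses the /proc/sys path form (vm/overcommit_memory);
        # convert it to the dotted form that sysctl prints.
        name = match.group("flag").replace("/", ".")
        results.append((name, int(match.group("expected")), int(match.group("actual"))))
    return results


# Error text copied from the journal excerpt in this report.
SAMPLE = (
    'E0405 06:11:45.830710  150140 kubelet.go:1413] "Failed to start ContainerManager" '
    'err="[invalid kernel flag: vm/overcommit_memory, expected value: 1, actual value: 0, '
    'invalid kernel flag: kernel/panic, expected value: 10, actual value: 0]"'
)

if __name__ == "__main__":
    for name, expected, actual in parse_invalid_kernel_flags(SAMPLE):
        print(f"{name}: expected {expected}, actual {actual}")
```

Lines that do not contain the error simply yield an empty list, so the function can be applied to every journal line without pre-filtering.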
      
      Checking sysctl on the affected node, I see both parameters set to 0:
      
      [root@ip-10-0-53-130 ~]# sysctl -a | grep -e kernel.panic -e vm.overcommit_memory
      kernel.panic = 0
      vm.overcommit_memory = 0
      
      On another node, they are set to the expected non-zero values:
      
      sh-5.1# sysctl -a | grep -e kernel.panic -e vm.overcommit_memory
      kernel.panic = 10
      vm.overcommit_memory = 1
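
The same comparison can be scripted so it is easy to repeat on each suspect node. This is a rough sketch under stated assumptions: the expected values (1 and 10) are taken from the kubelet error message in this report, the helper names `read_sysctl` and `find_mismatches` are hypothetical, and the dotted-name-to-path conversion is only intended for simple names like the two checked here.

```python
from pathlib import Path

# Expected values as reported by the kubelet error message in this report.
EXPECTED = {"vm.overcommit_memory": 1, "kernel.panic": 10}


def read_sysctl(name, proc_sys=Path("/proc/sys")):
    """Read an integer sysctl by dotted name from the /proc/sys tree."""
    # vm.overcommit_memory -> /proc/sys/vm/overcommit_memory
    path = Path(proc_sys).joinpath(*name.split("."))
    return int(path.read_text().strip())


def find_mismatches(expected=EXPECTED, proc_sys=Path("/proc/sys")):
    """Return {name: (expected, actual)} for every sysctl that differs."""
    mismatches = {}
    for name, want in expected.items():
        got = read_sysctl(name, proc_sys)
        if got != want:
            mismatches[name] = (want, got)
    return mismatches
```

Calling `find_mismatches()` on a node returns an empty dict when both values match. The `proc_sys` parameter exists only so the check can be pointed at a copied or synthetic directory tree instead of the live `/proc/sys`.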
      
      
      % oc get nodes -o wide                                   
      NAME                                        STATUS                        ROLES                  AGE   VERSION           INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                                        KERNEL-VERSION                 CONTAINER-RUNTIME
      ip-10-0-48-126.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.48.126   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-48-213.us-east-2.compute.internal   Ready                         control-plane,master   17h   v1.26.2+7195e44   10.0.48.213   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-49-194.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.49.194   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-49-198.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.49.198   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-49-26.us-east-2.compute.internal    Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.49.26    <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-50-122.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.50.122   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-50-16.us-east-2.compute.internal    Ready                         worker                 17h   v1.26.2+7195e44   10.0.50.16    <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-51-144.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.51.144   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-51-189.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.51.189   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-51-35.us-east-2.compute.internal    Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.51.35    <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-52-195.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.52.195   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-52-254.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.52.254   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-52-4.us-east-2.compute.internal     Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.52.4     <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-53-130.us-east-2.compute.internal   NotReady,SchedulingDisabled   worker                 16h   v1.25.7+eab9cc9   10.0.53.130   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-53-216.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.53.216   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-53-39.us-east-2.compute.internal    Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.53.39    <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-54-36.us-east-2.compute.internal    Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.54.36    <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-54-53.us-east-2.compute.internal    Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.54.53    <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-55-134.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.55.134   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-55-177.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.55.177   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-55-78.us-east-2.compute.internal    Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.55.78    <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-55-97.us-east-2.compute.internal    Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.55.97    <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-56-176.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.56.176   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-56-188.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.56.188   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-57-231.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.57.231   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-57-28.us-east-2.compute.internal    Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.57.28    <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-57-8.us-east-2.compute.internal     Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.57.8     <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-58-12.us-east-2.compute.internal    Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.58.12    <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-58-131.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.58.131   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-59-205.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.59.205   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-59-37.us-east-2.compute.internal    Ready                         worker                 17h   v1.26.2+7195e44   10.0.59.37    <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-60-143.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.60.143   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-60-164.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.60.164   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-60-177.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.60.177   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-60-180.us-east-2.compute.internal   Ready                         worker                 17h   v1.26.2+7195e44   10.0.60.180   <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-60-39.us-east-2.compute.internal    Ready                         control-plane,master   17h   v1.26.2+7195e44   10.0.60.39    <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-61-144.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.61.144   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-61-192.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.61.192   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-62-119.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.62.119   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-63-244.us-east-2.compute.internal   Ready                         worker                 16h   v1.25.7+eab9cc9   10.0.63.244   <none>        Red Hat Enterprise Linux 8.6 (Ootpa)                            4.18.0-425.19.2.el8_7.x86_64   cri-o://1.25.2-14.rhaos4.12.git3e4b64e.el8
      ip-10-0-64-148.us-east-2.compute.internal   Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.64.148   <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-64-162.us-east-2.compute.internal   Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.64.162   <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-65-47.us-east-2.compute.internal    Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.65.47    <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-66-11.us-east-2.compute.internal    Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.66.11    <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-66-9.us-east-2.compute.internal     Ready                         control-plane,master   17h   v1.26.2+7195e44   10.0.66.9     <none>        Red Hat Enterprise Linux CoreOS 413.92.202303310708-0 (Plow)    5.14.0-284.4.1.el9_2.x86_64    cri-o://1.26.2-7.rhaos4.13.gitc0557b8.el9
      ip-10-0-69-71.us-east-2.compute.internal    Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.69.71    <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-70-107.us-east-2.compute.internal   Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.70.107   <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-72-218.us-east-2.compute.internal   Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.72.218   <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-74-236.us-east-2.compute.internal   Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.74.236   <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-75-17.us-east-2.compute.internal    Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.75.17    <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-76-59.us-east-2.compute.internal    Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.76.59    <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-79-214.us-east-2.compute.internal   Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.79.214   <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8
      ip-10-0-79-67.us-east-2.compute.internal    Ready                         worker                 17h   v1.25.7+eab9cc9   10.0.79.67    <none>        Red Hat Enterprise Linux CoreOS 412.86.202303241612-0 (Ootpa)   4.18.0-372.49.1.el8_6.x86_64   cri-o://1.25.2-10.rhaos4.12.git0a083f9.el8 
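
With this many nodes, the NotReady entries are easy to miss in the table above. A small filter over the JSON form of the same listing can surface them along with their OS image and kubelet version. This is only a sketch: the helper names `not_ready_nodes` and `fetch_nodes` are hypothetical, and it assumes `oc` is on the PATH and logged in to the cluster.

```python
import json
import subprocess


def not_ready_nodes(node_list):
    """Return (name, osImage, kubeletVersion) for nodes whose Ready condition is not "True"."""
    found = []
    for node in node_list.get("items", []):
        status = node.get("status", {})
        # A node with no Ready condition at all is treated as Unknown.
        ready = next(
            (c.get("status") for c in status.get("conditions", []) if c.get("type") == "Ready"),
            "Unknown",
        )
        if ready != "True":
            info = status.get("nodeInfo", {})
            found.append(
                (node["metadata"]["name"], info.get("osImage"), info.get("kubeletVersion"))
            )
    return found


def fetch_nodes():
    """Run `oc get nodes -o json` and return the parsed node list."""
    result = subprocess.run(
        ["oc", "get", "nodes", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

A typical call would be `not_ready_nodes(fetch_nodes())`. The filter is kept separate from the `oc` invocation so it can also be run against a saved JSON dump, for example one taken from a must-gather.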
      
      % oc get mcp                                                     
      NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
      master   rendered-master-fc0c168d007f0af5e0aa06a0470ca224   True      False      False      3              3                   3                     0                      17h
      worker   rendered-worker-a3bf68ff405d229d33ebcd0547efc4dd   False     True       False      50             13                  13                    0                      17h
      
      % oc get mc 
      NAME                                               GENERATEDBYCONTROLLER                      IGNITIONVERSION   AGE
      00-master                                          428c2e0655c6a5fc876dad8864a5451142b5c3e2   3.2.0             17h
      00-worker                                          428c2e0655c6a5fc876dad8864a5451142b5c3e2   3.2.0             17h
      01-master-container-runtime                        428c2e0655c6a5fc876dad8864a5451142b5c3e2   3.2.0             17h
      01-master-kubelet                                  428c2e0655c6a5fc876dad8864a5451142b5c3e2   3.2.0             17h
      01-worker-container-runtime                        428c2e0655c6a5fc876dad8864a5451142b5c3e2   3.2.0             17h
      01-worker-kubelet                                  428c2e0655c6a5fc876dad8864a5451142b5c3e2   3.2.0             17h
      99-master-generated-registries                     428c2e0655c6a5fc876dad8864a5451142b5c3e2   3.2.0             17h
      99-master-ssh                                                                                 3.2.0             17h
      99-worker-generated-registries                     428c2e0655c6a5fc876dad8864a5451142b5c3e2   3.2.0             17h
      99-worker-ssh                                                                                 3.2.0             17h
      rendered-master-c4997ff8d011dc4224b5276dfbb95595   52fe26136643a946ff1dd1307012cbdef31ebf97   3.2.0             17h
      rendered-master-fc0c168d007f0af5e0aa06a0470ca224   428c2e0655c6a5fc876dad8864a5451142b5c3e2   3.2.0             10h
      rendered-worker-38306710c3ed4295bfd128b81ec35da1   428c2e0655c6a5fc876dad8864a5451142b5c3e2   3.2.0             10h
      rendered-worker-a3bf68ff405d229d33ebcd0547efc4dd   52fe26136643a946ff1dd1307012cbdef31ebf97   3.2.0             17h
      
      % oc describe mc rendered-worker-38306710c3ed4295bfd128b81ec35da1
      [...]
        Extensions:
        Fips:  false
        Kernel Arguments:
          systemd.unified_cgroup_hierarchy=0
          systemd.legacy_systemd_cgroup_controller=1
        Kernel Type:   default
        Os Image URL:  quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:17537cfa431a6285bcdf1123a15df58ce36bc6acd9256659b78117ea073a738f
      Events:          <none>

      Version-Release number of selected component (if applicable):

      4.13.0-0.nightly-2023-04-01-062001

      How reproducible:

      I have not been able to reproduce it on demand yet, but I have seen this issue twice on a RHEL node.

      Steps to Reproduce:

      1. Install a cluster on AWS (OVN, customer VPC) with 3 master nodes, 25 RHCOS worker nodes, and 25 RHEL 8.6 worker nodes. Build is 4.12.10-x86_64.
      2. Upgrade the cluster to 4.13 build 4.13.0-0.nightly-2023-04-01-062001. The upgrade failed: 4 RHEL workers were in the NotReady state because they were not able to start kubelet.

      Actual results:

       

      Expected results:

       

      Additional info:

       

            rh-ee-bbarbach Brent Barbachem
            schoudha Sunil Choudhary
            Sunil Choudhary Sunil Choudhary