Description of problem:
We observe a noticeable lag between a pod's deletion and its actual removal.
We can observe and reproduce it (almost systematically) with virt-launcher pods executing CNV VirtualMachineInstances and with pods of a StatefulSet of another application, but we have not yet been able to produce a minimal reproducer involving just a plain pod.
In both of the aforementioned cases, the involved pods are defined with terminationGracePeriodSeconds: 30, so we expect them to disappear within a comparable amount of time; instead it takes more than two minutes for the pods to disappear.
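As a quick sanity check (pod name and namespace are the ones from the logs below), the configured grace period can be read directly from the pod spec, and it is indeed 30:

$ oc get pod virt-launcher-testvmi-broken-bbnhs -n simone -o jsonpath='{.spec.terminationGracePeriodSeconds}'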
Observing it on a cluster, we can confirm that all the containers belonging to the involved pods terminate on the node within a matter of seconds.
We can clearly see this with `crictl`, inspecting the container list, or with `ps`, looking for CNV virt-launcher processes on the node.
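For reference, these are the kinds of checks we run on the node a few seconds after the deletion (the pod sandbox ID is a placeholder to be filled in from the first command):

sh-5.1# crictl pods | grep virt-launcher-testvmi-broken-bbnhs   # find the pod sandbox ID
sh-5.1# crictl ps -a --pod <pod-sandbox-id>                     # all of the pod's containers are already exited
sh-5.1# ps aux | grep [v]irt-launcher                           # no virt-launcher processes are left on the node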
In the kubelet logs on the node we see:
sh-5.1# journalctl -u kubelet | grep ba7b7c10940f24c6fa4a83632f435558e8cf940bc89097855622dd2cc4bef09a
Jan 16 10:08:05 dev-simone-5-sxs6r-worker-0-vxkwx kubenswrapper[3800]: I0116 10:08:05.124524 3800 kubelet.go:2435] "SyncLoop (PLEG): event for pod" pod="simone/virt-launcher-testvmi-broken-bbnhs" event={"ID":"c5041fb6-6362-4161-b325-ee30c9c0a227","Type":"ContainerStarted","Data":"ba7b7c10940f24c6fa4a83632f435558e8cf940bc89097855622dd2cc4bef09a"}
Jan 16 10:13:54 dev-simone-5-sxs6r-worker-0-vxkwx kubenswrapper[3800]: I0116 10:13:54.254656 3800 kuberuntime_container.go:745] "Killing container with a grace period" pod="simone/virt-launcher-testvmi-broken-bbnhs" podUID="c5041fb6-6362-4161-b325-ee30c9c0a227" containerName="guest-console-log" containerID="cri-o://ba7b7c10940f24c6fa4a83632f435558e8cf940bc89097855622dd2cc4bef09a" gracePeriod=30
Jan 16 10:13:54 dev-simone-5-sxs6r-worker-0-vxkwx kubenswrapper[3800]: I0116 10:13:54.682743 3800 generic.go:334] "Generic (PLEG): container finished" podID="c5041fb6-6362-4161-b325-ee30c9c0a227" containerID="ba7b7c10940f24c6fa4a83632f435558e8cf940bc89097855622dd2cc4bef09a" exitCode=0
Jan 16 10:13:54 dev-simone-5-sxs6r-worker-0-vxkwx kubenswrapper[3800]: I0116 10:13:54.682798 3800 kubelet.go:2435] "SyncLoop (PLEG): event for pod" pod="simone/virt-launcher-testvmi-broken-bbnhs" event={"ID":"c5041fb6-6362-4161-b325-ee30c9c0a227","Type":"ContainerDied","Data":"ba7b7c10940f24c6fa4a83632f435558e8cf940bc89097855622dd2cc4bef09a"}
Jan 16 10:16:24 dev-simone-5-sxs6r-worker-0-vxkwx kubenswrapper[3800]: E0116 10:16:24.254959 3800 remote_runtime.go:366] "StopContainer from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" containerID="ba7b7c10940f24c6fa4a83632f435558e8cf940bc89097855622dd2cc4bef09a"
Jan 16 10:16:24 dev-simone-5-sxs6r-worker-0-vxkwx kubenswrapper[3800]: E0116 10:16:24.255050 3800 kuberuntime_container.go:750] "Container termination failed with gracePeriod" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" pod="simone/virt-launcher-testvmi-broken-bbnhs" podUID="c5041fb6-6362-4161-b325-ee30c9c0a227" containerName="guest-console-log" containerID="cri-o://ba7b7c10940f24c6fa4a83632f435558e8cf940bc89097855622dd2cc4bef09a" gracePeriod=30
Jan 16 10:16:24 dev-simone-5-sxs6r-worker-0-vxkwx kubenswrapper[3800]: E0116 10:16:24.255081 3800 kuberuntime_container.go:775] "Kill container failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" pod="simone/virt-launcher-testvmi-broken-bbnhs" podUID="c5041fb6-6362-4161-b325-ee30c9c0a227" containerName="guest-console-log" containerID={"Type":"cri-o","ID":"ba7b7c10940f24c6fa4a83632f435558e8cf940bc89097855622dd2cc4bef09a"}
Jan 16 10:16:48 dev-simone-5-sxs6r-worker-0-vxkwx kubenswrapper[3800]: I0116 10:16:48.349156 3800 scope.go:117] "RemoveContainer" containerID="ba7b7c10940f24c6fa4a83632f435558e8cf940bc89097855622dd2cc4bef09a"
So the kubelet correctly received the ContainerDied event at 10:13:54.682798 for container ba7b7c10940f24c6fa4a83632f435558e8cf940bc89097855622dd2cc4bef09a, but for some reason it ignored it until 10:16:24.254959, when it reported "StopContainer from runtime service failed" err="rpc error: code = DeadlineExceeded desc = context deadline exceeded" for the same container. Note that 10:16:24.254959 is exactly 2m30s after the "Killing container with a grace period" entry at 10:13:54.254656, i.e. the 30s grace period plus what looks like the kubelet's default 2-minute runtime request timeout, suggesting that the StopContainer CRI call never returned even though the container had already exited.
The issue is not fully systematic (we suspect a race condition).
When we fail to reproduce it for the same pod we see something like:
sh-5.1# journalctl -u kubelet | grep b712045c4b544
Jan 16 10:50:47 dev-simone-5-sxs6r-worker-0-72d7p kubenswrapper[3813]: I0116 10:50:47.930634 3813 kubelet.go:2435] "SyncLoop (PLEG): event for pod" pod="simone/virt-launcher-testvmi-fixed-gxjbs" event={"ID":"199cabee-1c23-4346-aab0-42080882d418","Type":"ContainerStarted","Data":"b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786"}
Jan 16 10:52:41 dev-simone-5-sxs6r-worker-0-72d7p kubenswrapper[3813]: I0116 10:52:41.780159 3813 kuberuntime_container.go:745] "Killing container with a grace period" pod="simone/virt-launcher-testvmi-fixed-gxjbs" podUID="199cabee-1c23-4346-aab0-42080882d418" containerName="guest-console-log" containerID="cri-o://b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786" gracePeriod=33
Jan 16 10:52:42 dev-simone-5-sxs6r-worker-0-72d7p kubenswrapper[3813]: I0116 10:52:42.442523 3813 generic.go:334] "Generic (PLEG): container finished" podID="199cabee-1c23-4346-aab0-42080882d418" containerID="b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786" exitCode=2
Jan 16 10:52:42 dev-simone-5-sxs6r-worker-0-72d7p kubenswrapper[3813]: I0116 10:52:42.442600 3813 kubelet.go:2435] "SyncLoop (PLEG): event for pod" pod="simone/virt-launcher-testvmi-fixed-gxjbs" event={"ID":"199cabee-1c23-4346-aab0-42080882d418","Type":"ContainerDied","Data":"b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786"}
Jan 16 10:52:48 dev-simone-5-sxs6r-worker-0-72d7p kubenswrapper[3813]: I0116 10:52:48.473995 3813 scope.go:117] "RemoveContainer" containerID="b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786"
Jan 16 10:52:48 dev-simone-5-sxs6r-worker-0-72d7p kubenswrapper[3813]: I0116 10:52:48.736531 3813 scope.go:117] "RemoveContainer" containerID="b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786"
Jan 16 10:52:48 dev-simone-5-sxs6r-worker-0-72d7p kubenswrapper[3813]: E0116 10:52:48.737060 3813 remote_runtime.go:432] "ContainerStatus from runtime service failed" err="rpc error: code = NotFound desc = could not find container \"b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786\": container with ID starting with b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786 not found: ID does not exist" containerID="b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786"
Jan 16 10:52:48 dev-simone-5-sxs6r-worker-0-72d7p kubenswrapper[3813]: I0116 10:52:48.737121 3813 pod_container_deletor.go:53] "DeleteContainer returned error" containerID={"Type":"cri-o","ID":"b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786"} err="failed to get container status \"b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786\": rpc error: code = NotFound desc = could not find container \"b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786\": container with ID starting with b712045c4b544fd7b005e94a251ade5ce6b2b097427b8e3d0e44f5990a71d786 not found: ID does not exist"
So, for some reason, when we hit the bug the RemoveContainer statement does not appear in the kubelet logs shortly after ContainerDied (it only shows up after the DeadlineExceeded errors), while in the non-reproducing case it follows ContainerDied within a few seconds.
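As a practical note, a quick way to tell the two cases apart on a node is to grep the kubelet journal for these events for a given container ID (the container ID below is a placeholder); in the broken case the RemoveContainer line only shows up after the DeadlineExceeded errors:

sh-5.1# journalctl -u kubelet | grep <containerID> | grep -E 'ContainerDied|StopContainer|RemoveContainer'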
Version-Release number of selected component (if applicable):
CRI-O 1.28, CRI-O 1.27.2 (but not CRI-O <= 1.27.1!), and CRI-O 1.25.4 (but not earlier CRI-O releases). After some bisection, we think that https://github.com/cri-o/cri-o/pull/7129 is a good candidate, since it is present in all the affected versions.
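To double-check which CRI-O build a node is actually running (and therefore whether the suspected change is included), after entering the node (e.g. with oc debug node/<node> and then chroot /host) one can use:

sh-5.1# crio version
sh-5.1# rpm -q cri-o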
How reproducible:
Pretty high with specific pods (managed by a CNV VMI or by a StatefulSet), but not systematic.
Steps to Reproduce:
1. A minimal reproducer with a plain pod is still not 100% clear, but we can prepare a cluster and provide instructions on how to reproduce it; a sketch of the flow we currently use is shown below.
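In the meantime, this is roughly the flow we use on a prepared cluster for the CNV case (the VMI, pod, and namespace names are the ones from the logs above):

$ oc delete vmi testvmi-broken -n simone
$ time oc wait --for=delete pod/virt-launcher-testvmi-broken-bbnhs -n simone --timeout=5m

When the bug is hit, the wait takes well over two minutes even though `crictl ps -a` on the node shows all of the pod's containers exited within a few seconds.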
Actual results:
All the containers and all the expected processes are terminated, but the pod is stuck in Terminating for more than 2 minutes.
Expected results:
The pod is quickly removed once all of its containers have terminated.
Additional info:
- clones: OCPBUGS-27949 "Lazy pod removal with recent CRI-O releases" (Closed)
- is cloned by: OCPBUGS-29938 "[4.14] Lazy pod removal with recent CRI-O releases" (Closed)
- links to: RHSA-2023:7198 OpenShift Container Platform 4.15 security update