Bug
Resolution: Won't Do
4.11.z
Description of problem:
After multiple soft reboots of an SNO cluster, some containers in the user workload StatefulSet pods do not start. Probes report: exec failed: unable to start container process: error adding pid 876035 to cgroups: failed to write 876035: open /sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod0006f2bb_8ec7_4d23_b3ee_f41f2139099a.slice/crio-3423eeabbeb52cba4c7eaa4c91fafe593a5453c48b096e154a0edacbdfa133c8.scope/cgroup.procs: no such file or directory
Version-Release number of selected component (if applicable):
4.11.7
How reproducible:
Infrequent
Steps to Reproduce:
1. Deploy an SNO cluster with the Telco DU profile applied
2. Create a user workload
3. Trigger a soft reboot via the `reboot` command
4. Wait for the node to recover
5. Verify that all workload resources recovered correctly
Actual results:
A container in one of the StatefulSet's pods does not start
Expected results:
All workload resources recover successfully
Additional info:
Attaching must-gather, sosreport, and the output of `oc describe/get pods`.
After deleting and re-creating the pod, all containers start successfully.
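To correlate the cgroup path in the probe error with a specific pod and container, a small helper along these lines may be useful. This is only a sketch: `parse_cgroup_path` is a hypothetical name, not part of any tool, and it assumes the systemd cgroup layout seen in the error above, where the pod slice carries the pod UID with `-` replaced by `_` and the scope is named `crio-<container id>.scope`.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): print the pod UID and
# the CRI-O container ID embedded in a cgroup path such as the one in the
# probe error message.
parse_cgroup_path() {
    path=$1
    # Pod slice looks like kubepods-<qos>-pod<uid>.slice, with '_' in place
    # of '-' in the UID; convert the underscores back to dashes.
    pod_uid=$(printf '%s\n' "$path" \
        | sed -n 's|.*-pod\([0-9a-f_]*\)\.slice/.*|\1|p' \
        | tr '_' '-')
    # Container scope looks like crio-<container id>.scope.
    container_id=$(printf '%s\n' "$path" \
        | sed -n 's|.*/crio-\([0-9a-f]*\)\.scope.*|\1|p')
    printf '%s %s\n' "$pod_uid" "$container_id"
}

# Example call (path taken from the error message in the description):
# parse_cgroup_path "/sys/fs/cgroup/systemd/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod0006f2bb_8ec7_4d23_b3ee_f41f2139099a.slice/crio-3423eeabbeb52cba4c7eaa4c91fafe593a5453c48b096e154a0edacbdfa133c8.scope/cgroup.procs"
```

The resulting UID can then be matched against `.metadata.uid` in the output of `oc get pods -A -o json` to identify the affected pod, assuming the pod has not yet been deleted and re-created.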