Description of problem:
After a manual crash of an OCP node, the OSPD VM running on that node is stuck in terminating state.
Version-Release number of selected component (if applicable):
OCP 4.12.15
osp-director-operator.v1.3.0
kubevirt-hyperconverged-operator.v4.12.5
How reproducible:
Log in to an OCP 4.12.15 node running a VM and manually crash that master node. After the reboot, the VM stays in terminating state.
Steps to Reproduce:
1. ssh core@masterX
2. sudo su
3. echo c > /proc/sysrq-trigger
Actual results:
After the reboot, the VM stays in terminating state:

$ omc get node|sed -e 's/modl4osp03ctl/model/g' | sed -e 's/telecom.tcnz.net/aaa.bbb.ccc/g'
NAME                  STATUS   ROLES                         AGE   VERSION
model01.aaa.bbb.ccc   Ready    control-plane,master,worker   91d   v1.25.8+37a9a08
model02.aaa.bbb.ccc   Ready    control-plane,master,worker   91d   v1.25.8+37a9a08
model03.aaa.bbb.ccc   Ready    control-plane,master,worker   91d   v1.25.8+37a9a08

$ omc get pod -n openstack
NAME                                                        READY   STATUS         RESTARTS   AGE
openstack-provision-server-7b79fcc4bd-x8kkz                 2/2     Running        0          8h
openstackclient                                             1/1     Running        0          7h
osp-director-operator-controller-manager-5896b5766b-sc7vm   2/2     Running        0          8h
osp-director-operator-index-qxxvw                           1/1     Running        0          8h
virt-launcher-controller-0-9xpj7                            1/1     Running        0          20d
virt-launcher-controller-1-5hj9x                            1/1     Running        0          20d
virt-launcher-controller-2-vhd69                            0/1     NodeAffinity   0          43d

$ omc describe pod virt-launcher-controller-2-vhd69 |grep Status:
Status: Terminating (lasts 37h)

$ xsos sosreport-xxxx/|grep time
...
Boot time: Wed Nov 22 01:44:11 AM UTC 2023
Uptime: 8:27, 0 users
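To spot the affected pod quickly in a long listing, the pod table above can be filtered for entries whose READY count is incomplete. The following is only a minimal sketch: the function name `not_ready_pods` is hypothetical and not part of `oc` or `omc`, and it assumes the default five-column table layout shown in this report (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper (not part of oc/omc): reads the default table output of
# `oc get pod` or `omc get pod` on stdin and prints "NAME STATUS" for every
# pod whose READY column is incomplete (for example 0/1).
not_ready_pods() {
  awk '
    $1 == "NAME" { next }               # skip the header row if present
    {
      split($2, ready, "/")             # READY looks like "1/1" or "0/1"
      # Completed pods legitimately show 0/1, so they are not reported.
      if (ready[1] != ready[2] && $3 != "Completed") print $1, $3
    }'
}
```

Piping the pod listing from this report through it (`omc get pod -n openstack | not_ready_pods`) should print only the `virt-launcher-controller-2-vhd69` row with its `NodeAffinity` status. Note that the table alone does not show the Terminating state; that still has to be confirmed with the `describe pod` call shown above.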
Expected results:
The VM restarts automatically, or at least does not stay in Terminating state.
Additional info:
The issue has been seen twice. The first time, a kernel crash occurred and the VM on the affected node ended up in terminating state. The second time, we tried to reproduce the issue by manually crashing the kernel and got the same result: the VM running on the OCP node stays in terminating state.
- blocks: OCPBUGS-25813 [OCP 4.14] VM stuck in terminating state after OCP node crash (Closed)
- clones: OCPBUGS-23896 [OCP 4.16] VM stuck in terminating state after OCP node crash (Closed)
- is blocked by: OCPBUGS-23896 [OCP 4.16] VM stuck in terminating state after OCP node crash (Closed)
- is cloned by: OCPBUGS-25813 [OCP 4.14] VM stuck in terminating state after OCP node crash (Closed)
- links to: RHSA-2023:7198 OpenShift Container Platform 4.15 security update