Type: Bug
Status: CLOSED
Resolution: Not a Bug
Priority: Major
Description of problem:
A running VM shuts down after several minutes of uptime; the guest console reports that SIGTERM was sent to all processes.
Version-Release number of selected component (if applicable):
oc get csv -n openshift-cnv
NAME                                           DISPLAY                         VERSION             REPLACES                                       PHASE
kubevirt-hyperconverged-operator.4.14.0-1876   OpenShift Virtualization        4.14.0-1876         kubevirt-hyperconverged-operator.4.14.0-1867   Succeeded
odr-cluster-operator.v4.14.0-123.stable        Openshift DR Cluster Operator   4.14.0-123.stable   odr-cluster-operator.v4.14.0-117.stable        Succeeded
openshift-pipelines-operator-rh.v1.11.1        Red Hat OpenShift Pipelines     1.11.1                                                             Succeeded
volsync-product.v0.7.4                         VolSync                         0.7.4               volsync-product.v0.7.3                         Succeeded
Client Version: 4.14.0-ec.3
Kustomize Version: v5.0.1
Server Version: 4.14.0-0.nightly-2023-08-11-055332
Kubernetes Version: v1.27.4+deb2c60
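For reference, the version information above can be gathered with the following commands (assumed; the second command was not captured in the original output):
oc get csv -n openshift-cnv
oc version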
How reproducible:
100%
Steps to Reproduce:
1. Deploy a VM to the OpenShift Virtualization cluster from the RHACM hub - the VM is deployed successfully
2. Start the VM with 'virtctl start vm' - the VM is running
3. Access the VM console with 'virtctl console vm', log in, and write data files
4. After about 10 minutes the VM shuts down with the console message below
5. Restart and access the VM; the same thing happens. Reproduced multiple times (a triage sketch follows the console message below)
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system poweroff
[ 687.879014] sd 1:0:0:0: [sda] Synchronizing SCSI cache
[ 687.880156] sd 1:0:0:0: [sda] Stopping disk
[ 687.973945] reboot: Power down
You were disconnected from the console. This has one of the following reasons:
- another user connected to the console of the target vm
- network issues
websocket: close 1006 (abnormal closure): unexpected EOF
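A possible way to confirm from the cluster side which component requested the shutdown (suggested triage commands, not part of the original report; resource names are taken from this report):
# Check the VMI state and recent events in the namespace
oc get vmi sample-vm -n kevin-dr -o yaml
oc get events -n kevin-dr --sort-by=.lastTimestamp
# Inspect the virt-launcher pod log for the shutdown request
oc logs -n kevin-dr -l kubevirt.io=virt-launcher --tail=100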
Actual results:
The VM shuts down unexpectedly; SIGTERM is sent to all guest processes.
Expected results:
The VM should remain up and running.
Additional info:
oc get pvc -n kevin-dr
NAME            STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS                           AGE
sample-vm-pvc   Bound    pvc-c8112912-8ac9-4537-adaf-c9fd6089dee7   2Gi        RWX            ocs-external-storagecluster-ceph-rbd   42h
tmp-pvc         Bound    pvc-b08f240f-e828-49bb-9cf4-44ed8e8d9174   954Mi      RWO            ocs-external-storagecluster-ceph-rbd   7d22h
oc get vm -n kevin-dr
NAME        AGE   STATUS    READY
sample-vm   42h   Stopped   False
[kgoldbla@localhost Metro_DR]$ virtctl start sample-vm -n kevin-dr
VM sample-vm was scheduled to start
[kgoldbla@localhost Metro_DR]$ virtctl console sample-vm -n kevin-dr
Successfully connected to sample-vm console. The escape sequence is ^]
login as 'cirros' user. default password: 'gocubsgo'. use 'sudo' for root.
sample-vm login: cirros
Password:
$ lsblk
NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda       8:0    0   1M  0 disk
vda     252:0    0   2G  0 disk
|-vda1  252:1    0   2G  0 part /
`-vda15 252:15   0   8M  0 part
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Requesting system poweroff
[ 687.879014] sd 1:0:0:0: [sda] Synchronizing SCSI cache
[ 687.880156] sd 1:0:0:0: [sda] Stopping disk
[ 687.973945] reboot: Power down
You were disconnected from the console. This has one of the following reasons:
- another user connected to the console of the target vm
- network issues
websocket: close 1006 (abnormal closure): unexpected EOF
oc get vm sample-vm -n kevin-dr -oyaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  annotations:
    apps.open-cluster-management.io/hosting-subscription: kevin-dr/kev-vm-dvtemplate-odr-metro-2-subscription-1
    apps.open-cluster-management.io/reconcile-option: merge
    kubevirt.io/latest-observed-api-version: v1
    kubevirt.io/storage-observed-api-version: v1
  creationTimestamp: "2023-09-04T16:08:30Z"
  finalizers:
  - kubevirt.io/virtualMachineControllerFinalize
  generation: 13
  labels:
    app: kev-vm-dvtemplate-odr-metro-2
    app.kubernetes.io/part-of: kev-vm-dvtemplate-odr-metro-2
    appname: vm-dvtemplate-odr-metro
    apps.open-cluster-management.io/reconcile-rate: medium
  name: sample-vm
  namespace: kevin-dr
  resourceVersion: "26056866"
  uid: cdbe619e-31f7-4778-a354-a6a2e11cacfd
spec:
  dataVolumeTemplates:
  - metadata:
      creationTimestamp: null
      labels:
        appname: vm-dvtemplate-odr-metro
      name: sample-vm-pvc
    spec:
      source:
        registry:
          url: docker://quay.io/alitke/cirros:latest
      storage:
        resources:
          requests:
            storage: 2Gi
        storageClassName: ocs-external-storagecluster-ceph-rbd
  running: false
  template:
    metadata:
      annotations:
        vm.kubevirt.io/flavor: small
        vm.kubevirt.io/os: fedora
        vm.kubevirt.io/workload: server
      creationTimestamp: null
      labels:
        kubevirt.io/size: small
    spec:
      architecture: amd64
      domain:
        cpu:
          cores: 1
          sockets: 1
          threads: 1
        devices:
          disks:
          - disk:
              bus: virtio
            name: rootdisk
          - disk: {}
            name: cloudinit
          interfaces:
          - macAddress: 02:69:36:00:00:00
            masquerade: {}
            model: virtio
            name: default
          networkInterfaceMultiqueue: true
          rng: {}
        features:
          acpi: {}
        machine:
          type: pc-q35-rhel8.6.0
        resources:
          requests:
            memory: 2Gi
      evictionStrategy: LiveMigrate
      networks:
      - name: default
        pod: {}
      terminationGracePeriodSeconds: 180
      volumes:
      - name: rootdisk
        persistentVolumeClaim:
          claimName: sample-vm-pvc
      - cloudInitNoCloud:
          userData: |
            #cloud-config
            user: cirros
            password: drftw!
            chpasswd:
              expire: false
        name: cloudinit
status:
  conditions:
  - lastProbeTime: "2023-09-06T11:06:40Z"
    lastTransitionTime: "2023-09-06T11:06:40Z"
    message: VMI does not exist
    reason: VMINotExists
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: null
    status: "True"
    type: LiveMigratable
  desiredGeneration: 13
  observedGeneration: 13
  printableStatus: Stopped
  volumeSnapshotStatuses:
  - enabled: true
    name: rootdisk
  - enabled: false
    name: cloudinit
    reason: Snapshot is not supported for this volumeSource type [cloudinit]
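For quick triage, the spec fields above that govern how the platform stops this VM (running, terminationGracePeriodSeconds, evictionStrategy) can be pulled directly. This is a suggested command, not part of the original report:
oc get vm sample-vm -n kevin-dr -o jsonpath='{.spec.running}{" "}{.spec.template.spec.terminationGracePeriodSeconds}{" "}{.spec.template.spec.evictionStrategy}{"\n"}'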