Bug
Resolution: Unresolved
Major
4.16, 4.17
Description of problem:
When a kubevirt-csi pod runs on a worker node of a Guest cluster, the underlying PVC from the infra/host cluster is attached to the Virtual Machine that is the worker node of the Guest cluster. That works well, but only until the VM is rebooted.

After the VM is power cycled for any reason, the volumeattachment on the Guest cluster is still there and shows as attached:

    [guest cluster]# oc get volumeattachment
    NAME                                                                   ATTACHER          PV                                         NODE                         ATTACHED   AGE
    csi-976b6b166ef7ea378de9a350c9ef427c23e8c072dc6e76a392241d273c3effdb   csi.kubevirt.io   pvc-4e375fa9-c1ad-4fa6-a254-03d4c3b1111b   hostedcluster2-rlq9m-z2x88   true       39m

But the VM does not have the hotplugged disk anymore (it's not a persistent hotplug). It's not attached at all. After the reboot the VM only has its rhcos disk and cloud-init:

    [host cluster]# oc get vmi -n clusters-hostedcluster2 hostedcluster2-rlq9m-z2x88 -o yaml | yq '.status.volumeStatus'
    - name: cloudinitvolume
      size: 1048576
      target: vdb
    - name: rhcos
      persistentVolumeClaimInfo:
        accessModes:
        - ReadWriteOnce
        capacity:
          storage: 32Gi
        claimName: hostedcluster2-rlq9m-z2x88-rhcos
        filesystemOverhead: "0"
        requests:
          storage: "34359738368"
        volumeMode: Block
      target: vda

As a result, all workloads with PVCs now fail to start, because the hotplug is not triggered again. The worker node VM cannot find the disk:

    26s Warning FailedMount pod/mypod MountVolume.MountDevice failed for volume "pvc-4e375fa9-c1ad-4fa6-a254-03d4c3b1111b" : rpc error: code = Unknown desc = couldn't find device by serial id

So workload pods cannot start.
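The mismatch described above (the guest cluster reports the volume as attached, while the VMI on the host cluster no longer lists it) can be checked mechanically. The following is only a minimal sketch, not how kubevirt-csi itself works: the function name find_stale_attachments is hypothetical, the inputs are assumed to be the already-parsed output of "oc get volumeattachment -o json" (the items list) and of the VMI's status.volumeStatus, and it assumes that the hotplugged volume appears in the VMI under the same name as the guest PV.

```python
from typing import Dict, List


def find_stale_attachments(
    volume_attachments: List[Dict],
    vmi_volume_status: List[Dict],
    node_name: str,
) -> List[str]:
    """Return PV names that the guest cluster reports as attached to
    node_name but that are missing from the VMI's status.volumeStatus."""
    # Names of the volumes the VMI actually has right now.
    present = {vol.get("name") for vol in vmi_volume_status}

    stale = []
    for va in volume_attachments:
        spec = va.get("spec", {})
        status = va.get("status", {})
        # Only consider attachments for the worker node backed by this VMI.
        if spec.get("nodeName") != node_name:
            continue
        # Only attachments the guest cluster believes are attached.
        if not status.get("attached", False):
            continue
        pv_name = spec.get("source", {}).get("persistentVolumeName")
        # Assumption: the hotplugged volume name equals the guest PV name.
        if pv_name and pv_name not in present:
            stale.append(pv_name)
    return stale


# Values taken from the outputs in this report.
attachments = [
    {
        "spec": {
            "attacher": "csi.kubevirt.io",
            "nodeName": "hostedcluster2-rlq9m-z2x88",
            "source": {
                "persistentVolumeName": "pvc-4e375fa9-c1ad-4fa6-a254-03d4c3b1111b"
            },
        },
        "status": {"attached": True},
    }
]
volume_status = [
    {"name": "cloudinitvolume", "target": "vdb"},
    {"name": "rhcos", "target": "vda"},
]

print(find_stale_attachments(attachments, volume_status, "hostedcluster2-rlq9m-z2x88"))
```

With the data from this report the function returns the single PV pvc-4e375fa9-c1ad-4fa6-a254-03d4c3b1111b, which is the volume that would need to be hotplugged again after the reboot.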
Version-Release number of selected component (if applicable):
OCP 4.17.3, CNV 4.17.0, MCE 2.7.0
How reproducible:
Always
Steps to Reproduce:
1. Have a pod running with a PV from kubevirt-csi in the guest cluster
2. Shut down the Worker VM running the Pod and start it again
Actual results:
Workloads fail to start after VM reboot
Expected results:
Hotplug the disk again and let workloads start
Additional info:
- blocks
  - OCPBUGS-44623 VolumeAttachment does not reconcile on worker VM reboot (POST)
  - OCPBUGS-44624 VolumeAttachment does not reconcile on worker VM reboot (Closed)
- is cloned by
  - OCPBUGS-44625 VolumeAttachment does not reconcile on worker VM reboot 4.19 (New)
  - OCPBUGS-44622 VolumeAttachment does not reconcile on worker VM reboot (POST)
  - OCPBUGS-44623 VolumeAttachment does not reconcile on worker VM reboot (POST)
  - OCPBUGS-44624 VolumeAttachment does not reconcile on worker VM reboot (Closed)