-
Bug
-
Resolution: Unresolved
-
Critical
-
CNV v4.16.1
-
5
-
False
-
-
False
-
CNV v4.99.0.rhel9-1214, CNV v4.16.3.rhel9-87, CNV v4.15.7.rhel9-21
-
---
-
---
-
-
Storage Core Sprint 259, Storage Core Sprint 261
-
Urgent
-
None
Description of problem:
The following sequence of events results in failure:
1. Have some template/golden image as a PVC in namespace X
2. Allow cloning across namespaces (see additional info)
3. Create a new VM in namespace Y, using the "Clone PVC" option with the image from namespace X (step 1)
4. Snapshot this VM
5. Delete namespace X
6. Restore the snapshot of the VM
The VirtualMachineRestore gets stuck, as it can no longer create the DataVolume.
Version-Release number of selected component (if applicable):
4.16.1
How reproducible:
Always
Steps to Reproduce:
1. Create an empty PVC named my-disk in a namespace called my-images:

$ cat disk.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: my-disk
  namespace: my-images
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: lvms-ssd

$ oc get pvc -n my-images
NAME      STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   VOLUMEATTRIBUTESCLASS   AGE
my-disk   Bound    pvc-839d1030-c5a7-4ee5-9dd3-4b2018cdcd1a   10Gi       RWO            lvms-ssd       <unset>                 57s

2. Ensure you can clone from my-images to another namespace where VMs are created
3. In the namespace where VMs are created (not my-images), create a new VM from a template (I used RHEL8, but the template is not relevant):
   Disk Source: PVC (clone PVC)
   PVC Project: my-images
   PVC Name: my-disk
The DV will look like this (I created mine in the homelab namespace):

spec:
  preallocation: false
  source:
    pvc:
      name: my-disk
      namespace: my-images
  storage:
    resources:
      requests:
        storage: 10Gi
    storageClassName: lvms-ssd

4. Ensure the VM was created fine
5. In the Web Console, create a VM snapshot of the new VM:

$ oc get vmsnapshot
NAME                         SOURCEKIND       SOURCENAME          PHASE       READYTOUSE   CREATIONTIME   ERROR
snapshot-cyan-cockroach-53   VirtualMachine   rhel8-aqua-asp-20   Succeeded   true         4s

6. Now try to restore that snapshot, also in the Web Console
7. All works
8. Now delete the original my-images/my-disk (it's not really needed; the VM is a clone of it):

$ oc delete pvc -n my-images my-disk
persistentvolumeclaim "my-disk" deleted
$ oc delete project my-images
project.project.openshift.io "my-images" deleted

9. Try to restore the snapshot again; it gets stuck here:

$ oc get virtualmachinerestore resotre-snapshot-cyan-cockroach-53-1724733464456 -o yaml
apiVersion: snapshot.kubevirt.io/v1alpha1
kind: VirtualMachineRestore
metadata:
  creationTimestamp: "2024-08-27T04:37:45Z"
  generation: 5
  name: resotre-snapshot-cyan-cockroach-53-1724733464456
  namespace: homelab
  ownerReferences:
  - apiVersion: kubevirt.io/v1
    blockOwnerDeletion: false
    kind: VirtualMachine
    name: rhel8-aqua-asp-20
    uid: 7b75bc2b-e13a-455e-8d9a-5abceb3c957d
  resourceVersion: "37572385"
  uid: d0468b34-488e-4404-8623-906f28d0f7d0
spec:
  target:
    apiGroup: kubevirt.io
    kind: VirtualMachine
    name: rhel8-aqua-asp-20
  virtualMachineSnapshotName: snapshot-cyan-cockroach-53
status:
  complete: false
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-08-27T04:37:45Z"
    reason: 'admission webhook "virtualmachine-validator.kubevirt.io" denied the request: namespace my-images does not exist'
    status: "False"
    type: Progressing
  - lastProbeTime: null
    lastTransitionTime: "2024-08-27T04:37:45Z"
    reason: 'admission webhook "virtualmachine-validator.kubevirt.io" denied the request: namespace my-images does not exist'
    status: "False"
    type: Ready
  deletedDataVolumes:
  - restore-f5b3cb99-ed07-4597-b875-25fdbfbcd79b-disk-chocolate-pelican-74
  restores:
  - dataVolumeName: restore-d0468b34-488e-4404-8623-906f28d0f7d0-disk-chocolate-pelican-74
    persistentVolumeClaim: restore-d0468b34-488e-4404-8623-906f28d0f7d0-disk-chocolate-pelican-74
    volumeName: disk-chocolate-pelican-74
    volumeSnapshotName: vmsnapshot-4e5c7c55-ca8e-4a31-ae74-fc25dec54073-volume-disk-chocolate-pelican-74

The user is now unable to restore the VM from a backup, because the original (now unrelated) namespace/PVC no longer exists.
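For step 2 ("Ensure you can clone from my-images to another namespace"), the linked documentation in the additional info describes granting cross-namespace clone permissions via RBAC. A minimal sketch, assuming the `datavolume-cloner` ClusterRole example from that documentation and the namespaces used in this report (the subject and binding names here are illustrative):

```yaml
# Sketch of cross-namespace clone RBAC; names below are illustrative assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: datavolume-cloner
rules:
- apiGroups: ["cdi.kubevirt.io"]
  resources: ["datavolumes/source"]
  verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: allow-clone-to-homelab
  namespace: my-images          # source namespace holding the golden image
subjects:
- kind: ServiceAccount
  name: default
  namespace: homelab            # destination namespace where VMs are created
roleRef:
  kind: ClusterRole
  name: datavolume-cloner
  apiGroup: rbac.authorization.k8s.io
```

The RoleBinding lives in the source namespace and grants the destination namespace's service account permission to use PVCs there as a clone source.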
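To confirm where the stale reference comes from, one can inspect the VirtualMachineSnapshotContent, which (as I understand the snapshot flow) embeds the VM spec including the DataVolume template with its original clone source. A sketch, assuming the snapshot above in the homelab namespace:

```shell
# Sketch: list the snapshot content objects, then look for the embedded
# clone-source reference to the now-deleted namespace.
oc get vmsnapshotcontent -n homelab
oc get vmsnapshotcontent -n homelab -o yaml | grep -B2 -A2 'namespace: my-images'
```

If the deleted namespace shows up in the embedded DataVolume template, that explains why the virtualmachine-validator webhook rejects the restored VM.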
Actual results:
Unable to restore snapshot of the VM using VirtualMachineRestore
Expected results:
Able to restore snapshot of VM
Additional info:
https://docs.openshift.com/container-platform/4.16/virt/storage/virt-enabling-user-permissions-to-clone-datavolumes.html
The customer (on 4.15) got a different error, but at the exact same place, as if the source is the problem:
Failed to create restore DataVolume: admission webhook "datavolume-validate.cdi.kubevirt.io" denied the request: Data volume should have either Source or SourceRef, or be externally populated
It also failed at the same step of creating the restore DV. They are on 4.15; the reproduction above is on the latest 4.16.1. This should work without a manual restore, as customers may need to urgently roll back their VMs.
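Since the webhook only complains that the namespace does not exist, recreating the deleted source namespace may unblock the restore as a stopgap. This is an assumption based on the error message above, not a verified procedure, and it does not fix the underlying bug:

```shell
# Hypothetical workaround sketch: recreate the namespace the webhook rejects.
# The namespace only needs to exist; the original PVC is presumably not required,
# since the restore is served from the VolumeSnapshot.
oc new-project my-images
# Then retry the restore (re-create the VirtualMachineRestore or use the Web Console).
```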
- is cloned by
-
CNV-48692 [4.17] Unable to restore a snapshot if the original DataVolume clone source is from a namespace/pvc that was deleted
- Verified
- is related to
-
CNV-47106 Unable to start VM after stuck/failed VirtualMachineRestore
- ON_QA
-
CNV-48787 [4.17] Unable to start VM after stuck/failed VirtualMachineRestore
- ON_QA
-
CNV-48788 [4.15] Unable to start VM after stuck/failed VirtualMachineRestore
- ON_QA
- links to
-
RHEA-2024:139317 OpenShift Virtualization 4.16.4 Images
- mentioned on