OpenShift Virtualization / CNV-23028

[2149654] [4.12] VMSnapshot and WaitForFirstConsumer storage: VMRestore is not Complete


Details

    • Status: CLOSED
    • Release Notes
      Cause: Restoring a VMSnapshot with storage that uses the WaitForFirstConsumer volume binding mode.

      Consequence: The restored PVCs remain in the Pending state and the restore operation appears to be stuck.

      Workaround (if any): Start the restored VM, stop it, and then start it again.

      Result: This causes the VM to be scheduled, the PVCs become Bound, and the VMRestore operation completes.
    • Release Note Type: Known Issue
    • Done
    • Sprint: Storage Core Sprint 233, Storage Core Sprint 234, Storage Core Sprint 235
    • High

    Description

      Description of problem:
      VMRestore doesn't reach the Complete state:
      the restored DV stays in WaitForFirstConsumer,
      the restored PVC is Pending,
      and the restored VM is Stopped and not Ready.
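
      The stuck resources can be observed with, for example:

      $ oc get vmrestore,dv,pvc,vm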

      Version-Release number of selected component (if applicable):
      4.12

      How reproducible:
      Always, on an SNO cluster with snapshot-capable storage that uses the WaitForFirstConsumer volumeBindingMode (TopoLVM storage in our case: odf-lvm-vg1)
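
      For reference, a minimal sketch of such a storage class (the actual odf-lvm-vg1 class is created by the operator; the provisioner name here is an assumption):

      apiVersion: storage.k8s.io/v1
      kind: StorageClass
      metadata:
        name: odf-lvm-vg1
      provisioner: topolvm.io   # assumption: TopoLVM CSI provisioner name
      volumeBindingMode: WaitForFirstConsumer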

      Steps to Reproduce:
      1. Create a VM - VM is Running
      2. Create a VMSnapshot - VMSnapshot is ReadyToUse
      3. Create a VMRestore
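
      For example, with the manifests from the Additional info section below:

      $ oc apply -f vm.yaml          # wait until the VM is Running
      $ oc apply -f snap.yaml        # wait until the VMSnapshot is ReadyToUse
      $ oc apply -f vmrestore.yaml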

      Actual results:
      VMRestore is not Complete

      $ oc get vmrestore
      NAME            TARGETKIND       TARGETNAME    COMPLETE   RESTORETIME   ERROR
      restore-my-vm   VirtualMachine   vm-restored   false  

      Expected results:
      VMRestore is Complete (PVC Bound, DV Succeeded and garbage collected)

      Workaround and ONE MORE ISSUE:
      1. Start the restored VM
      2. See the VM is Ready and Running, DV succeeded, PVC Bound
      3. See the VMRestore is still not Complete:

      $ oc get vmrestore
      NAME            TARGETKIND       TARGETNAME    COMPLETE   RESTORETIME   ERROR
      restore-my-vm   VirtualMachine   vm-restored   false  

      $ oc describe vmrestore restore-my-vm | grep Events -A 10
      Events:
        Type     Reason                      Age                    From                Message
        ----     ------                      ----                   ----                -------
        Warning  VirtualMachineRestoreError  4m4s (x23 over 4m21s)  restore-controller  VirtualMachineRestore encountered error invalid RunStrategy "Always"

      4. See the restored VM runStrategy:
      $ oc get vm vm-restored -oyaml | grep running
          running: true

      ***
      PLEASE NOTE that the restored VM on OCS with Immediate volumeBindingMode on a multi-node cluster gets "running: false", even though the source VM had it set to "true"; there we do not get the above error, and the VMRestore becomes Complete:
      $ oc get vm vm-restored-ocs -oyaml | grep running
        running: false
      ***
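
      The volumeBindingMode of a storage class can be checked with:
      $ oc get sc odf-lvm-vg1 -o jsonpath='{.volumeBindingMode}{"\n"}'
      WaitForFirstConsumer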

      5. Stop the restored VM
      6. See the VMRestore is Complete:
      $ oc get vmrestore
      NAME            TARGETKIND       TARGETNAME    COMPLETE   RESTORETIME   ERROR
      restore-my-vm   VirtualMachine   vm-restored   true       1s            
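
      For reference, the workaround steps as commands (a sketch, assuming the virtctl client is available):
      $ virtctl start vm-restored   # VM gets scheduled, PVC becomes Bound, DV succeeds
      $ virtctl stop vm-restored    # VMRestore then completes
      $ oc get vmrestore restore-my-vm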

      Additional info:

      VM yaml: 

      $ cat vm.yaml
      apiVersion: kubevirt.io/v1alpha3
      kind: VirtualMachine
      metadata:
        name: vm-cirros-source
        labels:
          kubevirt.io/vm: vm-cirros-source
      spec:
        dataVolumeTemplates:
        - metadata:
            name: cirros-dv-source
          spec:
            storage:
              resources:
                requests:
                  storage: 1Gi
              storageClassName: odf-lvm-vg1
            source:
              http:
                url: <cirros-0.4.0-x86_64-disk.qcow2>
        running: true
        template:
          metadata:
            labels:
              kubevirt.io/vm: vm-cirros-source
          spec:
            domain:
              devices:
                disks:
                - disk:
                    bus: virtio
                  name: datavolumev
              machine:
                type: ""
              resources:
                requests:
                  memory: 100M
            terminationGracePeriodSeconds: 0
            volumes:
            - dataVolume:
                name: cirros-dv-source
              name: datavolumev

      VMSnapshot yaml:

      $ cat snap.yaml
      apiVersion: snapshot.kubevirt.io/v1alpha1
      kind: VirtualMachineSnapshot
      metadata:
        name: my-vmsnapshot
      spec:
        source:
          apiGroup: kubevirt.io
          kind: VirtualMachine
          name: vm-cirros-source

      VMRestore yaml:

      $ cat vmrestore.yaml
      apiVersion: snapshot.kubevirt.io/v1alpha1
      kind: VirtualMachineRestore
      metadata:
        name: restore-my-vm
      spec:
        target:
          apiGroup: kubevirt.io
          kind: VirtualMachine
          name: vm-restored
        virtualMachineSnapshotName: my-vmsnapshot

            People

              Shelly Kagan <skagan@redhat.com>
              Jenia Peimer <jpeimer@redhat.com>