OpenShift Virtualization / CNV-61182

Live migration of a VM with RWO backend storage does not update VMI.status.volumeStatus and breaks on the 3rd migration


    • Customer Reported

      Description of problem:

      A VM with RWO backend storage cannot be migrated more than a couple of times.

      Version-Release number of selected component (if applicable):

      4.18.2

      How reproducible:

      Always

      Steps to Reproduce:

      1. Initial state: the VM is running with RWO backend storage for its persistent state.

      # oc get pvc | grep persistent-state-for-windows-2022-1
      persistent-state-for-windows-2022-1-dpvc6             Bound    pvc-fcb456f6-c357-4685-9b63-730ffb9ace53   10Mi       RWO            ocs-storagecluster-ceph-rbd-virtualization   <unset>                 4m13s
      # oc get vmi
      NAME             AGE   PHASE     IP             NODENAME        READY
      windows-2022-1   44s   Running   192.168.1.21   red.home.arpa   True

      2. Migrate it the first time:

      A second PVC is created, and the persistent state is copied to the destination as part of the live migration:

      # oc get pvc | grep persistent-state-for-windows-2022-1
      persistent-state-for-windows-2022-1-dpvc6             Bound    pvc-fcb456f6-c357-4685-9b63-730ffb9ace53   10Mi       RWO            ocs-storagecluster-ceph-rbd-virtualization   <unset>                 5m29s
      persistent-state-for-windows-2022-1-np2c5             Bound    pvc-f143809f-006c-408b-bb73-f151545c5ca7   10Mi       RWO            ocs-storagecluster-ceph-rbd-virtualization   <unset>                 20s

      3. The successful migration's status shows the source and target backend-storage PVCs (a small Go sketch of reading these fields follows the snippet below):

          sourcePersistentStatePVCName: persistent-state-for-windows-2022-1-dpvc6
          ...
          targetPersistentStatePVCName: persistent-state-for-windows-2022-1-np2c5
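
      A tiny Go sketch of reading those two fields programmatically, assuming they are the sourcePersistentStatePVCName/targetPersistentStatePVCName fields of the VMI's migrationState as the output suggests (illustrative only, not product code):

      package main

      import (
          "fmt"

          v1 "kubevirt.io/api/core/v1"
      )

      // printPersistentStatePVCs prints the backend-storage PVCs recorded for the
      // last migration, mirroring the two fields quoted in step 3.
      func printPersistentStatePVCs(vmi *v1.VirtualMachineInstance) {
          ms := vmi.Status.MigrationState
          if ms == nil {
              return
          }
          fmt.Println("source:", ms.SourcePersistentStatePVCName)
          fmt.Println("target:", ms.TargetPersistentStatePVCName)
      }

      func main() {
          vmi := &v1.VirtualMachineInstance{}
          vmi.Status.MigrationState = &v1.VirtualMachineInstanceMigrationState{
              SourcePersistentStatePVCName: "persistent-state-for-windows-2022-1-dpvc6",
              TargetPersistentStatePVCName: "persistent-state-for-windows-2022-1-np2c5",
          }
          printPersistentStatePVCs(vmi)
      }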

      4. All is well up to this point. Note, however, that the VMI status still shows the old PVC for the backend storage; it was not updated (a sketch of the missing status update follows the output below):

        volumeStatus:
        - name: persistent-state-for-windows-2022-1-dpvc6
          persistentVolumeClaimInfo:
            accessModes:
            - ReadWriteOnce
            claimName: persistent-state-for-windows-2022-1-dpvc6
          target: ""
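
      For illustration only, a minimal Go sketch of the kind of volumeStatus update that appears to be missing here. It uses the kubevirt.io/api types; the helper name and the exact shape of the update are assumptions, not the actual controller code:

      package main

      import (
          "fmt"

          v1 "kubevirt.io/api/core/v1"
      )

      // updateBackendStoragePVC is a hypothetical helper: after a successful
      // migration, the backend-storage entry in VMI.status.volumeStatus should
      // reference the target PVC instead of the soon-to-be-deleted source PVC.
      func updateBackendStoragePVC(vmi *v1.VirtualMachineInstance, sourcePVC, targetPVC string) {
          for i := range vmi.Status.VolumeStatus {
              vs := &vmi.Status.VolumeStatus[i]
              if vs.PersistentVolumeClaimInfo == nil || vs.PersistentVolumeClaimInfo.ClaimName != sourcePVC {
                  continue
              }
              // The backend-storage entry is named after its claim, so update both.
              vs.Name = targetPVC
              vs.PersistentVolumeClaimInfo.ClaimName = targetPVC
          }
      }

      func main() {
          vmi := &v1.VirtualMachineInstance{}
          vmi.Status.VolumeStatus = []v1.VolumeStatus{{
              Name: "persistent-state-for-windows-2022-1-dpvc6",
              PersistentVolumeClaimInfo: &v1.PersistentVolumeClaimInfo{
                  ClaimName: "persistent-state-for-windows-2022-1-dpvc6",
              },
          }}
          updateBackendStoragePVC(vmi,
              "persistent-state-for-windows-2022-1-dpvc6",
              "persistent-state-for-windows-2022-1-np2c5")
          fmt.Println(vmi.Status.VolumeStatus[0].PersistentVolumeClaimInfo.ClaimName)
      }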

      5. Migrate again

      # virtctl migrate windows-2022-1
      VM windows-2022-1 was scheduled to migrate

      6. We are now up to 3 PVCs, as the persistent state was copied again to a new RWO volume (-dc4ts):

      # oc get pvc | grep persistent-state-for-windows-2022-1
      persistent-state-for-windows-2022-1-dc4ts             Bound         pvc-1f69a341-b64a-4d91-aa13-80cf72839e93   10Mi       RWO            ocs-storagecluster-ceph-rbd-virtualization   <unset>                 21s
      persistent-state-for-windows-2022-1-dpvc6             Terminating   pvc-fcb456f6-c357-4685-9b63-730ffb9ace53   10Mi       RWO            ocs-storagecluster-ceph-rbd-virtualization   <unset>                 7m55s
      persistent-state-for-windows-2022-1-np2c5             Bound         pvc-f143809f-006c-408b-bb73-f151545c5ca7   10Mi       RWO            ocs-storagecluster-ceph-rbd-virtualization   <unset>                 2m46s

      And the migration status shows it:

          sourcePersistentStatePVCName: persistent-state-for-windows-2022-1-dpvc6
      ...
          targetPersistentStatePVCName: persistent-state-for-windows-2022-1-dc4ts

      7. But -dpvc6 is now in Terminating state, the backend storage is gone from the VMI's volumeStatus, and only the rootdisk is left:

      # oc get vmi windows-2022-1 -o yaml | yq '.status.volumeStatus'
      - name: rootdisk
        persistentVolumeClaimInfo:
          accessModes:
            - ReadWriteMany
          capacity:
            storage: 60Gi
          claimName: windows-2022-1
          filesystemOverhead: "0"
          requests:
            storage: "64424509440"
          volumeMode: Block
        target: vda

      8. The next migration does not work:

      # virtctl migrate windows-2022-1
      VM windows-2022-1 was scheduled to migrate
      {"component":"virt-controller","kind":"","level":"info","msg":"expanding pdb for VMI homelab/windows-2022-1 to protect migration kubevirt-migrate-vm-gwqsd","name":"windows-2022-1","namespace":"homelab","pos":"migration.go:812","timestamp":"2025-05-07T04:09:47.704101Z","uid":"1b5f01b8-1dc5-4f91-a2c4-5a721bf2ee0e"}
      {"component":"virt-controller","level":"info","msg":"reenqueuing Migration homelab/kubevirt-migrate-vm-gwqsd","pos":"migration.go:249","reason":"no backend-storage PVC found in VMI volume status","timestamp":"2025-05-07T04:09:47.708983Z"}
      {"component":"virt-controller","level":"info","msg":"reenqueuing Migration homelab/kubevirt-migrate-vm-gwqsd","pos":"migration.go:249","reason":"no backend-storage PVC found in VMI volume status","timestamp":"2025-05-07T04:09:47.714636Z"}
      {"component":"virt-controller","level":"info","msg":"reenqueuing Migration homelab/kubevirt-migrate-vm-gwqsd","pos":"migration.go:249","reason":"no backend-storage PVC found in VMI volume status","timestamp":"2025-05-07T04:09:47.724859Z"}
      {"component":"virt-controller","level":"info","msg":"reenqueuing Migration homelab/kubevirt-migrate-vm-gwqsd","pos":"migration.go:249","reason":"no backend-storage PVC found in VMI volume status","timestamp":"2025-05-07T04:09:47.745039Z"}
      {"component":"virt-controller","level":"info","msg":"reenqueuing Migration homelab/kubevirt-migrate-vm-gwqsd","pos":"migration.go:249","reason":"no backend-storage PVC found in VMI volume status","timestamp":"2025-05-07T04:09:47.785289Z"}
      {"component":"virt-controller","level":"info","msg":"reenqueuing Migration homelab/kubevirt-migrate-vm-gwqsd","pos":"migration.go:249","reason":"no backend-storage PVC found in VMI volume status","timestamp":"2025-05-07T04:09:47.867338Z"}
      {"component":"virt-controller","level":"info","msg":"reenqueuing Migration homelab/kubevirt-migrate-vm-gwqsd","pos":"migration.go:249","reason":"no backend-storage PVC found in VMI volume status","timestamp":"2025-05-07T04:09:48.028400Z"}
      
      

      The migration controller can't find the backend storage because that volume is now missing from VMI.status.volumeStatus: CurrentPVCName() iterates over the volume statuses there to find a match, so it comes up empty (a paraphrase of that lookup follows the link below).

      https://github.com/kubevirt/kubevirt/blob/main/pkg/storage/backend-storage/backend-storage.go#L261
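
      For context, a rough Go paraphrase of that lookup (not the verbatim upstream code; the prefix constant and function signature are approximated from the linked file). Once the backend-storage entry is gone from volumeStatus, it returns an empty name and the migration controller keeps re-enqueuing with "no backend-storage PVC found in VMI volume status":

      package backendstorage

      import (
          "strings"

          v1 "kubevirt.io/api/core/v1"
      )

      // Approximate prefix used to name backend-storage PVCs, e.g.
      // "persistent-state-for-windows-2022-1-dpvc6" (illustrative constant).
      const persistentStatePrefix = "persistent-state-for-"

      // currentPVCName approximates CurrentPVCName() from
      // pkg/storage/backend-storage/backend-storage.go: the backend-storage PVC
      // is located only by scanning the VMI's volumeStatus entries.
      func currentPVCName(vmi *v1.VirtualMachineInstance) string {
          for _, vs := range vmi.Status.VolumeStatus {
              if vs.PersistentVolumeClaimInfo == nil {
                  continue
              }
              if strings.HasPrefix(vs.PersistentVolumeClaimInfo.ClaimName, persistentStatePrefix) {
                  return vs.PersistentVolumeClaimInfo.ClaimName
              }
          }
          // With the entry dropped from volumeStatus (step 7), there is nothing
          // left to match, so the caller sees an empty PVC name.
          return ""
      }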

              Assignee: Jed Lejosne (jelejosne)
              Reporter: Germano Veit Michel (rhn-support-gveitmic)
              QA Contact: Kedar Bidarkar