RHEL-145179

Libvirt operations fail after destroying a VM at a specific time during postcopy migration


    • Bug
    • Resolution: Unresolved
    • rhel-9.8
    • Moderate
    • rhel-virt-core-libvirt-1
    • Libvirt Bugs already in Sprint

      What were you trying to do that didn't work?

      Start a VM locally after it was stopped (destroyed) shortly after switching to postcopy.

      What is the impact of this issue to you?

      Low: the test case fails, but the VM can be started eventually.

      Please provide the package NVR for which the bug is seen:

      libvirt-11.10.0-3.el9

      How reproducible is this bug?:

      100%

      Steps to reproduce

      1. Set up shared storage live migration
      2. Start migration
         virsh migrate --live --p2p --verbose --domain avocado-vt-vm1 --desturi qemu+tcp://10.0.160.70/system --bandwidth 10 --postcopy-bandwidth 10 --postcopy
      3. Switch to postcopy, wait shortly and destroy the VM
        virsh migrate-postcopy avocado-vt-vm1; sleep 0.5; virsh destroy avocado-vt-vm1; virsh start avocado-vt-vm1
      4. Try to start the VM
        virsh start avocado-vt-vm1
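
The steps above can be sketched as a single shell function. The virsh commands and flags are copied from the steps. Running the migration in the background, the `settle` wait before switching to postcopy, and the function and parameter names are assumptions made for illustration; the report does not say how the commands were run concurrently.

```shell
# Sketch of steps 2-4. Hypothetical parameters:
#   $1 domain name, $2 destination URI,
#   $3 seconds to let the migration get going (assumption, default 5),
#   $4 seconds between migrate-postcopy and destroy (0.5 in the report).
reproduce_postcopy_destroy() {
    local vm="$1"
    local desturi="$2"
    local settle="${3:-5}"
    local delay="${4:-0.5}"

    # Step 2: start the migration in the background (assumption).
    virsh migrate --live --p2p --verbose --domain "$vm" --desturi "$desturi" \
        --bandwidth 10 --postcopy-bandwidth 10 --postcopy &
    local migrate_pid=$!

    sleep "$settle"

    # Step 3: switch to postcopy, wait briefly, destroy the VM.
    virsh migrate-postcopy "$vm"
    sleep "$delay"
    virsh destroy "$vm"

    # Step 4: try to start the VM right away and remember the result.
    local start_rc=0
    virsh start "$vm" || start_rc=$?

    # Collect the exit status of the background migration.
    local migrate_rc=0
    wait "$migrate_pid" || migrate_rc=$?
    echo "migrate exit status: $migrate_rc, start exit status: $start_rc"

    return "$start_rc"
}
```

Example call: `reproduce_postcopy_destroy avocado-vt-vm1 qemu+tcp://10.0.160.70/system 5 0.5`. Keeping the delay as a parameter makes it easy to try other wait times.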

      Expected results

      The VM can be started

      Actual results

      The VM can't be started immediately, although virsh list confirms it is shut off. It becomes 'runnable' after a while without further intervention.

      error: Failed to start domain 'avocado-vt-vm1'
      error: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainMigratePerform3Params)
      
      # virsh list --all
       Id   Name             State
      ---------------------------------
       -    avocado-vt-vm1   shut off
      
      # virhs start avocado-vt-vm1
      -bash: virhs: command not found
      # virsh start avocado-vt-vm1
      error: Failed to start domain 'avocado-vt-vm1'
      error: Timed out during operation: cannot acquire state change lock (held by monitor=remoteDispatchDomainMigratePerform3Params)
      
      # virsh start avocado-vt-vm1
      Domain 'avocado-vt-vm1' started
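
Since the start eventually succeeds without intervention, a test or script could retry it. The following is a minimal sketch; the function name, the attempt limit and the interval are hypothetical, not values from this report.

```shell
# Retry `virsh start` until it succeeds or the attempts run out.
# Hypothetical parameters: $1 domain name, $2 max attempts (default 10),
# $3 seconds between attempts (default 5).
start_with_retry() {
    local vm="$1"
    local max_attempts="${2:-10}"
    local interval="${3:-5}"
    local attempt=1

    while [ "$attempt" -le "$max_attempts" ]; do
        if virsh start "$vm"; then
            echo "started after $attempt attempt(s)"
            return 0
        fi
        attempt=$((attempt + 1))
        sleep "$interval"
    done

    echo "giving up after $max_attempts attempt(s)" >&2
    return 1
}
```

The number of attempts printed on success gives a rough idea of how long the state change lock was held.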
      

      Additional information

      1. Caught by test case migration.async_ops.destroy_vm_during_finishphase.destroy_src_vm.with_postcopy.p2p on both x86_64 and s390x
      2. As a result of destroying the VM, the migration stops. This can happen in different ways:
        1. 'domain X not running' in stderr
        2. no specific error message in stderr; migrate returns 1 and stderr is just a truncated list of "Migration x %" messages
        3. job 'migration in' failed in post-copy phase
      3. The test case currently only considers the last of the above error exits
      4. The wait time of 0.5 seconds is really important; neither skipping the wait nor waiting for 1 second reproduced the issue for me
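
To cover all three error exits listed above, a test could classify the stderr of the failed migrate command. This is only a sketch: the match patterns are loose substrings taken from the messages as quoted in this report, not verified libvirt message strings, and the function name and labels are hypothetical.

```shell
# Classify the stderr text of a failed `virsh migrate` into one of the
# three failure modes described in the report.
# Hypothetical parameter: $1 the captured stderr text.
classify_migrate_failure() {
    local stderr_text="$1"
    case "$stderr_text" in
        *"not running"*)
            echo "domain-not-running" ;;
        *"failed in post-copy phase"*)
            echo "postcopy-job-failed" ;;
        *)
            # Only progress messages such as "Migration x %" were seen.
            echo "no-specific-error" ;;
    esac
}
```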

        1. virtqemud.log
          100 kB
          Jiri Denemark

              jdenemar@redhat.com Jiri Denemark
              smitterl@redhat.com Sebastian Mitterle
              virt-maint virt-maint
              Liping Cheng Liping Cheng