RHEL-73085

Libvirt reports that post-copy migration resume failed although it actually succeeded in some cases

    • libvirt-10.10.0-4.el9
    • No
    • Important
    • rhel-sst-virtualization
    • ssg_virtualization
    • 3
    • Dev ack
    • False
    • None
    • None
    • x86_64
    • None

      What were you trying to do that didn't work?

      Libvirt reports that post-copy migration resume failed although it actually succeeded in some cases

      What is the impact of this issue to you?

      Libvirt reports an error when resuming a post-copy migration in some cases

      Please provide the package NVR for which the bug is seen:

      libvirt-10.10.0-3.el9.x86_64
      qemu-kvm-9.1.0-5.el9.x86_64

      How reproducible is this bug?:

      100% in the automation environment

      Steps to reproduce

      1. Prepare a migration environment and an active VM
      # virsh start vm1
      Domain 'vm1' started
      
      2. Start a post-copy migration and abort it
      Terminal1:
      # virsh migrate --live --p2p --verbose --domain vm1 --desturi qemu+tcp://target_host/system  --postcopy --bandwidth 5 --tls
      
      Terminal2:
      virsh # migrate-postcopy vm1
      virsh # domjobabort vm1 --postcopy
      
      3. Resume the post-copy migration
      # virsh migrate --live --p2p --verbose --domain vm1 --desturi qemu+tcp://target_host/system  --postcopy --bandwidth 5 --tls --postcopy-resume
      error: operation failed: job 'migration in' failed in post-copy phase
      
      4. After a few minutes, the guest is running on the target host and shut off on the source host
      # virsh list --all
       Id   Name   State
      ----------------------
       2    vm1    running
      
      5. Check the virtqemud log on the target host; it shows that the post-copy resume actually succeeded
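
Steps 3 and 4 can be checked mechanically with a small polling helper. The following is only a sketch: `wait_for_state` and the `VIRSH` override are hypothetical names introduced here, and the timeout values are arbitrary. It relies on `virsh domstate` printing the domain state on its first output line.

```shell
#!/bin/sh
# Hypothetical helper, not part of libvirt: poll a domain's state until it
# matches the wanted state or the timeout expires.
# VIRSH may be overridden (for example with a stub) for dry runs.
VIRSH="${VIRSH:-virsh}"

# wait_for_state URI DOMAIN WANTED [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Returns 0 if the domain reached WANTED in time, 1 otherwise.
wait_for_state() {
    uri=$1 dom=$2 wanted=$3 timeout=${4:-300} interval=${5:-5}
    elapsed=0
    while :; do
        # `virsh domstate` prints the state (e.g. "running", "paused") first
        state=$($VIRSH -c "$uri" domstate "$dom" 2>/dev/null | head -n 1)
        [ "$state" = "$wanted" ] && return 0
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
}

# Example, run after the resume command in step 3 reported an error
# (target_host and vm1 are the placeholders used in the steps above):
#   wait_for_state qemu+tcp://target_host/system vm1 running 300 5 \
#       && echo "resume reported an error, but vm1 is running on the target"
```

If the helper returns 0 after the resume command failed, the run matches the false failure described in this report.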
      

      Expected results

      Libvirt should not report an error if the recovery succeeds.

      Actual results

      Resuming the post-copy migration returned the error "job 'migration in' failed in post-copy phase", but the migration was in fact recovered successfully.

      Additional info

      This issue does not happen in every test environment. I compared the "good" and "bad" environments and found that in the "bad" environment QEMU appears to send the postcopy-recover event later than in the "good" environment.
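
One way to compare the event timing between two environments is to pull the migration status changes out of each virtqemud log. The sketch below makes several assumptions: libvirt logging is configured so that QEMU monitor events are written to a file, each log line starts with a libvirt timestamp followed by `: `, and events appear as JSON containing `"event": "MIGRATION"` with a `"status"` field. `migration_events` is a hypothetical helper name, and the log path in the usage comment is an assumption as well.

```shell
#!/bin/sh
# Hypothetical helper: print "<timestamp> <status>" for every QEMU MIGRATION
# event found in a virtqemud log, in the order the events were logged.
# Assumes lines start with a timestamp such as "2024-12-20 08:00:01.100+0000: ".
migration_events() {
    grep '"event": *"MIGRATION"' "$1" |
        sed -n 's/^\([^ ]* [^ ]*\): .*"status": *"\([^"]*\)".*/\1 \2/p'
}

# Example (the log path depends on the log_outputs setting and is assumed here):
#   migration_events /var/log/libvirt/virtqemud.log | grep postcopy
```

Running this against the logs from a "good" and a "bad" environment gives two short timelines in which the position of the postcopy-recover status can be compared directly.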

              jdenemar@redhat.com Jiri Denemark
              rhn-support-lhuang Luyao Huang