• qemu-kvm-8.2.0-3.el9
    • None
    • None
    • rhel-sst-virtualization
    • ssg_virtualization
    • 19
    • 23
    • None
    • False
    • None
    • None
    • If docs needed, set a value
    • None

      Description of problem:
      QEMU crashes when migrating a guest with blob resources enabled.

      Version-Release number of selected component (if applicable):
      libvirt-9.5.0-6.el9.x86_64
      qemu-kvm-8.0.0-13.el9.x86_64

      How reproducible:
      100%

      Steps to Reproduce:
      1. prepare a guest with the following xml snippet

      # virsh dumpxml lizhu --xpath //video
        <video>
        <model type="virtio" heads="1" primary="yes" blob="on"/>
        <alias name="video0"/>
        <address type="pci" domain="0x0000" bus="0x00" slot="0x01" function="0x0"/>
        </video>

      2. migrate the guest to another host

      # virsh migrate lizhu qemu+ssh://$target_host/system --verbose --live
        Migration: [98.32 %]error: operation failed: domain is not running

      3. check the qemu log on target host
      # cat /var/log/libvirt/qemu/lizhu.log
      ...
      2023-08-30 14:35:30.653+0000: Domain id=1 is tainted: host-cpu
      char device redirected to /dev/pts/2 (label charserial0)
      2023-08-30T14:35:37.352236Z qemu-kvm: Failed to load virtio-gpu:virtio-gpu
      2023-08-30T14:35:37.352266Z qemu-kvm: error while loading state for
      instance 0x0 of device '0000:00:01.0/virtio-gpu'
      2023-08-30T14:35:37.352442Z qemu-kvm: load of migration failed: Invalid argument
      2023-08-30 14:35:37.754+0000: shutting down, reason=crashed

      4. check the qemu log on source host
      # cat /var/log/libvirt/qemu/lizhu.log
      ...
      2023-08-30 14:35:30.974+0000: initiating migration
      2023-08-30 14:35:37.789+0000: shutting down, reason=crashed

      5. check the guest states on source host

      # virsh domstate lizhu --reason
        shut off (unknown)

      Actual results:
      Migration failed and QEMU crashed.

      Expected results:
      QEMU should not crash.
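
The target-host log in step 3 can be triaged mechanically. The following is a minimal sketch, not part of the original report: the function name `summarize_migration_failure` and both regular expressions are illustrative assumptions built only from the log lines quoted above, and the sample joins the wrapped "error while loading state" line into one line.

```python
import re

# Patterns derived from the target-host log quoted in the report (assumed shapes).
LOAD_FAILURE = re.compile(r"qemu-kvm: Failed to load (?P<section>\S+)")
MIGRATION_FAILED = re.compile(r"qemu-kvm: load of migration failed: (?P<reason>.+)")


def summarize_migration_failure(log_text):
    """Return (section, reason) for a failed incoming migration.

    Either element is None when the corresponding line is absent.
    """
    section = None
    reason = None
    for line in log_text.splitlines():
        match = LOAD_FAILURE.search(line)
        if match:
            section = match.group("section")
        match = MIGRATION_FAILED.search(line)
        if match:
            reason = match.group("reason").strip()
    return section, reason


# Log excerpt copied from the report, with the wrapped line joined.
SAMPLE_LOG = """\
2023-08-30T14:35:37.352236Z qemu-kvm: Failed to load virtio-gpu:virtio-gpu
2023-08-30T14:35:37.352266Z qemu-kvm: error while loading state for instance 0x0 of device '0000:00:01.0/virtio-gpu'
2023-08-30T14:35:37.352442Z qemu-kvm: load of migration failed: Invalid argument
2023-08-30 14:35:37.754+0000: shutting down, reason=crashed
"""

if __name__ == "__main__":
    print(summarize_migration_failure(SAMPLE_LOG))
```

Run against the excerpt, the sketch reports the failing migration section and the error reason, which is enough to tell this failure apart from unrelated shutdowns in the same log.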

            [RHEL-7565] qemu crashed when migrate guest with blob resources enabled

            Errata Tool added a comment -

            Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

            For information on the advisory (Moderate: qemu-kvm security update), and where to find the updated files, follow the link below.

            If the solution does not work for you, open a new bug report.
            https://access.redhat.com/errata/RHSA-2024:2135


            Zhiyi Guo added a comment -

            Simply verified the latest behavior: virtio-vga/gpu with blob enabled is not migratable.

            step1:

            Boot a VM with blob enabled:

             

            ...  
            <memoryBacking>
              <source type='memfd'/>
            </memoryBacking>
            ...
            <video>
              <model type='virtio' heads='1' primary='yes' blob="on"/>
            </video>
            ...

             

            step2:

            Save the VM's state to a file:

             

            # virsh save rhel current.stat 

            After step 2, virsh reports:

            error: Failed to save domain 'rhel' to current.stat
            error: Requested operation is not valid: cannot migrate domain: virtio-gpu blob VMs are currently not migratable.
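
Since save and migrate are now refused for blob-enabled guests, it can help to check a domain's XML before attempting either operation. The following is a minimal sketch under stated assumptions: the helper name `has_blob_video` is hypothetical, and the `<domain>` and `<devices>` wrappers in the sample are filled in only so that the elided snippet from step 1 parses as a complete document.

```python
import xml.etree.ElementTree as ET


def has_blob_video(domain_xml):
    """Return True if any <video><model> has type='virtio' and blob='on'."""
    root = ET.fromstring(domain_xml)
    # iter() also matches when the root element itself is <video>,
    # as in the output of `virsh dumpxml <domain> --xpath //video`.
    for video in root.iter("video"):
        for model in video.findall("model"):
            if model.get("type") == "virtio" and model.get("blob") == "on":
                return True
    return False


# Sample based on the snippet in step 1; wrapper elements are assumptions.
SAMPLE_XML = """\
<domain type='kvm'>
  <memoryBacking>
    <source type='memfd'/>
  </memoryBacking>
  <devices>
    <video>
      <model type='virtio' heads='1' primary='yes' blob="on"/>
    </video>
  </devices>
</domain>
"""

if __name__ == "__main__":
    if has_blob_video(SAMPLE_XML):
        print("blob is enabled: expect save/migrate to be refused")
```

The input would typically be the output of `virsh dumpxml` for the guest. The check only inspects the XML and does not query QEMU, so it predicts the refusal rather than confirming it.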

             


            Miroslav Rezanina added a comment -

            Fix included in qemu-kvm-8.2.0-3.el9

            Fixed by merge request 'virtio-gpu: block migration of VMs with blob=true' (https://gitlab.com/redhat/centos-stream/src/qemu-kvm/-/merge_requests/217)

            gitlab-bot added a comment -

            Miroslav Rezanina mentioned this issue in a merge request of Red Hat / centos-stream / rpms / qemu-kvm on branch next:

            Update to qemu-kvm-8.2.0-3.el9


            Marc-Andre Lureau added a comment -

            Upstream patches for 9.0: "[PATCH 0/2] virtio-gpu: fix blob scanout post-load". They are most likely not good candidates for backport, given that they change the migration stream version.

            Marc-Andre Lureau added a comment -

            rhn-support-zhguo: as long as support for blob migration isn't a customer requirement, I think this is the best thing to do. I am going to send a patch upstream, but it will bump the migration stream version, so it is not a suitable change for a stable version.

            Zhiyi Guo added a comment -

            Sigh, this is going to be a bit more difficult to handle. Maybe we should just revert the support for migration at this point in RHEL (revert commit 10b9ddbc83b94986cbdf989e26fb7269fb2e9f72).

            I do agree that avoiding the crash is the best option for RHEL 9.4. If you choose this path, I can modify the title of this Jira issue and open a new issue to track the live migration capability. WDYT?

            Marc-Andre Lureau added a comment -

            I didn't realize blob resources could be used for scanouts, and didn't reach that code path during testing.

            Sigh, this is going to be a bit more difficult to handle. Maybe we should just revert the support for migration at this point in RHEL (revert commit 10b9ddbc83b94986cbdf989e26fb7269fb2e9f72).

            Zhiyi Guo added a comment -

            The pixman version we have for RHEL 9.4 is pixman-0.40.0-6.el9.x86_64.

            Zhiyi Guo added a comment -

            Please check?


              mlureau Marc-Andre Lureau
              rhn-support-lizhu Lili Zhu
              Zhiyi Guo Zhiyi Guo