RHEL-7098

[vfio migration] A Q35 + OVMF VM with an mlx5_vfio_pci VF cannot be migrated

Details

    • sst_virtualization
    • ssg_virtualization
    • 19
    • 21
    • False
    • x86_64
    • 8.2.0

    Description

      Description of problem:
      Live migration of a Q35 + OVMF VM with an mlx5_vfio_pci VF fails: QEMU cannot start VFIO dirty page tracking (err -95, Operation not supported).

      Version-Release number of selected component (if applicable):
      host:
      5.14.0-355.el9.x86_64
      qemu-kvm-8.0.0-13.el9.x86_64
      libvirt-9.7.0-1.el9.x86_64
      edk2-ovmf-20230524-3.el9.noarch
      VM:
      5.14.0-355.el9.x86_64

      How reproducible:
      100%

      Steps to Reproduce:
      1. create an MT2910 VF and set up the VF for migration (see "Additional info" (1) for the exact commands)

      2. start a Q35 + OVMF VM with an mlx5_vfio_pci VF, using the following domain XML

      <domain type="kvm">
        <name>rhel93</name>
        <uuid>9403cac2-9135-4d85-ab63-98bcdf8a5042</uuid>
        <memory>4194304</memory>
        <currentMemory>4194304</currentMemory>
        <vcpu>4</vcpu>
        <os firmware="efi">
          <type arch="x86_64" machine="q35">hvm</type>
          <boot dev="hd"/>
        </os>
        <features>
          <acpi/>
          <apic/>
        </features>
        <cpu mode="host-model"/>
        <clock offset="utc">
          <timer name="rtc" tickpolicy="catchup"/>
          <timer name="pit" tickpolicy="delay"/>
          <timer name="hpet" present="no"/>
        </clock>
        <pm>
          <suspend-to-mem enabled="no"/>
          <suspend-to-disk enabled="no"/>
        </pm>
        <devices>
          <emulator>/usr/libexec/qemu-kvm</emulator>
          <disk type="file" device="disk">
            <driver name="qemu" type="qcow2" cache="none" io="threads"/>
            <source file="/home/images/migration/RHEL93.qcow2"/>
            <target dev="vda" bus="virtio"/>
          </disk>
          <controller type="usb" model="ich9-ehci1"/>
          <controller type="usb" model="ich9-uhci1">
            <master startport="0"/>
          </controller>
          <controller type="usb" model="ich9-uhci2">
            <master startport="2"/>
          </controller>
          <controller type="usb" model="ich9-uhci3">
            <master startport="4"/>
          </controller>
          <controller type="pci" model="pcie-root"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <controller type="pci" model="pcie-root-port"/>
          <console type="pty"/>
          <input type="tablet" bus="usb"/>
          <tpm model="tpm-crb">
            <backend type="emulator"/>
          </tpm>
          <graphics type="vnc" port="5993" listen="0.0.0.0"/>
          <video>
            <model type="bochs"/>
          </video>
          <hostdev mode="subsystem" type="pci" managed="no">
            <source>
              <address domain="0" bus="0xb1" slot="0x0" function="0x02"/>
            </source>
          </hostdev>
        </devices>
      </domain>
      

      3. check the MT2910 VF in the VM

      # ifconfig 
      enp2s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
              inet6 fe80::b35b:2dac:371a:4fa3  prefixlen 64  scopeid 0x20<link>
              ether 52:54:00:01:01:01  txqueuelen 1000  (Ethernet)
              RX packets 0  bytes 0 (0.0 B)
              RX errors 0  dropped 0  overruns 0  frame 0
              TX packets 28  bytes 4568 (4.4 KiB)
              TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
      
      # dmesg | grep -Ei "mlx5|enp2s0"
      [    3.953753] mlx5_core 0000:02:00.0: firmware version: 28.37.1014
      [    4.112107] mlx5_core 0000:02:00.0: Rate limit: 127 rates are supported, range: 0Mbps to 97656Mbps
      [    4.217499] mlx5_core 0000:02:00.0: Supported tc offload range - chains: 1, prios: 1
      [    4.220999] mlx5_core 0000:02:00.0: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0 enhanced)
      [    4.242258] mlx5_core 0000:02:00.0 enp2s0: renamed from eth0
      [    5.361400] mlx5_core 0000:02:00.0 enp2s0: Link up
      [    5.362911] IPv6: ADDRCONF(NETDEV_CHANGE): enp2s0: link becomes ready
      

      4. migrate the VM

      $ sudo virsh migrate --verbose --live rhel93 qemu+ssh://10.8.3.15/system
      error: operation failed: job 'migration out' unexpectedly failed
      

      5. check the qemu-kvm log

      $ sudo tail -f /var/log/libvirt/qemu/rhel93.log
      ...
      2023-09-01 05:50:42.169+0000: initiating migration
      2023-09-01T05:50:42.175652Z qemu-kvm: 0000:b1:00.2: Failed to start DMA logging, err -95 (Operation not supported)
      2023-09-01T05:50:42.175777Z qemu-kvm: vfio: Could not start dirty page tracking, err: -95 (Operation not supported)
      2023-09-01T05:50:42.378843Z qemu-kvm: Unable to read from socket: Bad file descriptor
      2023-09-01T05:50:42.378866Z qemu-kvm: Unable to read from socket: Bad file descriptor
      2023-09-01T05:50:42.378872Z qemu-kvm: Unable to read from socket: Bad file descriptor
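
In the log above, the two VFIO lines are the ones that explain the failure: err -95 is EOPNOTSUPP on Linux, matching the "Operation not supported" text. The later "Unable to read from socket" lines appear to be follow-on noise from the aborted migration. A minimal sketch for pulling only the VFIO dirty-tracking failures out of a libvirt QEMU log; the function name is hypothetical, and the patterns are just the two messages quoted in this report:

```shell
# Print the VFIO dirty-tracking failures found in a libvirt QEMU log.
# Hypothetical helper: the patterns are the two messages seen in this report.
# Exit status follows grep: 0 if at least one line matched, 1 otherwise.
vfio_dirty_tracking_errors() {
    grep -E 'Failed to start DMA logging|Could not start dirty page tracking' "$1"
}
```

For example, `sudo sh -c '. ./helpers.sh; vfio_dirty_tracking_errors /var/log/libvirt/qemu/rhel93.log'` would print only those lines, assuming the function is saved in a file named `helpers.sh` (a name chosen here for illustration).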
      

      Actual results:
      The Q35 + OVMF VM with an mlx5_vfio_pci VF cannot be migrated; "virsh migrate" fails with "job 'migration out' unexpectedly failed"

      Expected results:
      The Q35 + OVMF VM with a mlx5_vfio_pci VF can be migrated

      Additional info:
      (1) How to create an MT2910 VF and set up the VF for migration

      1.1 load the mlx5_vfio_pci module
      
      # modprobe mlx5_vfio_pci
      
      1.2 create VF
      
      # sudo sh -c "echo 0 > /sys/bus/pci/devices/0000:b1:00.0/sriov_numvfs"
      # sudo sh -c "echo 1 > /sys/bus/pci/devices/0000:b1:00.0/sriov_numvfs"
      
      1.3 set VF mac
      
      # sudo sh -c  "ip link set ens2f0np0 vf 0 mac 52:54:00:01:01:01"
      
      1.4 unbind created VF from driver
      
      # sudo sh -c  "echo 0000:b1:00.2 > /sys/bus/pci/drivers/mlx5_core/unbind"
      
      1.5 set switchdev mode on PF
      
      # sudo sh -c "devlink dev eswitch set pci/0000:b1:00.0 mode switchdev"
      # sudo sh -c "devlink dev eswitch show pci/0000:b1:00.0"
          pci/0000:b1:00.0: mode switchdev inline-mode none encap-mode basic
      
      1.6 enable VF's migration feature
      
      # sudo sh -c "devlink port function set pci/0000:b1:00.0/1 migratable enable"
      # sudo sh -c "devlink port show pci/0000:b1:00.0/1"
        …
        function:
          hw_addr 52:54:00:01:01:01 roce enable migratable enable
      
      1.7 bind VF to mlx5_vfio_pci driver
      
      # sudo sh -c "echo '15b3 101e' > /sys/bus/pci/drivers/mlx5_vfio_pci/new_id"
      # sudo sh -c "echo '15b3 101e' > /sys/bus/pci/drivers/mlx5_vfio_pci/remove_id"
      # readlink -f /sys/bus/pci/devices/0000\:b1\:00.2/driver
        /sys/bus/pci/drivers/mlx5_vfio_pci
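
Steps 1.1 to 1.7 above can be collected into one script for repeated test runs. The following is only a sketch under stated assumptions: the default PCI addresses, netdev name, devlink port, MAC and vendor/device ID are the values from this report, and the helper names (`run`, `sysfs_write`, `setup_vf`, `vf_driver`) as well as the `DRY_RUN` and `SYSFS_ROOT` variables are hypothetical. By default it only prints the commands; the two `devlink ... show` verification commands are left out.

```shell
# Values taken from this report; override them in the environment for other hosts.
PF=${PF:-0000:b1:00.0}
VF=${VF:-0000:b1:00.2}
PF_NETDEV=${PF_NETDEV:-ens2f0np0}
PORT=${PORT:-pci/$PF/1}
MAC=${MAC:-52:54:00:01:01:01}
VF_ID=${VF_ID:-15b3 101e}
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them (needs root).
DRY_RUN=${DRY_RUN:-1}

# Run a command, or print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Write value $1 to sysfs file $2, or print the equivalent echo in dry-run mode.
sysfs_write() {
    if [ "$DRY_RUN" = 1 ]; then
        printf "echo '%s' > %s\n" "$1" "$2"
    else
        printf '%s\n' "$1" > "$2"
    fi
}

# Same order as steps 1.1 to 1.7 in this report.
setup_vf() {
    run modprobe mlx5_vfio_pci                                         # 1.1
    sysfs_write 0 "/sys/bus/pci/devices/$PF/sriov_numvfs"              # 1.2
    sysfs_write 1 "/sys/bus/pci/devices/$PF/sriov_numvfs"
    run ip link set "$PF_NETDEV" vf 0 mac "$MAC"                       # 1.3
    sysfs_write "$VF" /sys/bus/pci/drivers/mlx5_core/unbind            # 1.4
    run devlink dev eswitch set "pci/$PF" mode switchdev               # 1.5
    run devlink port function set "$PORT" migratable enable            # 1.6
    sysfs_write "$VF_ID" /sys/bus/pci/drivers/mlx5_vfio_pci/new_id     # 1.7
    sysfs_write "$VF_ID" /sys/bus/pci/drivers/mlx5_vfio_pci/remove_id
}

# Print the driver a PCI device is bound to; fails if it has no driver.
# SYSFS_ROOT exists only so the check can be tried without real hardware.
vf_driver() {
    link="${SYSFS_ROOT:-/sys}/bus/pci/devices/$1/driver"
    [ -L "$link" ] || return 1
    basename "$(readlink "$link")"
}
```

Calling `setup_vf` prints the nine commands for review; `DRY_RUN=0 setup_vf` as root would execute them. Afterwards `vf_driver 0000:b1:00.2` should print `mlx5_vfio_pci`, which is the same check as the `readlink -f` in step 1.7.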
      

            People

              rh-ee-clegoate Cédric Le Goater
              yanghliu@redhat.com YangHang Liu