RHEL-16581

[nested][emulated igb] Cannot reset device, depends on group which is not owned.

    • qemu-kvm-8.2.0-1.el9
    • None
    • Moderate
    • d90014fc337ab77f37285b1a30fd4f545056be0a
    • rhel-sst-virtualization-networking
    • ssg_virtualization
    • 22
    • 24
    • 1
    • QE ack, Dev ack
    • False
    • No
    • Red Hat Enterprise Linux
    • None
    • x86_64
    • Linux
    • None

      What were you trying to do that didn't work?
      Start an L1 VM with an emulated igb, create a VF (igbvf driver) from the emulated igb inside the L1 VM, and then start an L2 VM with the igbvf. The L2 VM throws warnings and errors.

      Please provide the package NVR for which bug is seen:
      Test env:
      host:
      qemu-kvm-8.1.0-4.el9.x86_64
      libvirt-9.5.0-7.el9_3.x86_64
      5.14.0-378.el9.x86_64
      L1 guest:
      5.14.0-384.el9.x86_64

      How reproducible:
      100%

      Steps to reproduce
      Test scenario:
      [1] start a Q35 + OVMF L1 VM with an emulated igb
      [2] create a VF from the emulated igb in the L1 VM
      [3] start a Q35 + OVMF L2 VM with an igbvf
      [4] run ping tests
      [5] check the qemu-kvm and kernel logs

      Detailed test steps:
      (1) start the L1 VM with an emulated igb and an Intel IOMMU device

      The related qemu-kvm cmd:

      -machine pc-q35-rhel9.2.0,kernel_irqchip=split \
      -device {"driver":"intel-iommu","id":"iommu0","intremap":"on","caching-mode":true,"eim":"on","device-iotlb":true} \
      -netdev {"type":"tap","fd":"23","id":"hostnet0"} \
      -device {"driver":"igb","netdev":"hostnet0","id":"net0","mac":"52:54:00:00:94:94","bus":"pci.1","addr":"0x0"} \
      

      The related xml:

      <features>
        <ioapic driver='qemu'/>
      </features>

      <interface type='bridge'>
        <mac address='52:54:00:00:94:94'/>
        <source bridge='switch'/>
        <target dev='vnet0'/>
        <model type='igb'/>
        <alias name='net0'/>
      </interface>

      <iommu model='intel'>
        <driver intremap='on' caching_mode='on' eim='on' iotlb='on'/>
      </iommu>
      
      

      (2) check the PF in the VM

      # lshw -c network -businfo
      Bus info          Device     Class          Description
      =======================================================
      pci@0000:01:00.0  enp1s0     network        82576 Gigabit Network Connection
      
      # ifconfig 
      enp1s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
              inet 10.73.213.100  netmask 255.255.254.0  broadcast 10.73.213.255
              inet6 fe80::8f91:e353:dc04:74bd  prefixlen 64  scopeid 0x20<link>
              inet6 2620:52:0:49d4:16c4:6b14:158e:24ce  prefixlen 64  scopeid 0x0<global>
              ether 52:54:00:00:94:94  txqueuelen 1000  (Ethernet)
              RX packets 2788  bytes 179746 (175.5 KiB)
              RX errors 48  dropped 0  overruns 0  frame 48
              TX packets 219  bytes 28639 (27.9 KiB)
              TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
              device memory 0xfe800000-fe81ffff  
      
      # cat /sys/bus/pci/devices/0000\:01\:00.0/sriov_totalvfs 
      7
      

      (3) create 2 VFs in the VM

      # echo 2 > /sys/bus/pci/devices/0000\:01\:00.0/sriov_numvfs
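
The sysfs write above can also be scripted. The following is a minimal sketch, not part of the original report: `set_num_vfs` is a hypothetical helper name, and the device directory is passed in so the logic can be exercised against a scratch directory instead of the real `/sys` tree.

```python
from pathlib import Path


def set_num_vfs(device_dir: Path, count: int) -> int:
    """Request `count` VFs on the PF whose sysfs directory is `device_dir`."""
    # sriov_totalvfs is the upper bound advertised by the PF (7 in this report).
    total = int((device_dir / "sriov_totalvfs").read_text().strip())
    if not 0 <= count <= total:
        raise ValueError(f"requested {count} VFs, device supports 0..{total}")

    numvfs = device_dir / "sriov_numvfs"
    current = int(numvfs.read_text().strip())
    # The kernel refuses to change one non-zero VF count to another,
    # so disable the existing VFs first when needed.
    if current not in (0, count):
        numvfs.write_text("0")
    numvfs.write_text(str(count))
    return count
```

On the L1 guest this would be called as root, for example `set_num_vfs(Path("/sys/bus/pci/devices/0000:01:00.0"), 2)`.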
      

      (4) check the VFs in the VM

      # lshw -c network -businfo
      Bus info          Device     Class          Description
      =======================================================
      pci@0000:01:00.0  enp1s0       network        82576 Gigabit Network Connection
      pci@0000:01:10.0  enp1s0v0   network        82576 Virtual Function
      pci@0000:01:10.2  enp1s0v1   network        82576 Virtual Function
      
      # ifconfig 
      enp1s0v0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
              inet 10.73.211.203  netmask 255.255.254.0  broadcast 10.73.211.255
              inet6 fe80::aec1:4a1e:d414:38a0  prefixlen 64  scopeid 0x20<link>
              inet6 2620:52:0:49d2:9f36:566:337:a4a  prefixlen 64  scopeid 0x0<global>
              ether b2:39:23:94:4b:f5  txqueuelen 1000  (Ethernet)
              RX packets 72  bytes 8814 (8.6 KiB)
              RX errors 0  dropped 0  overruns 0  frame 0
              TX packets 16  bytes 1768 (1.7 KiB)
              TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
      enp1s0v1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
              inet 10.73.211.94  netmask 255.255.254.0  broadcast 10.73.211.255
              inet6 2620:52:0:49d2:dc35:73b4:5f3b:544e  prefixlen 64  scopeid 0x0<global>
              inet6 fe80::f58e:330:fdd5:3818  prefixlen 64  scopeid 0x20<link>
              ether 9e:41:e2:e1:ca:92  txqueuelen 1000  (Ethernet)
              RX packets 243  bytes 17349 (16.9 KiB)
              RX errors 0  dropped 0  overruns 0  frame 0
              TX packets 19  bytes 2020 (1.9 KiB)
              TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
      

      (5) make sure the IOMMU is enabled on the L1 VM kernel command line

      [L1]# cat /proc/cmdline
      ... intel_iommu=on
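
The same check can be automated in a test script. This is a small sketch under the assumption that only the `intel_iommu=` parameter matters; `intel_iommu_enabled` is a hypothetical helper name, not something used in the original test.

```python
def intel_iommu_enabled(cmdline: str) -> bool:
    """Return True if the kernel command line turns the Intel IOMMU on."""
    for token in cmdline.split():
        key, _, value = token.partition("=")
        # The parameter accepts a comma-separated option list, e.g. "on,sm_on".
        if key == "intel_iommu" and "on" in value.split(","):
            return True
    return False
```

Inside the L1 guest it could be called as `intel_iommu_enabled(open("/proc/cmdline").read())`.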

      (6) start an L2 VM with an igbvf

          <hostdev mode='subsystem' type='pci' managed='yes'>
            <driver name='vfio'/>
            <source>
              <address domain='0x0000' bus='0x01' slot='0x10' function='0x0'/>
            </source>
          </hostdev>
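
Before assigning the VF, it can help to see which IOMMU group it belongs to in the L1 guest and what else shares a given group, since the reset warning later in this report names a group. The report itself does not list the group membership. The sketch below only reads the standard sysfs layout; the function names and the `sysfs` parameter are illustrative assumptions.

```python
from pathlib import Path


def iommu_group_of(bdf: str, sysfs: Path = Path("/sys")) -> str:
    """Return the IOMMU group number (as a string) of the PCI device `bdf`."""
    # <device>/iommu_group is a symlink to /sys/kernel/iommu_groups/<N>.
    link = sysfs / "bus/pci/devices" / bdf / "iommu_group"
    return link.resolve().name


def devices_in_group(group: str, sysfs: Path = Path("/sys")) -> list[str]:
    """Return the PCI addresses of all devices in IOMMU group `group`."""
    devices = sysfs / "kernel/iommu_groups" / group / "devices"
    return sorted(entry.name for entry in devices.iterdir())
```

For example, `devices_in_group(iommu_group_of("0000:01:10.0"))` lists everything that shares a group with the VF passed to the L2 VM.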
      
      

      (7) check the VF in the L2 VM

      enp7s0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
              inet 10.73.211.174  netmask 255.255.254.0  broadcast 10.73.211.255
              inet6 fe80::c1d9:962b:5ccd:29a3  prefixlen 64  scopeid 0x20<link>
              inet6 2620:52:0:49d2:9c9a:2773:18a2:9aa4  prefixlen 64  scopeid 0x0<global>
              ether 16:ed:6d:52:48:76  txqueuelen 1000  (Ethernet)
              RX packets 167  bytes 17164 (16.7 KiB)
              RX errors 0  dropped 0  overruns 0  frame 0
              TX packets 107  bytes 10278 (10.0 KiB)
              TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
      

      (8) do ping tests in the L2 VM

      L2 VF ping L1 VF : PASS

      # ping -c 4 10.73.211.94
      PING 10.73.211.94 (10.73.211.94) 56(84) bytes of data.
      64 bytes from 10.73.211.94: icmp_seq=1 ttl=64 time=0.253 ms
      64 bytes from 10.73.211.94: icmp_seq=2 ttl=64 time=0.249 ms
      64 bytes from 10.73.211.94: icmp_seq=3 ttl=64 time=0.232 ms
      64 bytes from 10.73.211.94: icmp_seq=4 ttl=64 time=0.246 ms
      
      --- 10.73.211.94 ping statistics ---
      4 packets transmitted, 4 received, 0% packet loss, time 3006ms
      rtt min/avg/max/mdev = 0.232/0.245/0.253/0.007 ms
      

      L2 VF ping L1 PF : PASS

      # ping -c 4 10.73.211.188
      PING 10.73.211.188 (10.73.211.188) 56(84) bytes of data.
      64 bytes from 10.73.211.188: icmp_seq=1 ttl=64 time=0.277 ms
      64 bytes from 10.73.211.188: icmp_seq=2 ttl=64 time=0.321 ms
      64 bytes from 10.73.211.188: icmp_seq=3 ttl=64 time=0.305 ms
      64 bytes from 10.73.211.188: icmp_seq=4 ttl=64 time=0.405 ms
      
      --- 10.73.211.188 ping statistics ---
      4 packets transmitted, 4 received, 0% packet loss, time 3005ms
      rtt min/avg/max/mdev = 0.277/0.327/0.405/0.047 ms
      

      L2 VF ping source bridge of host: PASS

      # ping -c 4 10.73.210.54
      PING 10.73.210.54 (10.73.210.54) 56(84) bytes of data.
      64 bytes from 10.73.210.54: icmp_seq=1 ttl=64 time=0.672 ms
      64 bytes from 10.73.210.54: icmp_seq=2 ttl=64 time=0.373 ms
      64 bytes from 10.73.210.54: icmp_seq=3 ttl=64 time=0.220 ms
      64 bytes from 10.73.210.54: icmp_seq=4 ttl=64 time=0.210 ms
      
      --- 10.73.210.54 ping statistics ---
      4 packets transmitted, 4 received, 0% packet loss, time 3005ms
      rtt min/avg/max/mdev = 0.210/0.368/0.672/0.186 ms
      

      (9) check the qemu-kvm and kernel log

      The L2 VM throws some errors when booting and rebooting:

      2023-11-15T10:31:35.698190Z qemu-kvm: vfio: Cannot reset device 0000:01:10.0, depends on group 17 which is not owned.
      2023-11-15T10:31:35.698539Z qemu-kvm: vfio: Cannot reset device 0000:01:10.0, depends on group 17 which is not owned.
      2023-11-15T10:31:39.070516Z qemu-kvm: VFIO_MAP_DMA failed: Invalid argument
      2023-11-15T10:31:39.070573Z qemu-kvm: vfio_dma_map(0x56436cc15ae0, 0x383000004000, 0x4000, 0x7f2330000000) = -22 (Invalid argument)
      ...
      2023-11-15T10:32:15.287986Z qemu-kvm: VFIO_MAP_DMA failed: Invalid argument
      2023-11-15T10:32:15.288027Z qemu-kvm: vfio_dma_map(0x56436cc15ae0, 0x383000001000, 0x3000, 0x7f2328793000) = -22 (Invalid argument)
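
To compare failures across boots, the `vfio_dma_map` lines can be pulled out of the qemu-kvm log programmatically. The sketch below is an illustration only: the field names (container, iova, size, vaddr) are an assumption about the argument order in the message, and `parse_dma_map_failures` is a hypothetical helper. It also reports how many address bits the end of each mapping needs, which may be worth comparing with the address width of the emulated IOMMU; the report does not identify the cause of the failures.

```python
import re

# Matches lines such as:
#   qemu-kvm: vfio_dma_map(0x..., 0x383000004000, 0x4000, 0x...) = -22 (...)
DMA_MAP_RE = re.compile(
    r"vfio_dma_map\((?P<container>0x[0-9a-f]+), (?P<iova>0x[0-9a-f]+), "
    r"(?P<size>0x[0-9a-f]+), (?P<vaddr>0x[0-9a-f]+)\) = (?P<ret>-?\d+)"
)


def parse_dma_map_failures(log: str) -> list[dict]:
    """Extract IOVA, size and return code from vfio_dma_map error lines."""
    failures = []
    for match in DMA_MAP_RE.finditer(log):
        iova = int(match["iova"], 16)
        size = int(match["size"], 16)
        failures.append({
            "iova": iova,
            "size": size,
            "ret": int(match["ret"]),
            # Bits needed to address the last byte of the mapping.
            "address_bits": (iova + size - 1).bit_length(),
        })
    return failures
```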
      

      Expected results
      The L2 VM with an igbvf should not throw any warnings or errors.

      Actual results
      The L2 VM with an igbvf throws warnings and errors such as:

      2023-11-15T10:31:35.698190Z qemu-kvm: vfio: Cannot reset device 0000:01:10.0, depends on group 17 which is not owned.
      2023-11-15T10:31:35.698539Z qemu-kvm: vfio: Cannot reset device 0000:01:10.0, depends on group 17 which is not owned.
      2023-11-15T10:31:39.070516Z qemu-kvm: VFIO_MAP_DMA failed: Invalid argument
      2023-11-15T10:31:39.070573Z qemu-kvm: vfio_dma_map(0x56436cc15ae0, 0x383000004000, 0x4000, 0x7f2330000000) = -22 (Invalid argument)
      ...
      2023-11-15T10:32:15.287986Z qemu-kvm: VFIO_MAP_DMA failed: Invalid argument
      2023-11-15T10:32:15.288027Z qemu-kvm: vfio_dma_map(0x56436cc15ae0, 0x383000001000, 0x3000, 0x7f2328793000) = -22 (Invalid argument)
      

              aodaki Akihiko Odaki
              yanghliu@redhat.com YangHang Liu