  1. Fast Datapath Product
  2. FDP-2579

Test Coverage: i40e/ixgbe/mlx5_core driver: SR-IOV DPDK PVP performance with multiple queues is lower than with one queue

    • Type: Task
    • Resolution: Unresolved
    • ovs-dpdk

      This task tracks the work of writing test cases that cover the bug described below.

      Description of problem:

      Version-Release number of selected component (if applicable):
      both rhel8.6 and rhel9.2
      dpdk-22.11-4.el9
      dpdk-21.11-3.el8 

      How reproducible:

      Run the SR-IOV DPDK PVP performance case with 1, 2, and 4 queues.
      Steps to Reproduce:
      Create one VF per PF.
      Start the guest with the following XML (1-queue case):

      <domain type='kvm'>
        <name>g1</name>
        <memory unit='KiB'>8388608</memory>
        <currentMemory unit='KiB'>8388608</currentMemory>
        <memoryBacking>
          <hugepages>
            <page size='1048576' unit='KiB'/>
          </hugepages>
          <locked/>
          <access mode='shared'/>
        </memoryBacking>
        <vcpu placement='static'>3</vcpu>
        <cputune>
          <vcpupin vcpu='0' cpuset='4'/>
          <vcpupin vcpu='1' cpuset='30'/>
          <vcpupin vcpu='2' cpuset='2'/>
          <emulatorpin cpuset='0,28'/>
        </cputune>
        <numatune>
          <memory mode='strict' nodeset='0'/>
        </numatune>
        <resource>
          <partition>/machine</partition>
        </resource>
        <os>
          <type arch='x86_64' machine='q35'>hvm</type>
          <boot dev='hd'/>
        </os>
        <features>
          <acpi/>
          <apic/>
          <pmu state='off'/>
          <vmport state='off'/>
          <ioapic driver='qemu'/>
        </features>
        <cpu mode='host-passthrough' check='none'>
          <feature policy='require' name='tsc-deadline'/>
          <numa>
            <cell id='0' cpus='0-2' memory='8388608' unit='KiB' memAccess='shared'/>
          </numa>
        </cpu>
        <clock offset='utc'>
          <timer name='rtc' tickpolicy='catchup'/>
          <timer name='pit' tickpolicy='delay'/>
          <timer name='hpet' present='no'/>
        </clock>
        <on_poweroff>destroy</on_poweroff>
        <on_reboot>restart</on_reboot>
        <on_crash>restart</on_crash>
        <pm>
          <suspend-to-mem enabled='no'/>
          <suspend-to-disk enabled='no'/>
        </pm>
        <devices>
          <emulator>/usr/libexec/qemu-kvm</emulator>
          <disk type='file' device='disk'>
            <driver name='qemu' type='qcow2'/>
            <source file='/var/lib/libvirt/images/g1.qcow2'/>
            <backingStore/>
            <target dev='vda' bus='virtio'/>
            <alias name='virtio-disk0'/>
            <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
          </disk>
          <controller type='usb' index='0' model='none'>
            <alias name='usb'/>
          </controller>
          <controller type='pci' index='0' model='pcie-root'>
            <alias name='pcie.0'/>
          </controller>
          <controller type='pci' index='1' model='pcie-root-port'>
            <model name='pcie-root-port'/>
            <target chassis='1' port='0x10'/>
            <alias name='pci.1'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
          </controller>
          <controller type='pci' index='2' model='pcie-root-port'>
            <model name='pcie-root-port'/>
            <target chassis='2' port='0x11'/>
            <alias name='pci.2'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
          </controller>
          <controller type='pci' index='3' model='pcie-root-port'>
            <model name='pcie-root-port'/>
            <target chassis='3' port='0x8'/>
            <alias name='pci.3'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
          </controller>
          <controller type='pci' index='4' model='pcie-root-port'>
            <model name='pcie-root-port'/>
            <target chassis='4' port='0x9'/>
            <alias name='pci.4'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
          </controller>
          <controller type='pci' index='5' model='pcie-root-port'>
            <model name='pcie-root-port'/>
            <target chassis='5' port='0xa'/>
            <alias name='pci.5'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
          </controller>
          <controller type='pci' index='6' model='pcie-root-port'>
            <model name='pcie-root-port'/>
            <target chassis='6' port='0xb'/>
            <alias name='pci.6'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
          </controller>
          <controller type='sata' index='0'>
            <alias name='ide'/>
            <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
          </controller>
          <interface type='bridge'>
            <mac address='52:54:00:01:02:03'/>
            <source bridge='virbr0'/>
            <model type='virtio'/>
          </interface>
          <hostdev mode='subsystem' type='pci' managed='yes'>
            <source>
              <address type='pci' domain='0x0000' bus='0x07' slot='0x02' function='0x0'/>
            </source>
            <mac address='00:de:ad:01:01:01'/>
          </hostdev>
          <hostdev mode='subsystem' type='pci' managed='yes'>
            <source>
              <address type='pci' domain='0x0000' bus='0x07' slot='0x0a' function='0x0'/>
            </source>
            <mac address='00:de:ad:02:02:02'/>
          </hostdev>
          <serial type='pty'>
            <source path='/dev/pts/1'/>
            <target type='isa-serial' port='0'>
              <model name='isa-serial'/>
            </target>
            <alias name='serial0'/>
          </serial>
          <console type='pty' tty='/dev/pts/1'>
            <source path='/dev/pts/1'/>
            <target type='serial' port='0'/>
            <alias name='serial0'/>
          </console>
          <input type='mouse' bus='ps2'>
            <alias name='input0'/>
          </input>
          <input type='keyboard' bus='ps2'>
            <alias name='input1'/>
          </input>
          <graphics type='vnc' port='5900' autoport='yes' listen='0.0.0.0'>
            <listen type='address' address='0.0.0.0'/>
          </graphics>
          <video>
            <model type='cirrus' vram='16384' heads='1' primary='yes'/>
            <alias name='video0'/>
            <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
          </video>
          <memballoon model='virtio'>
            <alias name='balloon0'/>
            <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
          </memballoon>
          <iommu model='intel'>
            <driver intremap='on' caching_mode='on' iotlb='on'/>
          </iommu>
        </devices>
        <seclabel type='dynamic' model='selinux' relabel='yes'/>
      </domain> 
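
      The host-side preparation (one VF per PF, then starting the guest) might look like the sketch below. It is a dry run that only prints the commands. The PF names ens1f0/ens1f1 and the file name g1.xml are hypothetical, and assigning the VF MACs from the host with ip link is an assumption based on the MAC addresses that appear in the XML and in the traffic generator's --dst-macs option.

```shell
#!/bin/sh
# Dry-run sketch: print the host commands, do not execute them.
set -eu

# Hypothetical PF names paired with the VF MACs used in the guest XML.
PF_MACS="ens1f0=00:de:ad:01:01:01 ens1f1=00:de:ad:02:02:02"
GUEST_XML=g1.xml   # hypothetical file holding the domain XML above

host_prep_cmds() {
    for pair in $PF_MACS; do
        pf=${pair%%=*}    # text before the first '='
        mac=${pair#*=}    # text after the first '='
        # sriov_numvfs is the kernel sysfs attribute that sets the VF count of a PF.
        echo "echo 1 > /sys/class/net/$pf/device/sriov_numvfs"
        # Assumption: the VF MAC is set from the host so it matches --dst-macs.
        echo "ip link set $pf vf 0 mac $mac"
    done
    echo "virsh define $GUEST_XML"
    echo "virsh start g1"
}

host_prep_cmds
```

      The PCI addresses in the hostdev entries of the XML have to match the VFs that these commands create on the test host.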

      Inside the guest:
      # Bind the ports to vfio-pci and start testpmd with the two ports:
      # dpdk-testpmd -l 0-2 -n 1 --socket-mem 1024 -- -i --forward-mode=mac --burst=32 --rxd=4096 --txd=4096 --max-pkt-len=9200 --mbuf-size=9728 --nb-cores=2 --rxq=1 --txq=1 --eth-peer=0,00:00:00:00:00:01 --eth-peer=1,00:00:00:00:00:02 --mbcache=512 --auto-start
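
      For the 2-queue and 4-queue cases, presumably only --rxq and --txq change on the testpmd command line. The sketch below prints the bind command and one testpmd command per queue count. The guest PCI addresses are hypothetical, and keeping --nb-cores=2 for every queue count is an assumption, since the report only shows the 1-queue command.

```shell
#!/bin/sh
# Dry-run sketch: print the guest commands for the 1-, 2-, and 4-queue cases.
set -eu

# Hypothetical guest PCI addresses of the two passed-through VFs.
GUEST_VFS="0000:03:00.0 0000:04:00.0"

bind_cmd() {
    # dpdk-devbind.py ships with DPDK; --bind selects the driver to bind to.
    echo "dpdk-devbind.py --bind=vfio-pci $GUEST_VFS"
}

testpmd_cmd() {
    queues=$1
    # Same options as the 1-queue command; only --rxq/--txq vary.
    # Assumption: --nb-cores stays at 2 because the guest has 3 vCPUs.
    echo "dpdk-testpmd -l 0-2 -n 1 --socket-mem 1024 -- -i" \
         "--forward-mode=mac --burst=32 --rxd=4096 --txd=4096" \
         "--max-pkt-len=9200 --mbuf-size=9728 --nb-cores=2" \
         "--rxq=$queues --txq=$queues" \
         "--eth-peer=0,00:00:00:00:00:01 --eth-peer=1,00:00:00:00:00:02" \
         "--mbcache=512 --auto-start"
}

bind_cmd
for q in 1 2 4; do
    testpmd_cmd "$q"
done
```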

       
      On the trex server:
      send the traffic:
      ./binary-search.py --traffic-generator=trex-txrx --frame-size=64 --num-flows=1024 --max-loss-pct=0 --search-runtime=10 --validation-runtime=60 --rate-tolerance=10 --runtime-tolerance=10 --rate=25 --rate-unit=% --duplicate-packet-failure=retry-to-fail --negative-packet-loss=retry-to-fail --warmup-trial --warmup-trial-runtime=10 --rate=25 --rate-unit=% --one-shot=0 --use-src-ip-flows=1 --use-dst-ip-flows=1 --use-src-mac-flows=0 --use-dst-mac-flows=0 --src-macs=00:00:00:00:00:01,00:00:00:00:00:02 --dst-macs=00:de:ad:01:01:01,00:de:ad:02:02:02 --send-teaching-measurement --send-teaching-warmup --teaching-warmup-packet-type=generic --teaching-measurement-packet-type=generic --teaching-warmup-packet-rate=1000

      Actual results:
      rhel8.6 job:
      https://beaker.engineering.redhat.com/jobs/8976575
      https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2024/02/89765/8976575/15651085/174258737/i40e_25.html
      https://beaker.engineering.redhat.com/jobs/8975776
      https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2024/02/89757/8975776/15650001/174248376/i40e_25.html
      1 queue: 34.6 Mpps
      2 queues: 33 Mpps
      4 queues: 30 Mpps

      rhel9.2 job:
      https://beaker.engineering.redhat.com/jobs/8975523
      https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2024/02/89755/8975523/15649721/174247540/i40e_25.html
      1 queue: 36.2 Mpps
      2 queues: 33 Mpps
      4 queues: 29.9 Mpps

      Expected results:
      The 2-queue and 4-queue cases should achieve higher performance than the 1-queue case.
      With the ice driver, the 2-queue and 4-queue cases do achieve higher performance than the 1-queue case, so ice does not have this issue.
      ice driver:
      1 queue: 36.2 Mpps
      2 queues: 37.3 Mpps
      4 queues: 37.3 Mpps
      https://beaker.engineering.redhat.com/jobs/8979161
      https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2024/03/89791/8979161/15655476/174289204/ice_25.html

      With the ixgbe driver, the 1-queue and 2-queue cases perform similarly, but the 4-queue case is also lower.
      ixgbe driver:
      1 queue: 23 Mpps
      2 queues: 23 Mpps
      4 queues: 21 Mpps
      https://beaker.engineering.redhat.com/jobs/8975525
      https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2024/02/89755/8975525/15649724/174247548/ixgbe_10.html
      https://beaker.engineering.redhat.com/jobs/8975782
      https://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2024/02/89757/8975782/15650008/174248403/ixgbe_10.html
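
      Since this task is about test coverage, a pass/fail check on the measured throughput could be sketched as follows. The function names rel_change and scales_ok are hypothetical, and the pass criterion (a multi-queue result must be at least the 1-queue result) is an assumption derived from the expected results stated in this report. The Mpps values are copied from the results listed here.

```shell
#!/bin/sh
set -eu

# Print the change of a multi-queue result relative to the 1-queue result, in percent.
rel_change() {
    awk -v base="$1" -v multi="$2" \
        'BEGIN { printf "%.1f\n", (multi - base) / base * 100 }' </dev/null
}

# Succeed only if the multi-queue throughput is at least the 1-queue throughput.
scales_ok() {
    awk -v base="$1" -v multi="$2" 'BEGIN { exit !(multi >= base) }' </dev/null
}

# Columns: label, then Mpps for 1 queue, 2 queues, 4 queues.
while read -r label q1 q2 q4; do
    for m in "$q2" "$q4"; do
        if scales_ok "$q1" "$m"; then verdict=ok; else verdict=REGRESSION; fi
        echo "$label: $q1 -> $m Mpps ($(rel_change "$q1" "$m")%) $verdict"
    done
done <<EOF
i40e-rhel8.6 34.6 33 30
i40e-rhel9.2 36.2 33 29.9
ice 36.2 37.3 37.3
ixgbe 23 23 21
EOF
```

      A real test case would probably allow a small tolerance instead of a strict comparison, because run-to-run variation in binary-search results is expected.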

              ovsdpdk-triage ovsdpdk triage
              nstbot NST Bot