RHEL-6080

[RHEL-8.9] Most qperf tests failed when tested on a QEDR iWARP device


    • Bug
    • Resolution: Won't Do
    • rhel-8.9.0
    • qperf
    • rhel-net-drivers
    • ssg_networking
    • Network Drivers 6

      Description of problem:

      The following qperf tests failed when tested on QEDR iWARP.

      FAIL | 1 | rc_bi_bw
      FAIL | 1 | rc_bw
      FAIL | 1 | rc_lat
      FAIL | 1 | rc_rdma_read_bw
      FAIL | 1 | rc_compare_swap_mr
      FAIL | 1 | rc_fetch_add_mr
      FAIL | 1 | ver_rc_compare_swap
      FAIL | 1 | ver_rc_fetch_add

      These are regressions from RHEL-8.8.0-20230228.22.

      Version-Release number of selected component (if applicable):

      Clients: rdma-dev-03
      Servers: rdma-dev-02

      DISTRO=RHEL-8.9.0-20230531.26

      + [23-05-31 13:53:08] cat /etc/redhat-release
      Red Hat Enterprise Linux release 8.9 Beta (Ootpa)

      + [23-05-31 13:53:08] uname -a
      Linux rdma-dev-03.rdma.lab.eng.rdu2.redhat.com 4.18.0-494.el8.x86_64 #1 SMP Mon May 22 11:16:32 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux

      + [23-05-31 13:53:08] cat /proc/cmdline
      BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-494.el8.x86_64 root=UUID=4c97e3ba-2618-4d9b-80d0-a5c87ef7d0a5 ro console=tty0 rd_NO_PLYMOUTH intel_iommu=on iommu=on crashkernel=auto resume=UUID=b7fec9d8-7600-49eb-b89a-ad19248be0d0 console=ttyS1,115200

      + [23-05-31 13:53:08] rpm -q rdma-core linux-firmware
      rdma-core-44.0-2.el8.1.x86_64
      linux-firmware-20230515-115.gitd1962891.el8.noarch

      + [23-05-31 13:53:08] tail /sys/class/infiniband/qedr0/fw_ver /sys/class/infiniband/qedr1/fw_ver
      ==> /sys/class/infiniband/qedr0/fw_ver <==
      8. 59. 1. 0

      ==> /sys/class/infiniband/qedr1/fw_ver <==
      8. 59. 1. 0

      + [23-05-31 13:53:08] lspci
      + [23-05-31 13:53:08] grep -i -e ethernet -e infiniband -e omni -e ConnectX
      02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
      02:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
      08:00.0 Ethernet controller: QLogic Corp. FastLinQ QL45000 Series 25GbE Controller (rev 10)
      08:00.1 Ethernet controller: QLogic Corp. FastLinQ QL45000 Series 25GbE Controller (rev 10)

      + [23-05-31 13:53:08] rpm -q qperf
      qperf-0.4.11-3.el8.x86_64
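
      The fw_ver files above print the version with a space after each dot ("8. 59. 1. 0"), which makes string comparison between hosts or against a firmware package version awkward. A minimal sketch of a normalizer follows; normalize_fw_ver and list_qedr_fw are hypothetical helper names, not part of the test harness, and the sysfs path is the one shown in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the original report).

# Strip all whitespace from a firmware version string,
# e.g. "8. 59. 1. 0" becomes "8.59.1.0".
normalize_fw_ver() {
    printf '%s\n' "${1//[[:space:]]/}"
}

# Print "<device> <normalized fw_ver>" for every qedr device on this host.
# Prints nothing when no qedr device is present.
list_qedr_fw() {
    local f
    for f in /sys/class/infiniband/qedr*/fw_ver; do
        [ -r "$f" ] || continue
        printf '%s %s\n' "$(basename "$(dirname "$f")")" \
            "$(normalize_fw_ver "$(cat "$f")")"
    done
}
```

      Running list_qedr_fw on both the client and the server gives one line per device that can be compared directly.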

      How reproducible:
      100%

      Steps to Reproduce:
      1. On the RDMA server host, start qperf in server mode (no arguments):
      qperf
      2. On the RDMA client host, run each of the following:

      qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_bi_bw
      qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_bw
      qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_lat
      qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_rdma_read_bw
      qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_compare_swap_mr
      qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_fetch_add_mr
      qperf -v -i qedr1:1 -cm 1 172.31.50.102 ver_rc_compare_swap
      qperf -v -i qedr1:1 -cm 1 172.31.50.102 ver_rc_fetch_add
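
      The client-side commands above can be run in one loop. This is a minimal sketch, assuming qperf exits non-zero when a test fails (the "-r 1" passed to RQA_check_result in the logs suggests it does). QPERF, QPERF_TESTS and run_qperf_tests are hypothetical names introduced here; QPERF exists only so the loop can be dry-run with a stub.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the original test harness).

# Command to invoke; override with a stub for a dry run.
QPERF=${QPERF:-qperf}

# The eight tests listed in this report.
QPERF_TESTS="rc_bi_bw rc_bw rc_lat rc_rdma_read_bw rc_compare_swap_mr rc_fetch_add_mr ver_rc_compare_swap ver_rc_fetch_add"

# Usage: run_qperf_tests <server-ip> [device:port]
# Prints one "PASS | test" or "FAIL | test" line per test and
# returns non-zero if any test failed.
run_qperf_tests() {
    local server=$1 dev=${2:-qedr1:1} t failed=0
    for t in $QPERF_TESTS; do
        if "$QPERF" -v -i "$dev" -cm 1 "$server" "$t"; then
            echo "PASS | $t"
        else
            echo "FAIL | $t"
            failed=$((failed + 1))
        fi
    done
    [ "$failed" -eq 0 ]
}
```

      With the server from step 1 running, "run_qperf_tests 172.31.50.102" repeats the sequence reported here.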

      Actual results:

      + [23-05-31 13:53:15] qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_bi_bw
      server:
      rc_bi_bw:
      warning: -i set but not used in test rc_bi_bw
      rc_bi_bw failed: WR flush failure
      + [23-05-31 13:53:15] RQA_check_result -r 1 -t rc_bi_bw

      + [23-05-31 13:53:15] qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_bw
      server:
      rc_bw:
      warning: -i set but not used in test rc_bw
      + [23-05-31 13:53:15] RQA_check_result -r 1 -t rc_bw

      + [23-05-31 13:53:15] qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_lat
      server:
      rc_lat:
      warning: -i set but not used in test rc_lat
      + [23-05-31 13:53:15] RQA_check_result -r 1 -t rc_lat

      + [23-05-31 13:53:15] qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_rdma_read_bw
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      rc_rdma_read_bw:
      warning: -i set but not used in test rc_rdma_read_bw
      rc_rdma_read_bw failed: WR flush failure
      + [23-05-31 13:53:15] RQA_check_result -r 1 -t rc_rdma_read_bw

      + [23-05-31 13:53:20] qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_compare_swap_mr
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      rc_compare_swap_mr:
      warning: -i set but not used in test rc_compare_swap_mr
      rc_compare_swap_mr failed: WR flush failure
      + [23-05-31 13:53:20] RQA_check_result -r 1 -t rc_compare_swap_mr

      + [23-05-31 13:53:20] qperf -v -i qedr1:1 -cm 1 172.31.50.102 rc_fetch_add_mr
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      rc_fetch_add_mr:
      warning: -i set but not used in test rc_fetch_add_mr
      rc_fetch_add_mr failed: WR flush failure
      + [23-05-31 13:53:20] RQA_check_result -r 1 -t rc_fetch_add_mr

      + [23-05-31 13:53:20] qperf -v -i qedr1:1 -cm 1 172.31.50.102 ver_rc_compare_swap
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      ver_rc_compare_swap:
      warning: -i set but not used in test ver_rc_compare_swap
      ver_rc_compare_swap failed: WR flush failure
      + [23-05-31 13:53:20] RQA_check_result -r 1 -t ver_rc_compare_swap

      + [23-05-31 13:53:20] qperf -v -i qedr1:1 -cm 1 172.31.50.102 ver_rc_fetch_add
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      [qelr_poll_cq_req:2145]Error: POLL CQ with ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR. QP icid=0x290
      ver_rc_fetch_add:
      warning: -i set but not used in test ver_rc_fetch_add
      ver_rc_fetch_add failed: WR flush failure
      + [23-05-31 13:53:20] RQA_check_result -r 1 -t ver_rc_fetch_add
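
      To condense a log like the one above, the flushed-CQE errors and the failure reason can be tallied per test. A minimal sketch follows; summarize_qperf_log is a hypothetical helper that assumes the log layout shown in this report (a "qperf -v ..." command line whose last word is the test name, followed by that test's output).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original test harness).
# Reads a qperf client log on stdin and prints one line per test:
#   <test> flushed_cqes=<count> reason=<text after "failed: ">
summarize_qperf_log() {
    awk '
        # A qperf command line starts a new test; its last field is the test name.
        /(^|[[:space:]])qperf -v / { test = $NF; order[++n] = test; next }
        # Count provider-level flushed work request errors for the current test.
        /ROCE_CQE_REQ_STS_WORK_REQUEST_FLUSHED_ERR/ { flushed[test]++ }
        # Keep the reason qperf printed, if any.
        / failed: / { sub(/^.* failed: /, ""); reason[test] = $0 }
        END {
            for (i = 1; i <= n; i++) {
                t = order[i]
                printf "%s flushed_cqes=%d reason=%s\n", t, flushed[t], \
                    ((t in reason) ? reason[t] : "none reported")
            }
        }
    '
}
```

      Usage: "summarize_qperf_log < client.log", where client.log is a placeholder for the captured client output. Tests such as rc_bw and rc_lat, which print no "failed:" line above, show up with "reason=none reported".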

      Expected results:

      All eight tests pass, as in the qperf results from RHEL-8.8.0-20230228.22:

      + [23-03-02 17:15:58] qperf -v -i qedr1:1 -cm 1 172.31.50.103 rc_bi_bw
      rc_bi_bw:
      warning: -i set but not used in test rc_bi_bw
      bw = 5.03 GB/sec
      msg_rate = 76.7 K/sec
      use_cm = 1
      loc_cpus_used = 22 % cpus
      rem_cpus_used = 23 % cpus
      + [23-03-02 17:16:00] RQA_check_result -r 0 -t rc_bi_bw

      + [23-03-02 17:16:00] qperf -v -i qedr1:1 -cm 1 172.31.50.103 rc_bw
      rc_bw:
      warning: -i set but not used in test rc_bw
      bw = 2.74 GB/sec
      msg_rate = 41.9 K/sec
      use_cm = 1
      send_cost = 30.6 ms/GB
      recv_cost = 63.8 ms/GB
      send_cpus_used = 8.5 % cpus
      recv_cpus_used = 17.5 % cpus
      + [23-03-02 17:16:02] RQA_check_result -r 0 -t rc_bw

      + [23-03-02 17:16:02] qperf -v -i qedr1:1 -cm 1 172.31.50.103 rc_lat
      rc_lat:
      warning: -i set but not used in test rc_lat
      latency = 11.8 us
      msg_rate = 84.9 K/sec
      use_cm = 1
      loc_cpus_used = 36 % cpus
      rem_cpus_used = 48.5 % cpus
      + [23-03-02 17:16:08] RQA_check_result -r 0 -t rc_lat

      + [23-03-02 17:16:08] qperf -v -i qedr1:1 -cm 1 172.31.50.103 rc_rdma_read_bw
      rc_rdma_read_bw:
      warning: -i set but not used in test rc_rdma_read_bw
      bw = 1.85 GB/sec
      msg_rate = 28.3 K/sec
      use_cm = 1
      recv_cost = 56.6 ms/GB
      recv_cpus_used = 10.5 % cpus
      + [23-03-02 17:16:10] RQA_check_result -r 0 -t rc_rdma_read_bw

      + [23-03-02 17:16:23] qperf -v -i qedr1:1 -cm 1 172.31.50.103 rc_compare_swap_mr
      rc_compare_swap_mr:
      warning: -i set but not used in test rc_compare_swap_mr
      msg_rate = 79.1 K/sec
      use_cm = 1
      send_cost = 432 sec/GB
      recv_cost = 7.9 sec/GB
      send_cpus_used = 27.5 % cpus
      recv_cpus_used = 0.5 % cpus
      + [23-03-02 17:16:25] RQA_check_result -r 0 -t rc_compare_swap_mr

      + [23-03-02 17:16:25] qperf -v -i qedr1:1 -cm 1 172.31.50.103 rc_fetch_add_mr
      rc_fetch_add_mr:
      warning: -i set but not used in test rc_fetch_add_mr
      msg_rate = 79.1 K/sec
      use_cm = 1
      send_cost = 503 sec/GB
      recv_cost = 15.8 sec/GB
      send_cpus_used = 32 % cpus
      recv_cpus_used = 1 % cpus
      + [23-03-02 17:16:27] RQA_check_result -r 0 -t rc_fetch_add_mr

      + [23-03-02 17:16:27] qperf -v -i qedr1:1 -cm 1 172.31.50.103 ver_rc_compare_swap
      ver_rc_compare_swap:
      warning: -i set but not used in test ver_rc_compare_swap
      msg_rate = 79.1 K/sec
      use_cm = 1
      send_cost = 432 sec/GB
      recv_cost = 15.8 sec/GB
      send_cpus_used = 27.5 % cpus
      recv_cpus_used = 1 % cpus
      + [23-03-02 17:16:29] RQA_check_result -r 0 -t ver_rc_compare_swap

      + [23-03-02 17:16:29] qperf -v -i qedr1:1 -cm 1 172.31.50.103 ver_rc_fetch_add
      ver_rc_fetch_add:
      warning: -i set but not used in test ver_rc_fetch_add
      msg_rate = 79.1 K/sec
      use_cm = 1
      send_cost = 463 sec/GB
      recv_cost = 31.6 sec/GB
      send_cpus_used = 29.5 % cpus
      recv_cpus_used = 2 % cpus
      + [23-03-02 17:16:31] RQA_check_result -r 0 -t ver_rc_fetch_add
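
      As a sanity check on the baseline numbers above, dividing bw by msg_rate gives the message size implied by each bandwidth test. A minimal sketch follows, assuming qperf reports GB/sec and K/sec in decimal units (10^9 and 10^3); implied_msg_size is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original test harness).
# Usage: implied_msg_size <bw in GB/sec> <msg_rate in K/sec>
# Prints the implied bytes per message, rounded to the nearest integer.
# Assumes decimal units: 1 GB = 10^9 bytes, 1 K = 10^3 messages.
implied_msg_size() {
    LC_ALL=C awk -v bw="$1" -v rate="$2" \
        'BEGIN { printf "%.0f\n", (bw * 1e9) / (rate * 1e3) }'
}

implied_msg_size 5.03 76.7   # rc_bi_bw
implied_msg_size 2.74 41.9   # rc_bw
implied_msg_size 1.85 28.3   # rc_rdma_read_bw
```

      All three baseline bandwidth results come out within about 0.3% of 65536 bytes (64 KiB), so the rounded bw and msg_rate figures are consistent with each other.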

      Additional info:

              network-drivers-bugs@redhat.com (network-drivers-bugs group)
              Brian Chae (bchae, inactive)
              RH Bugzilla Integration
              infiniband-qe