RHEL-6193

[RHEL8.7] fabtests on QEDR ROCE device result in core files

    • Type: Bug
    • Resolution: Won't Do
    • rhel-8.7.0
    • fabtests
    • sst_network_drivers

      Description of problem:

      Segfaults are observed at various points when fabtests are run on a QEDR RoCE device.

      Version-Release number of selected component (if applicable):

      Clients: rdma-dev-02
      Servers: rdma-perf-06

      DISTRO=RHEL-8.7.0-20220524.0

      + [22-06-06 21:42:37] cat /etc/redhat-release
      Red Hat Enterprise Linux release 8.7 Beta (Ootpa)

      + [22-06-06 21:42:37] uname -a
      Linux rdma-dev-02.rdma.lab.eng.rdu2.redhat.com 4.18.0-393.el8.x86_64 #1 SMP Wed May 18 12:44:50 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux

      + [22-06-06 21:42:37] cat /proc/cmdline
      BOOT_IMAGE=(hd0,msdos1)/vmlinuz-4.18.0-393.el8.x86_64 root=UUID=f3b9d64f-1ad8-44bd-b339-3a4297ae3e9a ro console=tty0 rd_NO_PLYMOUTH intel_iommu=on iommu=on crashkernel=auto resume=UUID=b92a6a91-c13f-46c2-b3b6-e1d187ba4ac3 console=ttyS1,115200

      + [22-06-06 21:42:37] rpm -q rdma-core linux-firmware
      rdma-core-37.2-1.el8.x86_64
      linux-firmware-20220210-107.git6342082c.el8.noarch

      + [22-06-06 21:42:37] tail /sys/class/infiniband/qedr0/fw_ver /sys/class/infiniband/qedr1/fw_ver
      ==> /sys/class/infiniband/qedr0/fw_ver <==
      8. 59. 1. 0

      ==> /sys/class/infiniband/qedr1/fw_ver <==
      8. 59. 1. 0

      + [22-06-06 21:42:37] lspci
      + [22-06-06 21:42:37] grep -i -e ethernet -e infiniband -e omni -e ConnectX
      02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
      02:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
      08:00.0 Ethernet controller: QLogic Corp. FastLinQ QL45000 Series 25GbE Controller (rev 10)
      08:00.1 Ethernet controller: QLogic Corp. FastLinQ QL45000 Series 25GbE Controller (rev 10)

      + [22-06-06 21:53:10] grep psm
      psmisc-23.1-5.el8.x86_64
      libpsm2-11.2.206-1.el8.x86_64

      + [22-06-06 21:53:11] rpm -qa
      + [22-06-06 21:53:11] grep libibverbs
      libibverbs-37.2-1.el8.x86_64
      libibverbs-utils-37.2-1.el8.x86_64

      + [22-06-06 21:53:12] RQA_pkg_install fabtests
      + [22-06-06 21:53:12] PKG_LIST=
      + [22-06-06 21:53:12] for p in "$@"
      + [22-06-06 21:53:12] rpm -q fabtests
      fabtests-1.14.0-1.el8.x86_64

      How reproducible:

      100%

      Steps to Reproduce:

      1. Run the fabtests on QEDE IW with the above packages.

      Actual results:

      TIME PID UID GID SIG COREFILE EXE
      Mon 2022-06-06 22:04:57 EDT 75235 0 0 11 present /usr/bin/fi_poll
      Mon 2022-06-06 22:05:01 EDT 75280 0 0 11 present /usr/bin/fi_poll
      Tue 2022-06-07 00:15:14 EDT 117329 0 0 11 present /usr/bin/fi_poll
      Tue 2022-06-07 00:15:18 EDT 117387 0 0 11 present /usr/bin/fi_poll
      total 936
      -rw-r-----. 1 root root 233046 Jun 7 00:15 core.fi_poll.0.92cf828b70574a22b559e34b049b91c9.117329.1654575314000000.lz4
      -rw-r-----. 1 root root 232933 Jun 7 00:15 core.fi_poll.0.92cf828b70574a22b559e34b049b91c9.117387.1654575317000000.lz4
      -rw-r-----. 1 root root 233706 Jun 6 22:04 core.fi_poll.0.92cf828b70574a22b559e34b049b91c9.75235.1654567497000000.lz4
      -rw-r-----. 1 root root 233787 Jun 6 22:05 core.fi_poll.0.92cf828b70574a22b559e34b049b91c9.75280.1654567501000000.lz4
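
The core file names above appear to follow the systemd-coredump layout core.<comm>.<uid>.<boot-id>.<pid>.<timestamp in microseconds>.<compression>. As a minimal sketch under that assumption, the Python below splits each name into its fields and converts the timestamp to local time, which makes it easier to match a core file to a row in the listing. The helper name parse_core_name and the fixed UTC-4 offset for EDT are illustrative assumptions, not part of the original report.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the "EDT" shown in the listing is a fixed UTC-4 offset.
EDT = timezone(timedelta(hours=-4), "EDT")


def parse_core_name(name):
    """Split core.<comm>.<uid>.<boot-id>.<pid>.<usec>.<ext> into its fields.

    Hypothetical helper: it assumes the command name contains no dots,
    which holds for fi_poll but not for every executable.
    """
    parts = name.split(".")
    if len(parts) != 7 or parts[0] != "core":
        raise ValueError(f"unexpected core file name: {name}")
    _, comm, uid, boot_id, pid, usec, ext = parts
    # The timestamp field is taken to be microseconds since the Unix epoch.
    seconds, micros = divmod(int(usec), 1_000_000)
    when = datetime.fromtimestamp(seconds, tz=EDT) + timedelta(microseconds=micros)
    return {
        "comm": comm,
        "uid": int(uid),
        "boot_id": boot_id,
        "pid": int(pid),
        "time": when,
        "compression": ext,
    }


# File names copied from the listing in this report.
names = [
    "core.fi_poll.0.92cf828b70574a22b559e34b049b91c9.117329.1654575314000000.lz4",
    "core.fi_poll.0.92cf828b70574a22b559e34b049b91c9.117387.1654575317000000.lz4",
    "core.fi_poll.0.92cf828b70574a22b559e34b049b91c9.75235.1654567497000000.lz4",
    "core.fi_poll.0.92cf828b70574a22b559e34b049b91c9.75280.1654567501000000.lz4",
]

for name in names:
    info = parse_core_name(name)
    print(info["comm"], info["pid"], info["time"].strftime("%Y-%m-%d %H:%M:%S %Z"))
```

Decoded this way, the embedded timestamps line up with the TIME column of the listing to within a second, and all four files share one boot ID, so the crashes come from the same boot of rdma-dev-02.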

      Jun 06 22:04:57 rdma-dev-02.rdma.lab.eng.rdu2.redhat.com kernel: fi_poll[75235]: segfault at 18 ip 000055ad78c06fd1 sp 00007ffc716d0510 error 4 in fi_poll[55ad78c05000+11000]

      Jun 06 22:05:01 rdma-dev-02.rdma.lab.eng.rdu2.redhat.com kernel: fi_poll[75280]: segfault at 18 ip 0000557945fd4fd1 sp 00007ffef3c7b910 error 4 in fi_poll[557945fd3000+11000]
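
The two kernel messages above can be compared more easily once the instruction pointer is expressed as an offset into the fi_poll mapping and the x86 page-fault error code is decoded. The sketch below does both; the regular expression and the helper names (SEGFAULT_RE, decode_error, analyze) are illustrative, and the bit meanings are the standard x86 page-fault error code bits (bit 0 protection vs. not present, bit 1 write, bit 2 user mode, bit 4 instruction fetch).

```python
import re

# Matches the "segfault at ..." message format seen in this report.
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_error(code):
    """Decode the low bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
        "instruction_fetch": bool(code & 16),
    }


def analyze(line):
    """Return the fault address, mapping-relative offset and decoded error."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "offset": ip - base,  # offset of ip inside the mapping that contains it
        "error": decode_error(int(m["err"], 16)),
    }


# Lines copied from the kernel log in this report.
lines = [
    "Jun 06 22:04:57 rdma-dev-02.rdma.lab.eng.rdu2.redhat.com kernel: fi_poll[75235]: "
    "segfault at 18 ip 000055ad78c06fd1 sp 00007ffc716d0510 error 4 in fi_poll[55ad78c05000+11000]",
    "Jun 06 22:05:01 rdma-dev-02.rdma.lab.eng.rdu2.redhat.com kernel: fi_poll[75280]: "
    "segfault at 18 ip 0000557945fd4fd1 sp 00007ffef3c7b910 error 4 in fi_poll[557945fd3000+11000]",
]

for line in lines:
    result = analyze(line)
    print(result["pid"], hex(result["fault_addr"]), hex(result["offset"]), result["error"])
```

Both crashes resolve to the same offset (0x1fd1) with a user-mode read of a non-present page at address 0x18, which is consistent with, though not proof of, a read through a NULL pointer to a member at offset 0x18 in the same code path. The printed base is the start of the mapping containing ip, not necessarily the load base of the binary, so the offset may need adjusting before it is looked up against debuginfo.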

      Expected results:

      Normal completion without any segfault

      Additional info:

            mschmidt@redhat.com Michal Schmidt
            bchae Brian Chae
            Kamal Heib Kamal Heib
            infiniband-qe