RHEL-6070

[RHEL9.2] fabtests on CXGB4 T6 device results in many core files

    • Bug
    • Resolution: Done
    • rhel-9.2.0
    • fabtests
    • sst_network_drivers
    • ssg_networking

      Description of problem:

      SIGABRT core files were observed while running fabtests on the RHEL-9.2.0 Beta compose.

      Version-Release number of selected component (if applicable):

      Clients: rdma-dev-12
      + [23-03-11 08:56:16] echo 'Servers: rdma-perf-07'
      Servers: rdma-perf-07
      + [23-03-11 08:56:16] RQA_system_info_for_debug
      + [23-03-11 08:56:16] grep -i distro /etc/motd
      + [23-03-11 08:56:16] tr -d ' '
      DISTRO=RHEL-9.2.0-20230309.10
      DISTRO=RHEL-9.2.0-20230309.10
      + [23-03-11 08:56:17] cat /etc/redhat-release
      Red Hat Enterprise Linux release 9.2 Beta (Plow)
      + [23-03-11 08:56:17] uname -a
      Linux rdma-perf-07.rdma.lab.eng.rdu2.redhat.com 5.14.0-284.el9.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Feb 27 20:08:54 EST 2023 x86_64 x86_64 x86_64 GNU/Linux
      + [23-03-11 08:56:17] cat /proc/cmdline
      BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.0-284.el9.x86_64 root=UUID=0ff33732-708d-4764-be0f-7b01b19b4488 ro crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M resume=UUID=5b1f6509-8085-4849-ac79-a928a2aec479 console=ttyS0,115200n81
      + [23-03-11 08:56:17] rpm -q rdma-core linux-firmware
      rdma-core-44.0-2.el9.x86_64
      linux-firmware-20230210-132.el9.noarch
      + [23-03-11 08:56:17] tail /sys/class/infiniband/bnxt_re0/fw_ver /sys/class/infiniband/bnxt_re1/fw_ver /sys/class/infiniband/cxgb4_0/fw_ver /sys/class/infiniband/hfi1_0/fw_ver /sys/class/infiniband/mlx5_0/fw_ver /sys/class/infiniband/mlx5_1/fw_ver
      ==> /sys/class/infiniband/bnxt_re0/fw_ver <==
      214.0.189.0

      ==> /sys/class/infiniband/bnxt_re1/fw_ver <==
      214.0.189.0

      ==> /sys/class/infiniband/cxgb4_0/fw_ver <==
      1.27.1.0

      ==> /sys/class/infiniband/hfi1_0/fw_ver <==
      1.27.0

      ==> /sys/class/infiniband/mlx5_0/fw_ver <==
      16.24.1000

      ==> /sys/class/infiniband/mlx5_1/fw_ver <==
      16.24.1000
      + [23-03-11 08:56:17] grep -i -e ethernet -e infiniband -e omni -e ConnectX
      + [23-03-11 08:56:17] lspci
      01:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
      01:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 Gigabit Ethernet PCIe
      19:00.0 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01)
      19:00.1 Ethernet controller: Broadcom Inc. and subsidiaries BCM57412 NetXtreme-E 10Gb RDMA Ethernet Controller (rev 01)
      5e:00.0 Ethernet controller: Chelsio Communications Inc T62100-LP-CR Unified Wire Ethernet Controller
      5e:00.1 Ethernet controller: Chelsio Communications Inc T62100-LP-CR Unified Wire Ethernet Controller
      5e:00.2 Ethernet controller: Chelsio Communications Inc T62100-LP-CR Unified Wire Ethernet Controller
      5e:00.3 Ethernet controller: Chelsio Communications Inc T62100-LP-CR Unified Wire Ethernet Controller
      5e:00.4 Ethernet controller: Chelsio Communications Inc T62100-LP-CR Unified Wire Ethernet Controller
      af:00.0 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
      af:00.1 Infiniband controller: Mellanox Technologies MT27800 Family [ConnectX-5]
      d8:00.0 Fabric controller: Intel Corporation Omni-Path HFI Silicon 100 Series [discrete] (rev 11)

      Installed:
      fabtests-1.17.0-2.el9.x86_64
      python3-attrs-20.3.0-7.el9.noarch
      python3-iniconfig-1.1.1-7.el9.noarch
      python3-packaging-20.9-5.el9.noarch
      python3-pluggy-0.13.1-7.el9.noarch
      python3-py-1.10.0-6.el9.noarch
      python3-pyparsing-2.4.7-9.el9.noarch
      python3-pytest-6.2.2-6.el9.noarch
      python3-toml-0.10.2-6.el9.noarch
      ruby-3.0.4-160.el9_0.x86_64
      ruby-default-gems-3.0.4-160.el9_0.noarch
      ruby-libs-3.0.4-160.el9_0.x86_64
      rubygem-bigdecimal-3.0.0-160.el9_0.x86_64
      rubygem-bundler-2.2.33-160.el9_0.noarch
      rubygem-io-console-0.5.7-160.el9_0.x86_64
      rubygem-json-2.5.1-160.el9_0.x86_64
      rubygem-psych-3.3.2-160.el9_0.x86_64
      rubygem-rdoc-6.3.3-160.el9_0.noarch
      rubygems-3.2.33-160.el9_0.noarch

      How reproducible:

      100%

      Steps to Reproduce:
      1. Run fabtests on the CXGB4 IW (iWARP) device with the above packages.

      Actual results:

      The following cores were created on rdma-perf-07 (RDMA server host); none were created on rdma-dev-12 (RDMA client host).

      TIME PID UID GID SIG COREFILE EXE SIZE
      Sat 2023-03-11 09:59:43 EST 91423 0 0 SIGABRT present /usr/bin/fi_multi_ep 717.6K
      Sat 2023-03-11 09:59:59 EST 91638 0 0 SIGABRT present /usr/bin/fi_unexpected_msg 275.5K
      Sat 2023-03-11 10:04:02 EST 94475 0 0 SIGABRT present /usr/bin/fi_multinode 276.8K
      Sat 2023-03-11 10:04:10 EST 94542 0 0 SIGABRT present /usr/bin/fi_multinode 276.1K
      Sat 2023-03-11 10:04:23 EST 94746 0 0 SIGABRT present /usr/bin/fi_av_xfer 274.5K
      Sat 2023-03-11 10:04:43 EST 94855 0 0 SIGABRT present /usr/bin/fi_av_xfer 274.9K
      Sat 2023-03-11 10:05:08 EST 95047 0 0 SIGABRT present /usr/bin/fi_cq_data 275.0K
      Sat 2023-03-11 10:05:29 EST 95155 0 0 SIGABRT present /usr/bin/fi_cq_data 273.6K
      Sat 2023-03-11 10:05:50 EST 95207 0 0 SIGABRT present /usr/bin/fi_dgram 273.8K
      Sat 2023-03-11 10:06:10 EST 95316 0 0 SIGABRT present /usr/bin/fi_dgram_waitset 274.7K
      Sat 2023-03-11 10:06:36 EST 95465 0 0 SIGABRT present /usr/bin/fi_dgram_pingpong 306.4K
      Sat 2023-03-11 10:06:56 EST 95563 0 0 SIGABRT present /usr/bin/fi_multi_mr 307.5K
      Sat 2023-03-11 10:07:35 EST 96098 0 0 SIGABRT present /usr/bin/fi_msg_inject 305.1K
      Sat 2023-03-11 10:07:56 EST 96147 0 0 SIGABRT present /usr/bin/fi_msg_inject 307.1K
      Sat 2023-03-11 10:08:16 EST 96252 0 0 SIGABRT present /usr/bin/fi_msg_inject 306.9K
      Sat 2023-03-11 10:08:36 EST 96359 0 0 SIGABRT present /usr/bin/fi_msg_inject 304.4K
      Sat 2023-03-11 10:08:56 EST 96408 0 0 SIGABRT present /usr/bin/fi_multi_recv 275.6K
      Sat 2023-03-11 10:09:19 EST 96555 0 0 SIGABRT present /usr/bin/fi_poll 273.4K
      Sat 2023-03-11 10:09:39 EST 96662 0 0 SIGABRT present /usr/bin/fi_poll 273.8K
      Sat 2023-03-11 10:10:00 EST 96771 0 0 SIGABRT present /usr/bin/fi_rdm 273.9K
      Sat 2023-03-11 10:10:22 EST 96822 0 0 SIGABRT present /usr/bin/fi_bw 308.3K
      Sat 2023-03-11 10:10:42 EST 96929 0 0 SIGABRT present /usr/bin/fi_rdm_atomic 305.9K
      Sat 2023-03-11 10:11:07 EST 97118 0 0 SIGABRT present /usr/bin/fi_rdm_pingpong 306.6K
      Sat 2023-03-11 10:11:27 EST 97169 0 0 SIGABRT present /usr/bin/fi_rdm_pingpong 308.1K
      Sat 2023-03-11 10:11:47 EST 97276 0 0 SIGABRT present /usr/bin/fi_rdm_pingpong 307.6K
      Sat 2023-03-11 10:12:10 EST 97424 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_pingpong 306.1K
      Sat 2023-03-11 10:12:30 EST 97533 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_bw 306.6K
      Sat 2023-03-11 10:12:53 EST 97623 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_pingpong 307.2K
      Sat 2023-03-11 10:13:13 EST 97730 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_bw 305.7K
      Sat 2023-03-11 10:13:33 EST 97838 0 0 SIGABRT present /usr/bin/fi_rdm_pingpong 306.7K
      Sat 2023-03-11 10:13:54 EST 97887 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_pingpong 306.7K
      Sat 2023-03-11 10:14:14 EST 97996 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_bw 307.6K
      Sat 2023-03-11 10:14:33 EST 98103 0 0 SIGABRT present /usr/bin/fi_rdm 274.0K
      Sat 2023-03-11 10:14:54 EST 98154 0 0 SIGABRT present /usr/bin/fi_bw 309.1K
      Sat 2023-03-11 10:15:15 EST 98259 0 0 SIGABRT present /usr/bin/fi_rdm_atomic 304.4K
      Sat 2023-03-11 10:15:35 EST 98365 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_pingpong 306.4K
      Sat 2023-03-11 10:15:56 EST 98415 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_bw 308.2K
      Sat 2023-03-11 10:16:15 EST 98521 0 0 SIGABRT present /usr/bin/fi_rdm_rma_event 275.8K
      Sat 2023-03-11 10:16:36 EST 98628 0 0 SIGABRT present /usr/bin/fi_rdm_rma_trigger 274.7K
      Sat 2023-03-11 10:16:55 EST 98678 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_peek 274.5K
      Sat 2023-03-11 10:17:16 EST 98785 0 0 SIGABRT present /usr/bin/fi_rdm_shared_av 274.2K
      Sat 2023-03-11 10:18:00 EST 98993 0 0 SIGABRT present /usr/bin/fi_recv_cancel 306.4K
      Sat 2023-03-11 10:18:26 EST 99168 0 0 SIGABRT present /usr/bin/fi_rma_bw 305.8K
      Sat 2023-03-11 10:18:46 EST 99274 0 0 SIGABRT present /usr/bin/fi_rma_bw 307.1K
      Sat 2023-03-11 10:19:07 EST 99380 0 0 SIGABRT present /usr/bin/fi_rma_bw 305.8K
      Sat 2023-03-11 10:19:33 EST 99611 0 0 SIGABRT present /usr/bin/fi_rma_bw 306.2K
      Sat 2023-03-11 10:19:53 EST 99661 0 0 SIGABRT present /usr/bin/fi_rma_bw 305.4K
      Sat 2023-03-11 10:20:14 EST 99768 0 0 SIGABRT present /usr/bin/fi_rma_bw 305.9K
      Sat 2023-03-11 10:20:45 EST 99959 0 0 SIGABRT present /usr/bin/fi_unexpected_msg 273.8K
      Sat 2023-03-11 14:44:17 EST 227127 0 0 SIGABRT present /usr/bin/fi_multi_ep 715.5K
      Sat 2023-03-11 14:44:33 EST 227310 0 0 SIGABRT present /usr/bin/fi_unexpected_msg 275.7K
      Sat 2023-03-11 14:48:33 EST 229542 0 0 SIGABRT present /usr/bin/fi_multinode 275.7K
      Sat 2023-03-11 14:48:41 EST 229587 0 0 SIGABRT present /usr/bin/fi_multinode 275.5K
      Sat 2023-03-11 14:48:54 EST 229799 0 0 SIGABRT present /usr/bin/fi_av_xfer 272.6K
      Sat 2023-03-11 14:49:14 EST 229851 0 0 SIGABRT present /usr/bin/fi_av_xfer 273.6K
      Sat 2023-03-11 14:49:39 EST 230029 0 0 SIGABRT present /usr/bin/fi_cq_data 274.5K
      Sat 2023-03-11 14:49:59 EST 230133 0 0 SIGABRT present /usr/bin/fi_cq_data 274.9K
      Sat 2023-03-11 14:50:20 EST 230238 0 0 SIGABRT present /usr/bin/fi_dgram 273.6K
      Sat 2023-03-11 14:50:41 EST 230287 0 0 SIGABRT present /usr/bin/fi_dgram_waitset 276.9K
      Sat 2023-03-11 14:51:06 EST 230426 0 0 SIGABRT present /usr/bin/fi_dgram_pingpong 306.5K
      Sat 2023-03-11 14:51:27 EST 230568 0 0 SIGABRT present /usr/bin/fi_multi_mr 307.2K
      Sat 2023-03-11 14:52:06 EST 231004 0 0 SIGABRT present /usr/bin/fi_msg_inject 306.7K
      Sat 2023-03-11 14:52:26 EST 231108 0 0 SIGABRT present /usr/bin/fi_msg_inject 305.3K
      Sat 2023-03-11 14:52:46 EST 231156 0 0 SIGABRT present /usr/bin/fi_msg_inject 305.1K
      Sat 2023-03-11 14:53:07 EST 231261 0 0 SIGABRT present /usr/bin/fi_msg_inject 305.6K
      Sat 2023-03-11 14:53:27 EST 231363 0 0 SIGABRT present /usr/bin/fi_multi_recv 274.6K
      Sat 2023-03-11 14:53:49 EST 231505 0 0 SIGABRT present /usr/bin/fi_poll 273.8K
      Sat 2023-03-11 14:54:09 EST 231552 0 0 SIGABRT present /usr/bin/fi_poll 274.7K
      Sat 2023-03-11 14:54:30 EST 231658 0 0 SIGABRT present /usr/bin/fi_rdm 273.8K
      Sat 2023-03-11 14:54:52 EST 231762 0 0 SIGABRT present /usr/bin/fi_bw 308.9K
      Sat 2023-03-11 14:55:12 EST 231808 0 0 SIGABRT present /usr/bin/fi_rdm_atomic 304.7K
      Sat 2023-03-11 14:55:37 EST 231986 0 0 SIGABRT present /usr/bin/fi_rdm_pingpong 307.2K
      Sat 2023-03-11 14:55:57 EST 232089 0 0 SIGABRT present /usr/bin/fi_rdm_pingpong 307.0K
      Sat 2023-03-11 14:56:17 EST 232139 0 0 SIGABRT present /usr/bin/fi_rdm_pingpong 308.0K
      Sat 2023-03-11 14:56:40 EST 232282 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_pingpong 306.4K
      Sat 2023-03-11 14:57:00 EST 232383 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_bw 305.6K
      Sat 2023-03-11 14:57:22 EST 232525 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_pingpong 307.0K
      Sat 2023-03-11 14:57:43 EST 232571 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_bw 307.1K
      Sat 2023-03-11 14:58:03 EST 232674 0 0 SIGABRT present /usr/bin/fi_rdm_pingpong 307.6K
      Sat 2023-03-11 14:58:24 EST 232779 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_pingpong 305.8K
      Sat 2023-03-11 14:58:44 EST 232826 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_bw 306.6K
      Sat 2023-03-11 14:59:03 EST 232930 0 0 SIGABRT present /usr/bin/fi_rdm 275.2K
      Sat 2023-03-11 14:59:24 EST 233034 0 0 SIGABRT present /usr/bin/fi_bw 309.6K
      Sat 2023-03-11 14:59:45 EST 233082 0 0 SIGABRT present /usr/bin/fi_rdm_atomic 305.6K
      Sat 2023-03-11 15:00:05 EST 233185 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_pingpong 307.1K
      Sat 2023-03-11 15:00:26 EST 233288 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_bw 307.6K
      Sat 2023-03-11 15:00:45 EST 233336 0 0 SIGABRT present /usr/bin/fi_rdm_rma_event 273.0K
      Sat 2023-03-11 15:01:06 EST 233438 0 0 SIGABRT present /usr/bin/fi_rdm_rma_trigger 276.0K
      Sat 2023-03-11 15:01:25 EST 233552 0 0 SIGABRT present /usr/bin/fi_rdm_tagged_peek 273.7K
      Sat 2023-03-11 15:01:46 EST 233601 0 0 SIGABRT present /usr/bin/fi_rdm_shared_av 274.8K
      Sat 2023-03-11 15:02:29 EST 233802 0 0 SIGABRT present /usr/bin/fi_recv_cancel 305.5K
      Sat 2023-03-11 15:02:56 EST 234013 0 0 SIGABRT present /usr/bin/fi_rma_bw 305.3K
      Sat 2023-03-11 15:03:16 EST 234060 0 0 SIGABRT present /usr/bin/fi_rma_bw 305.7K
      Sat 2023-03-11 15:03:36 EST 234163 0 0 SIGABRT present /usr/bin/fi_rma_bw 305.4K
      Sat 2023-03-11 15:04:03 EST 234376 0 0 SIGABRT present /usr/bin/fi_rma_bw 307.3K
      Sat 2023-03-11 15:04:23 EST 234481 0 0 SIGABRT present /usr/bin/fi_rma_bw 305.5K
      Sat 2023-03-11 15:04:43 EST 234526 0 0 SIGABRT present /usr/bin/fi_rma_bw 305.5K
      Sat 2023-03-11 15:05:14 EST 234700 0 0 SIGABRT present /usr/bin/fi_unexpected_msg 273.9K
      total 30224
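
      For triage, a listing like the one above can be tallied per test binary. The following is a minimal sketch and not part of the original report: it assumes each row follows the column layout shown above (TIME PID UID GID SIG COREFILE EXE SIZE), and the names `ROW` and `summarize_cores` are hypothetical. The sample input is copied from the first rows of the listing.

```python
import re
from collections import Counter

# One row of the listing: weekday, date, time, timezone, PID, UID, GID,
# signal, corefile state, executable path, size.
# Assumption: rows look exactly like the listing in this report.
ROW = re.compile(
    r"^\s*\w{3} (\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) \w+ "
    r"(\d+) (\d+) (\d+) (SIG\w+) (\S+) (\S+) (\S+)\s*$"
)


def summarize_cores(listing: str) -> Counter:
    """Count core files per executable basename; non-row lines are skipped."""
    counts = Counter()
    for line in listing.splitlines():
        match = ROW.match(line)
        if match is None:
            # Header line, "total ..." line, or blank line.
            continue
        exe_path = match.group(8)
        counts[exe_path.rsplit("/", 1)[-1]] += 1
    return counts


# Sample rows copied from the listing above.
sample = """\
TIME PID UID GID SIG COREFILE EXE SIZE
Sat 2023-03-11 09:59:43 EST 91423 0 0 SIGABRT present /usr/bin/fi_multi_ep 717.6K
Sat 2023-03-11 09:59:59 EST 91638 0 0 SIGABRT present /usr/bin/fi_unexpected_msg 275.5K
Sat 2023-03-11 10:04:02 EST 94475 0 0 SIGABRT present /usr/bin/fi_multinode 276.8K
Sat 2023-03-11 10:04:10 EST 94542 0 0 SIGABRT present /usr/bin/fi_multinode 276.1K
total 30224
"""

if __name__ == "__main__":
    for exe, count in summarize_cores(sample).most_common():
        print(f"{count:3d} {exe}")
```

      Feeding the full listing to `summarize_cores` would show which fabtests binaries aborted most often across the two runs, which may help decide which core to inspect first.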

      Expected results:

      No core dumps, as was the case on the client host, rdma-dev-12:

      ######Start to check if there's coredump######
      There's no core dump.
      ######End to check coredump######

      Additional info:

            Michal Schmidt (mschmidt@redhat.com)
            Brian Chae (bchae)
            Kamal Heib
            infiniband-qe