Bug
Resolution: Won't Do
+++ This bug was initially created as a clone of Bug #2056406 +++
Description of problem:
[RHEL9] NVMe/IB: WARNING at kernel/dma/debug.c:570 add_dma_entry+0x3e3/0x530 observed on host side after connect
Version-Release number of selected component (if applicable):
5.14.0-65.el9.x86_64+debug
How reproducible:
100%
Steps to Reproduce:
1.
2.
3.
Actual results:
Expected results:
Additional info:
This can also be reproduced on upstream 5.17.0-rc3.
HW
- lspci | grep -i mel
04:00.0 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
04:00.1 Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
[ 302.279392] nvme nvme0: creating 40 I/O queues.
[ 302.285025] -----------[ cut here ]-----------
[ 302.290974] DMA-API: mlx5_core 0000:04:00.0: cacheline tracking EEXIST, overlapping mappings aren't supported
[ 302.302236] WARNING: CPU: 31 PID: 2271 at kernel/dma/debug.c:570 add_dma_entry+0x3e3/0x530
[ 302.311586] Modules linked in: nvme_rdma nvme_fabrics nvme_core 8021q garp mrp bonding bridge stp llc rfkill rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod ib_iser libiscsi ib_umad scsi_transport_iscsi rdma_cm iw_cm ib_ipoib ib_cm intel_rapl_msr iTCO_wdt iTCO_vendor_support dcdbas intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate intel_uncore pcspkr mxm_wmi lpc_ich mei_me mei mlx5_ib ib_uverbs ib_core ipmi_ssif ipmi_si ipmi_devintf ipmi_msghandler acpi_power_meter xfs libcrc32c sd_mod t10_pi sg mlx5_core mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec mlxfw ahci pci_hyperv_intf drm crc32c_intel libahci tls tg3 libata i2c_algo_bit psample wmi dm_mirror dm_region_hash dm_log dm_mod
[ 302.396245] CPU: 31 PID: 2271 Comm: kworker/u80:11 Tainted: G S --------- --- 5.14.0-65.el9.x86_64+debug #1
[ 302.408747] Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 2.9.1 12/07/2018
[ 302.417230] Workqueue: ib_addr process_one_req [ib_core]
[ 302.423300] RIP: 0010:add_dma_entry+0x3e3/0x530
[ 302.428446] Code: 00 00 4d 8b 74 24 50 4d 85 f6 0f 84 9a 00 00 00 4c 89 e7 e8 1f ad 04 01 48 89 c6 4c 89 f2 48 c7 c7 20 85 2c b4 e8 d9 5b c9 01 <0f> 0b 48 85 ed 0f 85 95 b0 ca 01 8b 05 cc 30 c7 03 85 c0 0f 85 5f
[ 302.449661] RSP: 0018:ffffc9000e6c7820 EFLAGS: 00010286
[ 302.455591] RAX: 0000000000000000 RBX: 00000000ffffffff RCX: 0000000000000000
[ 302.463672] RDX: 0000000000000001 RSI: ffffffffb447fe00 RDI: fffff52001cd8ef6
[ 302.471753] RBP: ffff8881063afd00 R08: 0000000000000001 R09: ffff88a03f5ed787
[ 302.479832] R10: ffffed1407ebdaf0 R11: 0000000000000001 R12: ffff8890880880d0
[ 302.487912] R13: 1ffff92001cd8f06 R14: ffff8890d0e73480 R15: 0000000000000202
[ 302.495995] FS: 0000000000000000(0000) GS:ffff88a03f400000(0000) knlGS:0000000000000000
[ 302.505154] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 302.511668] CR2: 0000152f6c603d34 CR3: 00000013cc42c002 CR4: 00000000001706e0
[ 302.519749] Call Trace:
[ 302.522543] ? check_sync+0x1a60/0x1a60
[ 302.526913] ? debug_dma_map_page+0x24b/0x300
[ 302.533829] dma_map_page_attrs+0xca/0x190
[ 302.540375] nvme_rdma_alloc_qe+0x12b/0x410 [nvme_rdma]
[ 302.548191] nvme_rdma_create_queue_ib+0x4a2/0xa50 [nvme_rdma]
[ 302.556694] nvme_rdma_cm_handler+0x121/0xa4d [nvme_rdma]
[ 302.564649] ? nvme_rdma_create_ctrl+0xa80/0xa80 [nvme_rdma]
[ 302.572853] ? lock_downgrade+0x130/0x130
[ 302.579260] cma_cm_event_handler+0xf2/0x520 [rdma_cm]
[ 302.586917] addr_handler+0x18e/0x2b0 [rdma_cm]
[ 302.593909] ? cma_work_handler+0x1c0/0x1c0 [rdma_cm]
[ 302.601510] ? sched_clock_cpu+0x15/0x1b0
[ 302.607955] ? rcu_read_unlock+0x40/0x40
[ 302.614299] ? cma_work_handler+0x1c0/0x1c0 [rdma_cm]
[ 302.621936] process_one_req+0xe8/0x560 [ib_core]
[ 302.629234] process_one_work+0x8cb/0x1590
[ 302.635812] ? __lock_acquired+0x205/0x890
[ 302.642393] ? pwq_dec_nr_in_flight+0x230/0x230
[ 302.649476] ? __lock_contended+0x980/0x980
[ 302.656193] ? worker_thread+0x157/0x1010
[ 302.662720] worker_thread+0x59b/0x1010
[ 302.669077] ? process_one_work+0x1590/0x1590
[ 302.675969] kthread+0x364/0x420
[ 302.681540] ? _raw_spin_unlock_irq+0x24/0x50
[ 302.688349] ? set_kthread_struct+0x100/0x100
[ 302.695118] ret_from_fork+0x22/0x30
[ 302.700968] irq event stamp: 225303
[ 302.706642] hardirqs last enabled at (225313): [<ffffffffb1f8c357>] __up_console_sem+0x67/0x70
[ 302.718180] hardirqs last disabled at (225322): [<ffffffffb1f8c33c>] __up_console_sem+0x4c/0x70
[ 302.729657] softirqs last enabled at (225250): [<ffffffffb4000621>] __do_softirq+0x621/0x9a4
[ 302.740899] softirqs last disabled at (225245): [<ffffffffb1e08914>] __irq_exit_rcu+0x1f4/0x2a0
[ 302.752290] --[ end trace 55407bd0b4511e67 ]--
[ 302.759055] DMA-API: Mapped at:
[ 302.764108] debug_dma_map_page+0x64/0x300
[ 302.770231] dma_map_page_attrs+0xca/0x190
[ 302.776333] nvme_rdma_alloc_qe+0x12b/0x410 [nvme_rdma]
[ 302.783663] nvme_rdma_create_queue_ib+0x4a2/0xa50 [nvme_rdma]
[ 302.791621] nvme_rdma_cm_handler+0x121/0xa4d [nvme_rdma]
[ 310.190739] DMA-API: dma_debug_entry pool grown to 589824 (900%)
[ 316.840836] nvme nvme0: mapped 40/0/0 default/read/poll queues.
[ 317.019207] nvme nvme0: new ctrl: NQN "testnqn", addr 172.31.0.202:4420
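To check whether a given kernel log contains the same warning, a small log scan can help. The following is only a sketch: the function name `find_dma_overlap_warnings` and both regular expressions are hypothetical, written against the message format shown in the log above. It is not part of any kernel or Red Hat tooling, and other kernel versions may word the message differently.

```python
import re

# Pattern for the DMA-API debug message, modelled on the log in this report
# (assumption: driver name, then PCI address, then the fixed message text).
DMA_WARN = re.compile(
    r"DMA-API: (?P<driver>\S+) (?P<device>[0-9a-fA-F:.]+): "
    r"cacheline tracking EEXIST, overlapping mappings aren't supported"
)

# Pattern for the WARNING line that follows it in the log above.
WARN_AT = re.compile(
    r"WARNING: CPU: \d+ PID: \d+ at (?P<location>\S+) (?P<symbol>\S+)"
)


def find_dma_overlap_warnings(log_text):
    """Return one dict per DMA-API overlap warning found in log_text."""
    hits = []
    current = None
    for line in log_text.splitlines():
        dma = DMA_WARN.search(line)
        if dma:
            # Start a new record for each DMA-API overlap message.
            current = {"driver": dma["driver"], "device": dma["device"]}
            hits.append(current)
            continue
        warn = WARN_AT.search(line)
        if warn and current is not None and "location" not in current:
            # Attach the first WARNING line seen after the DMA-API message.
            current["location"] = warn["location"]
            current["symbol"] = warn["symbol"]
    return hits


if __name__ == "__main__":
    # Two lines copied from the log in this report.
    sample = (
        "[ 302.290974] DMA-API: mlx5_core 0000:04:00.0: cacheline tracking "
        "EEXIST, overlapping mappings aren't supported\n"
        "[ 302.302236] WARNING: CPU: 31 PID: 2271 at kernel/dma/debug.c:570 "
        "add_dma_entry+0x3e3/0x530\n"
    )
    for hit in find_dma_overlap_warnings(sample):
        print(hit)
```

Saved `dmesg` output can be read from a file and passed to the function in the same way.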