Issue Type: Bug
Resolution: Unresolved
Priority: Normal
Affects Version: rhel-9.2.0
Severity: Moderate
Team: rhel-sst-storage-io
Sub-system Group: ssg_filesystems_storage_and_HA
Description of problem: When connecting to a Broadcom software target from the initiator using 'nvme connect-all', I am presented with the following output:
# nvme connect-all
Failed to open ctrl nvme6, errno 11
Failed to open ctrl nvme6, errno 11
Segmentation fault (core dumped)
Below are the log messages from when the issue occurred:
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: nvme[2026]: segfault at 8 ip 000055ce445d7a41 sp 00007ffda0b65360 error 4 in nvme[55ce445d5000+88000]
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: Code: f2 ff ff 48 8d 74 24 40 ba 0a 00 00 00 48 89 df 48 89 44 24 08 e8 2f f9 ff ff 41 89 c6 85 c0 0f 85 74 01 00 00 4c 8b 7c 24 40 <49> 8b 47 08 48 89 04 24 4d 85 e4 74 6a ba 80 01 00 00 be 42 02 00
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: Created slice Slice /system/systemd-coredump.
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: Started Process Core Dump (PID 2027/UID 0).
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd-coredump[2028]: Resource limits disable core dumping for process 2026 (nvme).
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd-coredump[2028]: Process 2026 (nvme) of user 0 dumped core.
Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: systemd-coredump@0-2027-0.service: Deactivated successfully.
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@-device\x3dnone\ttransport\x3dfc\ttraddr\x3dnn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2\ttrsvcid\x3dnone\t-host-traddr\x3dnn-0x20000090fad17933:pn-0x10000090fad17933.service: Deactivated successfully.
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@-device\x3dnone\ttransport\x3dfc\ttraddr\x3dnn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3\ttrsvcid\x3dnone\t-host-traddr\x3dnn-0x20000090fad17933:pn-0x10000090fad17933.service: Deactivated successfully.
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@-device\x3dnone\ttransport\x3dfc\ttraddr\x3dnn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2\ttrsvcid\x3dnone\t-host-traddr\x3dnn-0x20000090fad17934:pn-0x10000090fad17934.service: Deactivated successfully.
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@-device\x3dnone\ttransport\x3dfc\ttraddr\x3dnn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3\ttrsvcid\x3dnone\t-host-traddr\x3dnn-0x20000090fad17934:pn-0x10000090fad17934.service: Deactivated successfully.
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.1: Disconnect LS failed: No Association
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.1: Disconnect LS failed: No Association
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.0: Disconnect LS failed: No Association
Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.0: Disconnect LS failed: No Association
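The systemd-coredump messages above show that resource limits disabled core dumping, so the crash likely left no usable core behind. A minimal sketch for capturing a backtrace on the next reproduction, assuming systemd-coredump and gdb are installed and the debuginfo repositories are enabled:

# Raise the core-size limit in this shell, then reproduce the crash.
ulimit -c unlimited
nvme connect-all

# Install symbols, then inspect the most recent core from the nvme binary.
dnf debuginfo-install -y nvme-cli
coredumpctl info nvme
coredumpctl gdb nvme    # then run "bt full" at the (gdb) prompt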
Despite the crash, the connection to the NVMe namespace appears to be established correctly:
# nvme list
Node                  Generic              SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- -------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme1n1          /dev/ng1n1           a79585895d0a700e     Linux                                    1         1.60 TB / 1.60 TB          4 KiB + 0 B      5.14.0-2
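When triaging a multipath FC setup like this one, the per-path controller state is also worth capturing; a sketch, assuming this nvme-cli build has the list-subsys subcommand and the controllers expose the usual fabrics sysfs attributes:

# Show each subsystem with its controllers (one per FC path).
nvme list-subsys

# Kernel view of every controller's state (live, connecting, deleting, ...).
for c in /sys/class/nvme/nvme*; do
    printf '%s: %s\n' "$c" "$(cat "$c/state" 2>/dev/null)"
done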
Below is my discovery.conf in case it is helpful:
# cat /etc/nvme/discovery.conf
--transport=fc --traddr=nn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2 --host-traddr=nn-0x20000090fad17933:pn-0x10000090fad17933
--transport=fc --traddr=nn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2 --host-traddr=nn-0x20000090fad17934:pn-0x10000090fad17934
--transport=fc --traddr=nn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3 --host-traddr=nn-0x20000090fad17933:pn-0x10000090fad17933
--transport=fc --traddr=nn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3 --host-traddr=nn-0x20000090fad17934:pn-0x10000090fad17934
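Since 'nvme connect-all' walks all four entries, connecting one path at a time may help isolate which association produces the 'errno 11' failures and the crash. A sketch using the first entry above, passing the same flags explicitly:

# Discover and connect a single FC association instead of all four.
nvme connect-all --transport=fc \
    --traddr=nn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2 \
    --host-traddr=nn-0x20000090fad17933:pn-0x10000090fad17933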
Version-Release number of selected component (if applicable):
# rpm -qa nvme-cli
nvme-cli-2.2.1-2.el9.x86_64
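nvme-cli 2.x performs discovery and connect through libnvme (the component updated by the linked RHEL-30641), so its version is worth recording as well; a one-liner, assuming the RHEL 9 package name is libnvme:

# Record both package versions for the bug record.
rpm -qa nvme-cli libnvme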
How reproducible: Often
Steps to Reproduce:
1. Configure /etc/nvme/discovery.conf with the four FC paths shown above.
2. Run 'nvme connect-all' against the Broadcom software target.
Additional info: Below is the job where the issue was first seen:
Is related to:
- RHEL-30641  RHEL9.5: Update libnvme to the latest upstream version (Closed)

Relates to:
- RHEL-25773  RHEL 9.4 NVMe/FC host had nvme process crash due to a segmentation fault during path recovery (Planning)