RHEL-8056: nvme-cli - 2748 Segmentation fault (core dumped) nvme connect-all


      Description of problem:

      When connecting to a Broadcom software target from the initiator using 'nvme connect-all', I am presented with the following output:

      # nvme connect-all
      Failed to open ctrl nvme6, errno 11
      Failed to open ctrl nvme6, errno 11
      Segmentation fault (core dumped)

      Below are the log messages when the issue occurred:

      Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: nvme[2026]: segfault at 8 ip 000055ce445d7a41 sp 00007ffda0b65360 error 4 in nvme[55ce445d5000+88000]
      Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: Code: f2 ff ff 48 8d 74 24 40 ba 0a 00 00 00 48 89 df 48 89 44 24 08 e8 2f f9 ff ff 41 89 c6 85 c0 0f 85 74 01 00 00 4c 8b 7c 24 40 <49> 8b 47 08 48 89 04 24 4d 85 e4 74 6a ba 80 01 00 00 be 42 02 00
      Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: Created slice Slice /system/systemd-coredump.
      Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: Started Process Core Dump (PID 2027/UID 0).
      Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd-coredump[2028]: Resource limits disable core dumping for process 2026 (nvme).
      Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd-coredump[2028]: Process 2026 (nvme) of user 0 dumped core.
      Mar 11 11:12:51 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: systemd-coredump@0-2027-0.service: Deactivated successfully.
      Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@-device\x3dnone\ttransport\x3dfc\ttraddr\x3dnn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2\ttrsvcid\x3dnone\t-host-traddr\x3dnn-0x20000090fad17933:pn-0x10000090fad17933.service: Deactivated successfully.
      Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@-device\x3dnone\ttransport\x3dfc\ttraddr\x3dnn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3\ttrsvcid\x3dnone\t-host-traddr\x3dnn-0x20000090fad17933:pn-0x10000090fad17933.service: Deactivated successfully.
      Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@-device\x3dnone\ttransport\x3dfc\ttraddr\x3dnn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2\ttrsvcid\x3dnone\t-host-traddr\x3dnn-0x20000090fad17934:pn-0x10000090fad17934.service: Deactivated successfully.
      Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com systemd[1]: nvmf-connect@-device\x3dnone\ttransport\x3dfc\ttraddr\x3dnn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3\ttrsvcid\x3dnone\t-host-traddr\x3dnn-0x20000090fad17934:pn-0x10000090fad17934.service: Deactivated successfully.
      Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.1: Disconnect LS failed: No Association
      Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.1: Disconnect LS failed: No Association
      Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.0: Disconnect LS failed: No Association
      Mar 11 11:13:07 storageqe-28.sqe.lab.eng.bos.redhat.com kernel: lpfc 0000:04:00.0: Disconnect LS failed: No Association
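
      The journal notes that resource limits disabled core dumping for the crashing process, so it may help to raise the core limit before reproducing and then pull a backtrace. A sketch using the standard systemd-coredump tooling on RHEL 9:

      Allow core files in the current shell, then reproduce:
      # ulimit -c unlimited
      # nvme connect-all

      Locate and open the dump collected by systemd-coredump:
      # coredumpctl list nvme
      # coredumpctl gdb nvme
      (gdb) bt full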

      Despite the crash, the connection to the NVMe namespace appears to be established correctly:

      # nvme list
      Node                  Generic               SN                   Model                                    Namespace Usage                      Format           FW Rev
      --------------------- --------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
      /dev/nvme1n1          /dev/ng1n1            a79585895d0a700e     Linux                                    1         1.60 TB /   1.60 TB        4 KiB +  0 B     5.14.0-2
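
      Since "Failed to open ctrl nvme6, errno 11" (EAGAIN) suggests a controller that was still in a transitional state when nvme-cli tried to open it, it may also be worth recording the controller states right after the crash. A quick check, assuming the fabrics controllers expose the usual sysfs 'state' attribute:

      # nvme list-subsys
      # cat /sys/class/nvme/nvme6/state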

      Below is my discovery.conf in case it is helpful:

      # cat /etc/nvme/discovery.conf
        --transport=fc --traddr=nn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2 --host-traddr=nn-0x20000090fad17933:pn-0x10000090fad17933
        --transport=fc --traddr=nn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2 --host-traddr=nn-0x20000090fad17934:pn-0x10000090fad17934
        --transport=fc --traddr=nn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3 --host-traddr=nn-0x20000090fad17933:pn-0x10000090fad17933
        --transport=fc --traddr=nn-0x20000090fac7b6c3:pn-0x10000090fac7b6c3 --host-traddr=nn-0x20000090fad17934:pn-0x10000090fad17934
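
      To narrow down whether the crash depends on a particular association or only occurs when iterating over several discovery entries, each line can be replayed individually; 'nvme discover' and 'nvme connect-all' both accept the same options directly on the command line (shown here with the first entry from the file above):

      # nvme discover --transport=fc --traddr=nn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2 --host-traddr=nn-0x20000090fad17933:pn-0x10000090fad17933
      # nvme connect-all --transport=fc --traddr=nn-0x20000090fac7b6c2:pn-0x10000090fac7b6c2 --host-traddr=nn-0x20000090fad17933:pn-0x10000090fad17933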

      Version-Release number of selected component (if applicable):

      # rpm -qa nvme-cli
      nvme-cli-2.2.1-2.el9.x86_64
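
      The kernel segfault line above gives both the instruction pointer and the binary's mapping base, so the faulting offset inside the nvme binary is 0x55ce445d7a41 - 0x55ce445d5000 = 0x2a41. Once debug symbols are available, that offset can be resolved to a source line; a sketch, assuming the debuginfo repositories are enabled and the binary is at its usual RHEL path /usr/sbin/nvme:

      # dnf debuginfo-install nvme-cli
      # eu-addr2line -e /usr/sbin/nvme 0x2a41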

      How reproducible: Often

      Steps to Reproduce:
      1. With the discovery entries above in /etc/nvme/discovery.conf, run 'nvme connect-all' (see the output above).

      Additional info: Below is the job where the issue was first seen:

      https://beaker.engineering.redhat.com/jobs/7613610
