OpenShift Bugs / OCPBUGS-14172

SR-IOV operator cannot initialize MT27710 ConnectX-4 NIC with Unable to get device mode error message

Details

    • -
    • Important
    • No
    • CNF Network Sprint 237
    • 1
    • False

    Description

      Description of problem:

      The SR-IOV operator is not initializing the MT27710 ConnectX-4 NICs attached to a few of the customer's nodes.
      
      The config daemon logs show:
      
      DiscoverSriovDevices(): unable to get device mode 0000:01:00.0 "no such device"
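
One way to probe the same device by hand is sketched below. It assumes that "device mode" in the log refers to the devlink eswitch mode (this report does not confirm that) and that iproute2's `devlink` tool is available on the node. The helper names `devlink_handle` and `show_eswitch_mode` are hypothetical.

```shell
# Hypothetical helper: turn a PCI address into a devlink handle.
# Accepts "03:00.0" or "0000:03:00.0"; assumes PCI domain 0000 when omitted.
devlink_handle() {
    case "$1" in
        *:*:*) printf 'pci/%s\n' "$1" ;;        # domain already present
        *)     printf 'pci/0000:%s\n' "$1" ;;   # prepend the assumed domain
    esac
}

# Ask devlink for the eswitch mode of the device.
# If the device is not registered with devlink, devlink prints its own error.
show_eswitch_mode() {
    devlink dev eswitch show "$(devlink_handle "$1")"
}
```

For the address in the log line above, this would be called as `show_eswitch_mode 0000:01:00.0` on the affected node.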

      Version-Release number of selected component (if applicable):

      SR-IOV on OCP v4.12

      How reproducible:

      Occurring in the customer environment

      Steps to Reproduce:

      1.
      2.
      3.
      

      Actual results:

      The NIC is visible as attached, but no VFs are up:
      
      03:00.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
          Subsystem: Mellanox Technologies Device [15b3:0009]
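
To confirm the "no VFs" state directly on a node, the kernel's PCI sysfs attributes `sriov_totalvfs` and `sriov_numvfs` can be read. The following is a minimal sketch; the function name `sriov_vf_counts` and its optional second argument (an alternate sysfs root, useful for dry runs) are assumptions for illustration, not something taken from this report.

```shell
# sriov_vf_counts ADDR [SYSFS_ROOT]
# ADDR is a full PCI address including the domain, e.g. 0000:03:00.0.
# Reports how many VFs are enabled, or why that cannot be determined.
sriov_vf_counts() {
    addr=$1
    root=${2:-/sys}
    dev="$root/bus/pci/devices/$addr"

    if [ ! -d "$dev" ]; then
        echo "$addr: no such PCI device"
        return 1
    fi
    # The kernel only creates sriov_totalvfs when the device
    # advertises the SR-IOV capability.
    if [ ! -r "$dev/sriov_totalvfs" ]; then
        echo "$addr: SR-IOV capability not exposed"
        return 2
    fi
    echo "$addr: $(cat "$dev/sriov_numvfs") of $(cat "$dev/sriov_totalvfs") VFs enabled"
}
```

On an affected node this could be run as `sriov_vf_counts 0000:03:00.0` for each Mellanox port.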

      Expected results:

      VFs should be up as defined in the SriovNetworkNodePolicy object

      Additional info:

      One suspicious thing here is that the SR-IOV capability is not listed for this NIC, although it is listed for the Intel one. The customer has enabled the required settings in UEFI/BIOS.
      
      $ egrep -E "SR-IOV|Ether" lspci
      01:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
      	Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
      01:00.1 Ethernet controller [0200]: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 [8086:1528] (rev 01)
      	Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
      03:00.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
      03:00.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
      82:00.0 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
      82:00.1 Ethernet controller [0200]: Mellanox Technologies MT27710 Family [ConnectX-4 Lx] [15b3:1015]
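
The comparison above can be automated across many nodes. The sketch below assumes a saved verbose `lspci` dump in which each device header starts at column 0 with its bus address and capability lines are indented beneath it (the same layout as the output above). The function name `list_no_sriov` is hypothetical.

```shell
# list_no_sriov FILE
# Print the address of every Ethernet controller in a saved lspci dump
# that has no "Single Root I/O Virtualization" capability line.
list_no_sriov() {
    awk '
        # Emit the previous Ethernet device if no SR-IOV line was seen for it.
        function report() { if (dev != "" && !has) print dev }

        # Device header lines start with a hex digit (the bus address).
        /^[0-9a-fA-F]/ {
            report()
            if ($0 ~ /Ethernet controller/) { dev = $1; has = 0 }
            else { dev = "" }   # not an Ethernet device, ignore its capabilities
            next
        }

        /Single Root I\/O Virtualization/ { has = 1 }

        END { report() }
    ' "$1"
}
```

Run against the file used in the command above (`list_no_sriov lspci`), this lists the Ethernet ports for which the dump shows no SR-IOV capability.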

          People

            sscheink@redhat.com Sebastian Scheinkman
            rhn-support-adubey Akash Dubey
            Zhanqi Zhao Zhanqi Zhao