RHEL-55291

[RHEL 9.5] NetworkManager-dispatcher.service not spawned upon link down/up event

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • rhel-9.5
    • nvme-cli
    • nvme-cli-2.9.1-6.el9
    • No
    • Important
    • sst_storage_io
    • ssg_filesystems_storage_and_HA
    • 3
    • Dev ack
    • False
    • None
    • Red Hat Enterprise Linux
    • None
    • Pass
    • None
    • x86_64
    • None

      NetworkManager-1.48.6-1.el9
      kernel-5.14.0-493.el9

      0d:00.0 Ethernet controller: Intel Corporation Ethernet Controller E810-C for QSFP (rev 02)
      0d:00.1 Ethernet controller: Intel Corporation Ethernet Controller E810-C for QSFP (rev 02)
      

      We have placed our custom hook in /etc/NetworkManager/dispatcher.d/. It runs on other deployments with other NICs, but it is never invoked on this system with the 100 Gbps Intel NIC.
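
To check whether the dispatcher invokes anything at all for these interfaces, a minimal logging hook can be placed next to the custom one. This is only a sketch, not the hook from this report: the file name `99-log-events` and the log tag `nm-hook-debug` are made up for illustration. NetworkManager passes the interface name as the first argument and the action as the second.

```shell
#!/bin/sh
# Hypothetical debug hook, e.g. /etc/NetworkManager/dispatcher.d/99-log-events
# (file name and log tag are assumptions, not taken from this report).
# The dispatcher only runs scripts that are root-owned, executable and
# not writable by group or others.

# Build the one-line message that is logged for an event.
format_event() {
    printf 'iface=%s action=%s' "${1:-none}" "${2:-none}"
}

# The dispatcher calls the script as: <script> <interface> <action>
if [ "$#" -ge 2 ]; then
    logger -t nm-hook-debug "$(format_event "$1" "$2")"
fi
```

If the hook is called, `journalctl -t nm-hook-debug` should show one line per dispatched event; no output after a link flap would suggest that no dispatcher action was emitted at all.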

      The kernel ice driver appears to register link events correctly:

      Aug 20 09:45:42 localhost.localdomain kernel: ice 0000:0d:00.0 nbft0: NIC Link is up 100 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, Autoneg Advertised: Off, Autoneg Negotiated: False, Flow Control: None
      Aug 20 09:54:47 rhel-storage-108.fast.eng.rdu2.dc.redhat.com kernel: ice 0000:0d:00.1 nbft1: NIC Link is up 100 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, Autoneg Advertised: Off, Autoneg Negotiated: False, Flow Control: None
      

      NetworkManager also registers the link events, but no dispatcher script is ever called:

      Aug 20 09:52:15 rhel-storage-108.fast.eng.rdu2.dc.redhat.com kernel: ice 0000:0d:00.1 nbft1: NIC Link is Down
      Aug 20 09:52:15 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <trace> [1724161935.9750] platform-linux: event-notification: RTM_NEWLINK, flags 0, seq 0: 7: nbft1 <UP;broadcast,multicast,up> mtu 1500 arp 1 ethernet? not-init addrgenmode none addr 40:A6:B7:C0:8A:C9 permaddr 40:A6:B7:C0:8A:C9 brd FF:FF:FF:FF:FF:FF tx-queue-len 1000 gso-max-size 65536 gso-max-segs 65535 gro-max-size 65536 rx:68138,100349838 tx:10895,7148466
      Aug 20 09:52:15 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <debug> [1724161935.9751] platform: (nbft1) signal: link changed: 7: nbft1 <UP;broadcast,multicast,up> mtu 1500 arp 1 ethernet? init addrgenmode none addr 40:A6:B7:C0:8A:C9 permaddr 40:A6:B7:C0:8A:C9 brd FF:FF:FF:FF:FF:FF driver ice tx-queue-len 1000 gso-max-size 65536 gso-max-segs 65535 gro-max-size 65536 rx:68138,100349838 tx:10895,7148466
      Aug 20 09:52:15 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <trace> [1724161935.9751] l3cfg[0ef7a3d0b0516dff,ifindex=7]: emit signal (platform-change, obj-type=link, change=changed, obj=7: nbft1 <UP;broadcast,multicast,up> mtu 1500 arp 1 ethernet? init addrgenmode none addr 40:A6:B7:C0:8A:C9 permaddr 40:A6:B7:C0:8A:C9 brd FF:FF:FF:FF:FF:FF driver ice tx-queue-len 1000 gso-max-size 65536 gso-max-segs 65535 gro-max-size 65536 rx:68138,100349838 tx:10895,7148466)
      Aug 20 09:52:15 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <debug> [1724161935.9751] device[32f331c7eb9e55f2] (nbft1): queued link change for ifindex 7
      ...
      Aug 20 09:52:15 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <debug> [1724161935.9755] device[32f331c7eb9e55f2] (nbft1): carrier: link disconnected (deferring action for 6000 milliseconds)
      Aug 20 09:52:21 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <debug> [1724161941.9765] device[32f331c7eb9e55f2] (nbft1): carrier: link disconnected (calling deferred action)
      

      (These are all the events logged for nbft1; the system then continues with unrelated activity.)

      Link up:

      Aug 20 09:54:47 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <trace> [1724162087.6673] platform-linux: event-notification: RTM_NEWLINK, flags 0, seq 0: 7: nbft1 <UP,LOWER_UP;broadcast,multicast,up,running,lowerup> mtu 1500 arp 1 ethernet? not-init addrgenmode none addr 40:A6:B7:C0:8A:C9 permaddr 40:A6:B7:C0:8A:C9 brd FF:FF:FF:FF:FF:FF tx-queue-len 1000 gso-max-size 65536 gso-max-segs 65535 gro-max-size 65536 rx:68138,100349838 tx:10895,7148466
      Aug 20 09:54:47 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <debug> [1724162087.6673] platform: (nbft1) signal: link changed: 7: nbft1 <UP,LOWER_UP;broadcast,multicast,up,running,lowerup> mtu 1500 arp 1 ethernet? init addrgenmode none addr 40:A6:B7:C0:8A:C9 permaddr 40:A6:B7:C0:8A:C9 brd FF:FF:FF:FF:FF:FF driver ice tx-queue-len 1000 gso-max-size 65536 gso-max-segs 65535 gro-max-size 65536 rx:68138,100349838 tx:10895,7148466
      Aug 20 09:54:47 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <trace> [1724162087.6673] l3cfg[0ef7a3d0b0516dff,ifindex=7]: emit signal (platform-change, obj-type=link, change=changed, obj=7: nbft1 <UP,LOWER_UP;broadcast,multicast,up,running,lowerup> mtu 1500 arp 1 ethernet? init addrgenmode none addr 40:A6:B7:C0:8A:C9 permaddr 40:A6:B7:C0:8A:C9 brd FF:FF:FF:FF:FF:FF driver ice tx-queue-len 1000 gso-max-size 65536 gso-max-segs 65535 gro-max-size 65536 rx:68138,100349838 tx:10895,7148466)
      Aug 20 09:54:47 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <debug> [1724162087.6674] device[32f331c7eb9e55f2] (nbft1): queued link change for ifindex 7
      Aug 20 09:54:47 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <trace> [1724162087.6674] l3cfg[0ef7a3d0b0516dff,ifindex=7]: emit signal (platform-change-on-idle, obj-type-flags=0x2)
      Aug 20 09:54:47 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <info>  [1724162087.6675] device (nbft1): carrier: link connected
      Aug 20 09:54:47 rhel-storage-108.fast.eng.rdu2.dc.redhat.com NetworkManager[2863]: <trace> [1724162087.6679] ethtool[7]: ETHTOOL_GSET, nbft1: success
      Aug 20 09:54:47 rhel-storage-108.fast.eng.rdu2.dc.redhat.com kernel: ice 0000:0d:00.1 nbft1: NIC Link is up 100 Gbps Full Duplex, Requested FEC: RS-FEC, Negotiated FEC: RS-FEC, Autoneg Advertised: Off, Autoneg Negotiated: False, Flow Control: None
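
The carrier transitions in the excerpts above can be pulled out of a full trace log with a small filter. This is a sketch that assumes the log lines keep the `(<interface>): carrier:` marker seen above; the function name `carrier_events` is made up for illustration.

```shell
# Print NetworkManager carrier transitions for one interface.
# Reads log text on stdin; $1 is the interface name.
carrier_events() {
    # -F matches the literal "(iface): carrier:" marker, no regex involved
    grep -F -- "($1): carrier:"
}
```

One possible use is `journalctl -b -u NetworkManager | carrier_events nbft1`, with the timestamps then compared against `journalctl -b -u NetworkManager-dispatcher` to see whether the dispatcher service started around any of them.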
      

      This is an installation using NVMe over TCP (NBFT boot) for the rootfs. Networking for the nbft0 and nbft1 interfaces is set up by dracut using the supplied IPv4 arguments as parsed by the dracut nvmf module. There is no .nmconnection file for these interfaces in the initramfs or on the target rootfs. I have tried to create one using nmtui (i.e. explicitly marking the interface as 'managed'), with no observable effect.

      # nmcli d
      DEVICE    TYPE      STATE                   CONNECTION 
      eno12399  ethernet  connected               eno12399   
      nbft0     ethernet  connected               nbft0      
      nbft1     ethernet  connected               nbft1      
      lo        loopback  connected (externally)  lo         
      eno12409  ethernet  disconnected            --         
      eno12419  ethernet  disconnected            --         
      eno12429  ethernet  disconnected            --         
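
When repeating the link down/up test, the device state shown in the table above can be read in a script-friendly way from nmcli's terse output. This is a sketch; the helper name `device_state` is made up, and it assumes the colon-separated output of `nmcli -t -f DEVICE,STATE device` on stdin.

```shell
# Look up one device's state in `nmcli -t -f DEVICE,STATE device` output.
# Reads the terse, colon-separated lines on stdin; $1 is the device name.
device_state() {
    awk -F: -v dev="$1" '$1 == dev { print $2 }'
}
```

For example, `nmcli -t -f DEVICE,STATE device | device_state nbft1` can be run before and after pulling the link to see whether NetworkManager's device state changes at all.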
      

      The exact same deployment works fine in qemu and on a different Dell PowerEdge R660 system with a Broadcom 25G NIC. Only this 100G Intel NIC causes trouble. Autonegotiation is off, but I doubt that makes any difference.

      Full logs attached.

            Tomáš Bžatek (tbzatek)
            Marco Patalano