Fast Datapath Product / FDP-2376

Test Plan: RFE: Add more debug logs for lacp

      ( ) The new test plan is aligned with the epic's acceptance criteria

      Given an OVS LACP bond is configured, 

      When LACP debug logging is enabled and a link failover occurs, 

      Then, the debug logs should include detailed information about the LACP state machine's behavior and the logs should provide sufficient information to troubleshoot common LACP issues without requiring additional diagnostic steps.

      ( ) The test plan/test cases pass successfully on all non-blocking functions of the feature
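One way to turn the Given/When/Then criterion above into an automated check is to count the log lines that the `lacp` module emits at `DBG` level during a failover. The following is a minimal sketch, not the project's test code: it assumes the `timestamp|sequence|module|level|message` line layout seen in the original report, and the function names (`count_by_module_level`, `lacp_debug_present`) are hypothetical.

```python
import re

# Line layout assumed from the report: timestamp|sequence|module|level|message.
# The module may carry a thread suffix, e.g. "bond(revalidator7)".
VLOG_LINE = re.compile(
    r"^(?P<ts>[^|]+)\|(?P<seq>\d+)\|(?P<module>[^|(]+)(?:\((?P<thread>[^)]*)\))?"
    r"\|(?P<level>[A-Z]+)\|(?P<msg>.*)$"
)


def count_by_module_level(lines):
    """Count log lines per (module, level); lines that do not match are skipped."""
    counts = {}
    for line in lines:
        match = VLOG_LINE.match(line.strip())
        if match is None:
            continue
        key = (match.group("module"), match.group("level"))
        counts[key] = counts.get(key, 0) + 1
    return counts


def lacp_debug_present(lines):
    """Return True if at least one lacp DBG line appears in the capture."""
    return count_by_module_level(lines).get(("lacp", "DBG"), 0) > 0


# Three lines copied from the failover capture in the original report.
sample = [
    "2023-07-17T10:45:31.672Z|06257|bond|INFO|member enp4s0f0np0: link state down",
    "2023-07-17T10:45:31.673Z|08614|bond(revalidator7)|DBG|bond lacp-bond: "
    "member enp4s0f0np0: main thread has not yet enabled member",
    "2023-07-17T10:45:38.686Z|06260|bond|DBG|bond lacp-bond: enp4s0f1np1 0kB",
]
```

On the lines from the report this check fails, because only `bond` lines are present. A real test case would also need to assert on the content of the `lacp` messages, which depends on what the RFE ends up logging.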

    • rhel-net-ovs-dpdk

      This task is tracking the test case writing activities to cover the feature request described below.

      Original bugzilla ticket:
      Description of problem:

      While troubleshooting an OVS (kernel, no DPDK) LACP bond issue, I enabled the two debug log modules below.

      [root@computesriov-0 openvswitch]# ovs-appctl vlog/list
                       console    syslog    file
                       -------    ------    ------
      bond               OFF        ERR       DBG
      lacp               OFF        ERR       DBG
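The levels listed above (file destination at `DBG` for `bond` and `lacp`) can be set with `ovs-appctl vlog/set`, using a `module:destination:level` spec. The sketch below only builds the command lines so that a test harness can log or execute them; the helper name `vlog_set_commands` is hypothetical, and the report does not say how the reporter actually set the levels.

```python
def vlog_set_commands(modules=("bond", "lacp"), destination="file", level="dbg"):
    """Build one `ovs-appctl vlog/set` argv per module (module:destination:level)."""
    return [
        ["ovs-appctl", "vlog/set", f"{module}:{destination}:{level}"]
        for module in modules
    ]


# Print the commands instead of running them, so the sketch works without OVS.
for argv in vlog_set_commands():
    print(" ".join(argv))
```

On a host running ovs-vswitchd, each argv can be passed to `subprocess.run(argv, check=True)`, and `ovs-appctl vlog/list` can then be used to confirm the result.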

      I then performed a link failure at the uplink switch.

      While both member interfaces were up:

      2023-07-17T10:45:18.232Z|06252|bond|DBG|bond lacp-bond: enp4s0f0np0 0kB, enp4s0f1np1 0kB
      2023-07-17T10:45:28.241Z|06256|bond|DBG|bond lacp-bond: enp4s0f0np0 0kB, enp4s0f1np1 0kB

      Brought down 1 member interface at uplink switch.

      2023-07-17T10:45:31.672Z|06257|bond|INFO|member enp4s0f0np0: link state down
      2023-07-17T10:45:31.672Z|06258|bond|INFO|member enp4s0f0np0: disabled
      2023-07-17T10:45:31.672Z|06259|bond|INFO|bond lacp-bond: active member is now enp4s0f1np1
      2023-07-17T10:45:31.673Z|08614|bond(revalidator7)|DBG|bond lacp-bond: member enp4s0f0np0: main thread has not yet enabled member
      2023-07-17T10:45:31.679Z|08615|bond(revalidator7)|DBG|bond lacp-bond: member enp4s0f0np0: admissibility verdict is to drop pkt, active member: false, may_enable: false, enabled: false, LACP status: negotiated
      2023-07-17T10:45:38.686Z|06260|bond|DBG|bond lacp-bond: enp4s0f1np1 0kB
      2023-07-17T10:45:48.696Z|06261|bond|DBG|bond lacp-bond: enp4s0f1np1 0kB

      Brought down 2nd member interface.

      2023-07-17T10:45:53.835Z|06262|bond|INFO|member enp4s0f1np1: link state down
      2023-07-17T10:45:53.835Z|06263|bond|INFO|member enp4s0f1np1: disabled
      2023-07-17T10:45:53.835Z|06264|bond|INFO|bond lacp-bond: all members disabled

      Brought up 1st member interface.

      2023-07-17T10:46:28.543Z|06271|bond|INFO|member enp4s0f0np0: link state up
      2023-07-17T10:46:28.543Z|06272|bond|INFO|member enp4s0f0np0: enabled
      2023-07-17T10:46:28.543Z|06273|bond|INFO|bond lacp-bond: active member is now enp4s0f0np0
      2023-07-17T10:46:36.065Z|06274|bond|DBG|bond lacp-bond: enp4s0f0np0 0kB
      2023-07-17T10:46:46.075Z|06275|bond|DBG|bond lacp-bond: enp4s0f0np0 0kB

      Brought up 2nd member interface.

      2023-07-17T10:46:53.055Z|06276|bond|INFO|member enp4s0f1np1: link state up
      2023-07-17T10:46:53.055Z|06277|bond|INFO|member enp4s0f1np1: enabled
      2023-07-17T10:46:56.559Z|06278|bond|DBG|bond lacp-bond: enp4s0f0np0 0kB, enp4s0f1np1 0kB
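Because the capture above contains only `bond` messages, the most a test can derive from it is a timeline of bond-level events. As an illustration, the sketch below measures how long the bond had no enabled member, from the `all members disabled` message to the next `enabled` message. It assumes the timestamp and message formats shown in the capture; the function names are hypothetical.

```python
from datetime import datetime


def parse_ts(line):
    """Parse the leading timestamp, e.g. 2023-07-17T10:45:53.835Z."""
    stamp = line.split("|", 1)[0]
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%S.%fZ")


def bond_outages(lines):
    """Return the duration in seconds of each period with no enabled member."""
    outages = []
    down_since = None
    for raw in lines:
        line = raw.strip()
        message = line.split("|", 4)[-1]
        if message.endswith("all members disabled"):
            down_since = parse_ts(line)
        elif down_since is not None and message.endswith(": enabled"):
            outages.append((parse_ts(line) - down_since).total_seconds())
            down_since = None
    return outages


# Lines copied from the capture above.
capture = [
    "2023-07-17T10:45:53.835Z|06264|bond|INFO|bond lacp-bond: all members disabled",
    "2023-07-17T10:46:28.543Z|06271|bond|INFO|member enp4s0f0np0: link state up",
    "2023-07-17T10:46:28.543Z|06272|bond|INFO|member enp4s0f0np0: enabled",
]
```

This timeline says nothing about LACP negotiation itself (PDU exchange, state transitions, timer expiry), which is the information the RFE asks the `lacp` module to log.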

      LACP re-negotiated successfully.

      [root@computesriov-0 tripleo-admin]# ovs-appctl lacp/show
      ---- lacp-bond ----
      status: active negotiated
      sys_id: 04:3f:72:d9:c0:48
      sys_priority: 65534
      aggregation key: 1
      lacp_time: fast

      member: enp4s0f0np0: current attached
      port_id: 2
      port_priority: 65535
      may_enable: true

      actor sys_id: 04:3f:72:d9:c0:48
      actor sys_priority: 65534
      actor port_id: 2
      actor port_priority: 65535
      actor key: 1
      actor state: activity timeout aggregation synchronized collecting distributing

      partner sys_id: c8:fe:6a:f2:44:00
      partner sys_priority: 127
      partner port_id: 5
      partner port_priority: 127
      partner key: 5
      partner state: activity timeout aggregation synchronized collecting distributing

      member: enp4s0f1np1: current attached
      port_id: 1
      port_priority: 65535
      may_enable: true

      actor sys_id: 04:3f:72:d9:c0:48
      actor sys_priority: 65534
      actor port_id: 1
      actor port_priority: 65535
      actor key: 1
      actor state: activity timeout aggregation synchronized collecting distributing

      partner sys_id: c8:fe:6a:f2:44:00
      partner sys_priority: 127
      partner port_id: 6
      partner port_priority: 127
      partner key: 5
      partner state: activity timeout aggregation synchronized collecting distributing
      [root@computesriov-0 tripleo-admin]#
      [root@computesriov-0 tripleo-admin]#
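A test case can confirm the end state shown above by parsing the `ovs-appctl lacp/show` output and checking that the actor and partner of every member report `synchronized`, `collecting` and `distributing`. This is a minimal sketch that assumes the output layout shown in the report; the helper names and the choice of required flags are assumptions made for illustration.

```python
REQUIRED_FLAGS = {"synchronized", "collecting", "distributing"}


def parse_lacp_show(text):
    """Map each member name to its status words and actor/partner state flags."""
    members = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("member: "):
            # Example: "member: enp4s0f0np0: current attached"
            _, name, status = [part.strip() for part in line.split(":", 2)]
            current = {"status": status.split()}
            members[name] = current
        elif current is not None and line.startswith(("actor state:", "partner state:")):
            key, flags = line.split(":", 1)
            current[key] = set(flags.split())
    return members


def members_not_in_sync(members):
    """Return the sorted names of members missing a required flag on either side."""
    return sorted(
        name
        for name, info in members.items()
        if not (
            REQUIRED_FLAGS <= info.get("actor state", set())
            and REQUIRED_FLAGS <= info.get("partner state", set())
        )
    )


# Excerpt of the output above, reduced to the lines the parser uses.
lacp_show = """
member: enp4s0f0np0: current attached
actor state: activity timeout aggregation synchronized collecting distributing
partner state: activity timeout aggregation synchronized collecting distributing
member: enp4s0f1np1: current attached
actor state: activity timeout aggregation synchronized collecting distributing
partner state: activity timeout aggregation synchronized collecting distributing
"""
```

This only verifies the final negotiated state. It cannot show how the state machine got there, which is why the report asks for additional `lacp` debug logs.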

      [root@computesriov-0 tripleo-admin]# ovs-appctl bond/show
      ---- lacp-bond ----
      bond_mode: balance-slb
      bond may use recirculation: no, Recirc-ID : -1
      bond-hash-basis: 0
      lb_output action: disabled, bond-id: -1
      all members active: false
      updelay: 0 ms
      downdelay: 0 ms
      next rebalance: 9098 ms
      lacp_status: negotiated
      lacp_fallback_ab: true
      active-backup primary: <none>
      active member mac: 04:3f:72:d9:c0:48(enp4s0f0np0)

      member enp4s0f0np0: enabled
      active member
      may_enable: true

      member enp4s0f1np1: enabled
      may_enable: true

      [root@computesriov-0 tripleo-admin]#

      I expect the "lacp" log module to emit more debug messages, so that it is possible to understand what is going on with the LACP state machine.

      Version-Release number of selected component (if applicable):
      openvswitch3.0-3.0.0-28.el9fdp.x86_64

      How reproducible:
      100%

      Steps to Reproduce:
      1. Configure an OVS LACP bond
      2. Perform a link failover
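For the first reproduction step, a bond matching the `lacp/show` and `bond/show` output in this report (active LACP, `balance-slb`, fast LACP time, active-backup fallback enabled) could be created with `ovs-vsctl add-bond`. The sketch below only builds the command line. The bridge name `br0` is a placeholder, since the report does not name the bridge, and the helper name is hypothetical. The failover itself was performed at the uplink switch and is not scripted here.

```python
def add_lacp_bond_command(bridge, bond, members):
    """Build an ovs-vsctl argv that creates an LACP bond with the reported settings."""
    if len(members) < 2:
        raise ValueError("a bond needs at least two member interfaces")
    return [
        "ovs-vsctl", "add-bond", bridge, bond, *members,
        "--", "set", "port", bond,
        "lacp=active",
        "bond_mode=balance-slb",
        "other_config:lacp-time=fast",
        "other_config:lacp-fallback-ab=true",
    ]


# "br0" is a placeholder bridge name; the interface names come from the report.
argv = add_lacp_bond_command("br0", "lacp-bond", ["enp4s0f0np0", "enp4s0f1np1"])
print(" ".join(argv))
```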

      Actual results:
      There are no logs that indicate what is going on with LACP synchronization.

      Expected results:
      There should be more logs to help with troubleshooting.

      Additional info:
      I performed this with the OVS kernel datapath; however, the same would be true for the OVS-DPDK datapath as well.

              ovsdpdk-triage ovsdpdk triage
              nstbot NST Bot