Fast Datapath Product / FDP-1567

Many Mac_Binding related transaction errors when using IPv6

    • Epic
    • Resolution: Unresolved
    • Major
    • None
    • None
    • ovn25.03
    • None
    • Many Mac_Binding related transaction errors when using IPv6

      Please mark each item below with ( / ) if completed or ( x ) if incomplete:

      ( ) The acceptance criteria defined below are met.

      Given an OVN deployment with an HA DGP (Distributed Gateway Port) and multiple chassis, where IPv6 ND_NA packets are received on a logical router port,

      When the MAC_Binding table is updated in response to an ND_NA packet,

      Then only the active HA chassis must attempt the update and all standby chassis must skip the transaction.


      ( ) The epic's work is available in a downstream build (nightly/Async or other)


      ( ) All cards under the epic have been moved to Done

    • In Progress
    • ovn25.03-25.03.1-93.el9fdp
    • rhel-9
    • rhel-net-ovn
    • 0% To Do, 18% In Progress, 82% Done
    • ssg_networking
    • OVN FDP Sprint 8, OVN FDP Sprint 9
    • 2

      This epic tracks all the effort needed to deliver the solution related to the bug described below.

       Problem Description: Clearly explain the issue.

      When using a Distributed Gateway Port with HA, an IPv6 ND_NA packet
      received on the router port causes all HA chassis to try to update the
      Mac_Binding table, which often causes transaction errors.

      Reception of ARP packets does not cause the same issue.

       Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).

      Transaction errors cause full recomputes, and hence high CPU utilization.

       Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).

      Reproduced on main.

        Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).

      Not a regression.

       Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.

      Not every ND_NA packet received on the router ports causes a transaction error (thanks to MAX_MAC_BINDING_DELAY_MSEC).

      However, when sending 50 ND_NA packets to 3 HA chassis, I see the problem most of the time.
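
      The role of the delay can be illustrated with a toy model. This is only a sketch under stated assumptions, not OVN's code: it assumes each chassis waits a random delay before writing, that the 50 ms bound and the 10 ms "visibility" window (the time for one chassis' write to become visible to the others) are placeholder values, and that a chassis skips its write once it sees the binding already present. The function names are hypothetical.

```python
import random

# Assumed placeholder values for this toy model, not taken from the issue.
ASSUMED_MAX_DELAY_MS = 50
ASSUMED_VISIBILITY_MS = 10


def duplicate_writers(delays_ms, visibility_ms=ASSUMED_VISIBILITY_MS):
    """Count chassis that write the same binding before seeing the first write.

    delays_ms holds one delay per chassis. The chassis with the smallest
    delay writes first; any other chassis whose delay expires before that
    write becomes visible also attempts the write, which is the situation
    that can end in a transaction error.
    """
    ordered = sorted(delays_ms)
    if not ordered:
        return 0
    first = ordered[0]
    return sum(1 for d in ordered[1:] if d < first + visibility_ms)


def fraction_with_duplicates(n_chassis, n_packets, seed=0):
    """Estimate how often at least one duplicate write happens."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_packets):
        delays = [rng.uniform(0, ASSUMED_MAX_DELAY_MS) for _ in range(n_chassis)]
        if duplicate_writers(delays) > 0:
            hits += 1
    return hits / n_packets


if __name__ == "__main__":
    print(duplicate_writers([5, 40, 45]))   # delays far apart
    print(duplicate_writers([5, 8, 45]))    # two chassis inside the window
    print(fraction_with_duplicates(n_chassis=3, n_packets=50))
```

      In this model a random delay lowers the chance of a collision per packet but does not remove it, which is consistent with the problem showing up only some of the time per packet yet most of the time over a 50-packet run.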

       Reproduction Steps: Provide detailed steps or scripts to replicate the issue.

      Send many ND_NA packets from outside to the router.
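
      As a rough aid for the reproduction step above, the following sketch builds a raw Ethernet frame carrying an unsolicited IPv6 Neighbor Advertisement (ICMPv6 type 136) using only the Python standard library. The MAC and IPv6 addresses are made-up placeholders, and `build_nd_na` is a hypothetical helper name; the issue does not say which tool was used to generate the traffic.

```python
import ipaddress
import struct


def checksum16(data: bytes) -> int:
    """Internet checksum: ones' complement of the ones' complement sum."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def mac_bytes(mac: str) -> bytes:
    return bytes.fromhex(mac.replace(":", ""))


def build_nd_na(src_mac, dst_mac, src_ip, dst_ip, target_ip, solicited=False):
    """Return an Ethernet/IPv6/ICMPv6 Neighbor Advertisement frame."""
    src = ipaddress.IPv6Address(src_ip).packed
    dst = ipaddress.IPv6Address(dst_ip).packed
    target = ipaddress.IPv6Address(target_ip).packed
    # Flags word: Override (0x20000000), plus Solicited (0x40000000) if asked.
    flags = 0x20000000 | (0x40000000 if solicited else 0)

    def icmp(csum):
        # Type 136, code 0, checksum, flags, target address, then option
        # type 2 (target link-layer address) with length 1 (8 bytes).
        return (struct.pack("!BBHI", 136, 0, csum, flags) + target
                + struct.pack("!BB", 2, 1) + mac_bytes(src_mac))

    body = icmp(0)
    # IPv6 pseudo-header: src, dst, upper-layer length, 3 zero bytes, next header 58.
    pseudo = src + dst + struct.pack("!I3xB", len(body), 58)
    body = icmp(checksum16(pseudo + body))
    # IPv6 header: version 6, payload length, next header 58, hop limit 255.
    ipv6 = struct.pack("!IHBB", 0x60000000, len(body), 58, 255) + src + dst
    eth = mac_bytes(dst_mac) + mac_bytes(src_mac) + struct.pack("!H", 0x86DD)
    return eth + ipv6 + body


# Placeholder addresses for illustration only.
frame = build_nd_na("00:00:00:00:00:01", "00:00:00:00:00:02",
                    "fe80::1", "fe80::2", "fe80::1")
```

      On Linux, such a frame could be sent in a loop through a raw `AF_PACKET` socket bound to the external interface (this requires root); any packet generator that can emit ND_NA should work equally well.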

       Expected Behavior: Describe what should happen under normal circumstances.

      Only the active HA chassis should update the Southbound database (sb), so no transaction errors should occur.
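
      The expected behavior can be sketched as a simple gate applied on every chassis before it writes a MAC binding. This is a minimal illustration, not the actual fix: the `Chassis` type, the "highest priority among live chassis is active" rule, and the function names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chassis:
    name: str
    priority: int        # assumed: higher priority wins
    alive: bool = True


def active_chassis(group):
    """Return the live chassis with the highest priority, or None."""
    candidates = [c for c in group if c.alive]
    return max(candidates, key=lambda c: c.priority) if candidates else None


def should_update_mac_binding(me, group):
    """Only the active chassis attempts the write; standbys skip it."""
    active = active_chassis(group)
    return active is not None and active.name == me.name


group = [Chassis("hv1", 30), Chassis("hv2", 20), Chassis("hv3", 10)]
writers = [c.name for c in group if should_update_mac_binding(c, group)]
print(writers)  # ['hv1']

# After hv1 fails, hv2 becomes the only writer.
failover = [Chassis("hv1", 30, alive=False), Chassis("hv2", 20), Chassis("hv3", 10)]
failover_writers = [c.name for c in failover if should_update_mac_binding(c, failover)]
print(failover_writers)  # ['hv2']
```

      With a gate like this, each ND_NA yields at most one write attempt for a given binding, so the chassis no longer compete on the same row.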

       Observed Behavior: Explain what actually happens.

      Reception of ND_NA packets causes all HA chassis to try to update the same mac_binding in sb, hence increasing the risk of transaction failures.

       Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.

       Logs: If you collected logs please provide them (e.g. sos report, /var/log/openvswitch/* , testpmd console)

              xsimonar@redhat.com Xavier Simonart
              xsimonar@redhat.com Xavier Simonart
              Jianlin Shi Jianlin Shi
              OVN