Fast Datapath Product / FDP-1730

[EVPN] ovn-controller doesn't install OF rules for static remote FDB entries if multiple VNIs/switches are handled locally

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • None
    • ovn25.09
    • None
    • 5
    • False
    • False
    • Given static remote FDB entries for the same MAC on two VNIs handled by one chassis,

      When ovn-controller translates EVPN FDB state to OpenFlow,

      Then it installs a distinct per-VNI OpenFlow rule for each static FDB entry, and traffic to that MAC is forwarded on every VNI without flooding.
    • ovn25.09-25.09.0-31.el9fdp
    • rhel-9
    • None
    • rhel-net-ovn
    • ssg_networking
    • OVN FDP Sprint 10
    • 1
    • Critical

       Problem Description: Clearly explain the issue.

      This was spotted during code inspection but it should be easy to replicate with a setup like the following:

      
      L2-domain1 (VNI 1)           |       L2-domain2 (VNI 2)
                                   |
          Workload1 (MAC-X)        |        Workload2 (MAC-X)
                                   |
      (EVPN Fabric)                |
      ----------------------------------------------------------------------------
      (OVN)                        |
                                   |
      LogicalSwitch LS1 (VNI 1)    |       LogicalSwitch LS2 (VNI 2)
                                   |
                                   |
      

      The MAC address MAC-X is used by both (independent) fabric workloads. ovn-controller should install OpenFlow rules to forward traffic destined to MAC-X for both logical switches (that is, for both VNIs).

      It currently installs only one rule, for the "last" version of the EVPN remote MAC it learned (ignoring the VNI).

      That's due to this part of the code:
      https://github.com/ovn-org/ovn/blob/1059c46ffe06d144878947445ecf65ec57c7a004/controller/evpn-fdb.c#L57-L71

      void
      evpn_fdb_run(const struct evpn_fdb_ctx_in *f_ctx_in,
                   struct evpn_fdb_ctx_out *f_ctx_out)
      {
      [...]
          const struct evpn_static_fdb *static_fdb;
          HMAP_FOR_EACH (static_fdb, hmap_node, f_ctx_in->static_fdbs) {
              const struct evpn_binding *binding =
                  evpn_binding_find(f_ctx_in->bindings, &static_fdb->ip,
                                    static_fdb->vni);
              if (!binding) {
                  VLOG_WARN_RL(&rl, "Couldn't find EVPN binding for "ETH_ADDR_FMT" "
                               "MAC address.", ETH_ADDR_ARGS(static_fdb->mac));
                  continue;
              }
      
              fdb = evpn_fdb_find(f_ctx_out->fdbs, static_fdb->mac);
              if (!fdb) {
                  fdb = evpn_fdb_add(f_ctx_out->fdbs, static_fdb->mac);
              }
      
              bool updated = false;
              if (fdb->binding_key != binding->binding_key) {
                  fdb->binding_key = binding->binding_key;
                  updated = true;
              }
      
              if (fdb->dp_key != binding->dp_key) {
                  fdb->dp_key = binding->dp_key;
                  updated = true;
              }
      

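      evpn_static_fdb objects are differentiated by VNI, but the evpn_fdb addition/lookup ignores the VNI. Since the evpn_fdb structures are the ones used for generating the EVPN FDB OpenFlow forwarding rules, each static entry for the same MAC overwrites the binding_key and dp_key of the previous one, and only one rule per MAC is generated.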
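
      To illustrate the mismatch described above, here is a minimal, self-contained sketch of an FDB table keyed by (MAC, VNI) rather than by MAC alone. It is not the OVN implementation and not a proposed patch: every name in it (sketch_fdb, sketch_fdb_find, sketch_fdb_upsert, sketch_fdb_count) is hypothetical, and a fixed-size array stands in for the hmap that the real code uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an FDB entry; not OVN's struct. */
struct sketch_fdb {
    uint8_t mac[6];
    uint32_t vni;
    uint32_t binding_key;
    bool used;
};

#define SKETCH_FDB_MAX 16

/* A zero-initialized table is empty. */
struct sketch_fdb_table {
    struct sketch_fdb entries[SKETCH_FDB_MAX];
};

/* Looks up an entry by the composite (MAC, VNI) key. */
static struct sketch_fdb *
sketch_fdb_find(struct sketch_fdb_table *t, const uint8_t mac[6],
                uint32_t vni)
{
    for (size_t i = 0; i < SKETCH_FDB_MAX; i++) {
        struct sketch_fdb *e = &t->entries[i];
        if (e->used && e->vni == vni && !memcmp(e->mac, mac, 6)) {
            return e;
        }
    }
    return NULL;
}

/* Finds or adds the entry for (MAC, VNI) and stores its binding key.
 * Returns NULL if the table is full. */
static struct sketch_fdb *
sketch_fdb_upsert(struct sketch_fdb_table *t, const uint8_t mac[6],
                  uint32_t vni, uint32_t binding_key)
{
    struct sketch_fdb *e = sketch_fdb_find(t, mac, vni);
    if (!e) {
        for (size_t i = 0; i < SKETCH_FDB_MAX && !e; i++) {
            if (!t->entries[i].used) {
                e = &t->entries[i];
            }
        }
        if (!e) {
            return NULL;
        }
        memcpy(e->mac, mac, 6);
        e->vni = vni;
        e->used = true;
    }
    e->binding_key = binding_key;
    return e;
}

/* Counts the entries currently stored in the table. */
static size_t
sketch_fdb_count(const struct sketch_fdb_table *t)
{
    size_t n = 0;
    for (size_t i = 0; i < SKETCH_FDB_MAX; i++) {
        n += t->entries[i].used ? 1 : 0;
    }
    return n;
}
```

      With this keying, inserting MAC-X once for VNI 1 and once for VNI 2 leaves two independent entries, so a rule could be generated for each logical switch. If the VNI comparison were dropped from sketch_fdb_find, the second insert would overwrite the first, which matches the behaviour reported here.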

      When fixing this bug we should also enhance the current system and multinode EVPN tests to cover this scenario.

       Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).

      If the same MAC is reused across N > 1 L2 EVPN domains handled locally, traffic to that MAC is treated as unknown unicast and flooded unnecessarily in N-1 of those domains, severely affecting network performance.
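
      The N-1 figure can be checked with a small model of the MAC-only keying. This is a sketch under stated assumptions, not OVN code: sketch_mac_slot and sketch_count_flooding_domains are hypothetical names, the datapath keys are assumed to be distinct, and a datapath without a matching entry is assumed to fall back to flooding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: a single slot per MAC, as when the lookup ignores
 * the VNI. */
struct sketch_mac_slot {
    uint32_t dp_key;   /* Datapath the single surviving entry points at. */
    int valid;
};

/* Processes static FDB entries for one MAC on datapaths dp_keys[0..n-1],
 * in order.  Each entry overwrites the previous one because the key is
 * the MAC alone.  Returns how many datapaths end up with no matching
 * entry (assumed to flood). */
static size_t
sketch_count_flooding_domains(const uint32_t *dp_keys, size_t n)
{
    struct sketch_mac_slot slot = { 0, 0 };

    for (size_t i = 0; i < n; i++) {
        slot.dp_key = dp_keys[i];   /* Last writer wins. */
        slot.valid = 1;
    }

    size_t flooding = 0;
    for (size_t i = 0; i < n; i++) {
        if (!slot.valid || slot.dp_key != dp_keys[i]) {
            flooding++;
        }
    }
    return flooding;
}
```

      Calling the function with three distinct datapath keys returns 2, and with a single key returns 0, which is the N-1 behaviour described above.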
       

       Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).

      Any ovn25.09.
       

        Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).

      Day-one issue.
       

       Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.

      Not tested (found by code inspection), but it should be consistently reproducible.
       

       Reproduction Steps: Provide detailed steps or scripts to replicate the issue.

      See description above.
       

       Expected Behavior: Describe what should happen under normal circumstances.

      Remote MACs should be handled properly for all VNIs even if the same MAC address is reused.
       

       Observed Behavior: Explain what actually happens.

       

       Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.

       

       Logs: If you collected logs please provide them (e.g. sos report, /var/log/openvswitch/* , testpmd console)

              amusil@redhat.com Ales Musil
              dceara@redhat.com Dumitru Ceara