OpenShift Bugs / OCPBUGS-74335

Received routes take longer than expected to be pushed to the kernel in a BGP failover scenario, which causes traffic impact during 5G resiliency testing

    • Bug
    • Resolution: Not a Bug
    • Major
    • 4.20
    • Networking / FRR-K8s
    • False
    • Important
    • x86_64
    • QA
    • CNF Network Sprint 285
    • 1

      Description of problem:

      We are trying to replace Calico with OVN/FRR for a telco customer. We send and receive routes successfully and support the network functions successfully, but resiliency testing, specifically the bond failover/BGP failover, shows traffic impact. The received routes appear to take an additional 5 seconds to be pushed to the kernel by zebra.

      This test fails with OVN/FRR but passes with Calico.

      Version-Release number of selected component (if applicable):

          4.20.10 with additionalRoutingCapabilities enabled

      How reproducible:

          100%

      Steps to Reproduce:

          1. Set up a secondary bond in active/passive mode (mode 1).
          2. Set up BGP with an FRRConfiguration (BGP is active/passive too).
          3. Set up any kind of traffic.
          4. Flip the bond with "ip l set dev xxx down".
          5. Failover is fast (depending on your timers).
          6. Routes are exchanged.
          7. A 5-second delay follows.
          8. Routes are finally pushed to the kernel routing table.
          9. The End-of-RIB message appears in the FRR logs.
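
To put a number on the delay in steps 6 to 8, one option is to poll the kernel routing table right after flipping the bond and time how long a received prefix takes to show up. The following is only a minimal sketch, not part of the original report: the helper names (`route_present`, `read_kernel_routes`, `wait_for_route`) and the example prefix are hypothetical, and it assumes an iproute2 build that supports JSON output (`ip -j`) and reports FRR-installed routes with the protocol name "bgp".

```python
import json
import subprocess
import time
from typing import Callable, Optional


def route_present(routes: list, prefix: str, protocol: str = "bgp") -> bool:
    """Return True if `prefix` with the given protocol is in parsed `ip -j route show` output."""
    return any(
        r.get("dst") == prefix and r.get("protocol") == protocol
        for r in routes
    )


def read_kernel_routes() -> list:
    """Read the main kernel routing table as JSON via iproute2 (assumes `ip -j` is available)."""
    out = subprocess.run(
        ["ip", "-j", "route", "show"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def wait_for_route(
    prefix: str,
    read_routes: Callable[[], list] = read_kernel_routes,
    timeout: float = 30.0,
    interval: float = 0.05,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll until `prefix` appears; return elapsed seconds, or None on timeout."""
    start = clock()
    while True:
        if route_present(read_routes(), prefix):
            return clock() - start
        if clock() - start >= timeout:
            return None
        sleep(interval)


if __name__ == "__main__":
    # Hypothetical prefix; replace with a prefix actually received over BGP.
    elapsed = wait_for_route("10.1.0.0/24")
    print("route not seen before timeout" if elapsed is None else f"route seen after {elapsed:.2f}s")
```

Start the script immediately after the "ip l set dev xxx down" command in step 4. If the old route has not yet been withdrawn when polling starts, the result will read close to zero, so it is worth confirming that the prefix is absent first. The measured value also includes the failover and route exchange time, not only the zebra install delay.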
          

       

      Actual results:

          With an active/passive BGP setup, it is important that routes are pushed to the kernel routing table quickly. Calico/BIRD seems to be able to push these routes in under 1 second, but FRR/zebra adds a 5-second delay.

      Expected results:

          Zebra should not delay pushing the new routes to the kernel routing table.

      Additional info:

          

              fpaoline@redhat.com Federico Paolinelli
              rhn-gps-bdeschen Boris Deschenes