Fast Datapath Product / FDP-481

Revisit revalidator flow-size reduction algorithm

    • Task
    • Resolution: Unresolved
    • openvswitch3.3
    • rhel-net-ovs-dpdk
    • ssg_networking

      Currently, the revalidator has the following logic:

       

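      /* Time taken by the round that just finished, in ms (at least 1). */
      duration = MAX(time_msec() - start_time, 1);
      if (duration > 2000) {
          /* Very slow round: divide the limit by the whole seconds taken. */
          flow_limit /= duration / 1000;
      } else if (duration > 1300) {
          /* Slow round: cut the limit by 25%. */
          flow_limit = flow_limit * 3 / 4;
      } else if (duration < 1000 &&
                 flow_limit < n_flows * 1000 / duration) {
          /* Fast round that could handle more flows per second: add 1000. */
          flow_limit += 1000;
      }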

       

      The goal of this mechanism is to guarantee that changes are always applied to the datapath within a "reasonable time": 2 seconds.
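
To make the branch arithmetic easier to check by hand, here is a minimal sketch that wraps the quoted logic in a standalone function. The name next_flow_limit and its parameters are hypothetical and only exist for illustration; the function does not touch real OVS state, and it applies no lower or upper bound because the quoted snippet shows none.

```c
#include <assert.h>

/* Hypothetical helper: one flow-limit adjustment step that mirrors the
 * branches quoted in this issue.  All inputs are passed in explicitly. */
static unsigned int
next_flow_limit(unsigned int flow_limit, unsigned int n_flows,
                long long duration_ms)
{
    /* The quoted code clamps the duration with MAX(..., 1). */
    assert(duration_ms >= 1);

    if (duration_ms > 2000) {
        /* Integer division: 2001..2999 ms halves the limit,
         * 3000..3999 ms divides it by three, and so on. */
        flow_limit /= duration_ms / 1000;
    } else if (duration_ms > 1300) {
        /* Cut the limit by a quarter. */
        flow_limit = flow_limit * 3 / 4;
    } else if (duration_ms < 1000
               && flow_limit < (long long) n_flows * 1000 / duration_ms) {
        /* Fast round and the per-second rate exceeds the limit. */
        flow_limit += 1000;
    }
    /* Between 1000 and 1300 ms nothing changes. */
    return flow_limit;
}
```

With made-up inputs of flow_limit = 200000 and n_flows = 150000: a 2500 ms round gives 100000, a 3000 ms round gives 66666, a 1500 ms round gives 150000, a 500 ms round gives 201000, and rounds of 900 ms or 1100 ms leave the limit at 200000.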

      In an overloaded system, reducing the number of flows in the cache causes flows to be evicted. That can increase the number of upcalls, which in turn puts more pressure on the upcall handlers (which typically use the same cores as the revalidators) and can lead to packet drops.
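
One property of the quoted logic that matters here is its asymmetry: the limit shrinks multiplicatively but grows by only 1000 per round. The sketch below is a toy illustration under strong assumptions: the helper names are hypothetical, the round duration is held constant instead of depending on the number of flows, and no bounds are applied because the quoted snippet shows none. It does not model the upcall feedback described above.

```c
#include <assert.h>

/* Hypothetical helper: the quoted branches applied for a single round. */
static unsigned int
adjust_flow_limit(unsigned int flow_limit, unsigned int n_flows,
                  long long duration_ms)
{
    assert(duration_ms >= 1);

    if (duration_ms > 2000) {
        flow_limit /= duration_ms / 1000;
    } else if (duration_ms > 1300) {
        flow_limit = flow_limit * 3 / 4;
    } else if (duration_ms < 1000
               && flow_limit < (long long) n_flows * 1000 / duration_ms) {
        flow_limit += 1000;
    }
    return flow_limit;
}

/* Hypothetical helper: flow limit after 'rounds' consecutive rounds that
 * all take the same 'duration_ms' (a simplifying assumption). */
static unsigned int
flow_limit_after(unsigned int flow_limit, unsigned int n_flows,
                 long long duration_ms, int rounds)
{
    for (int i = 0; i < rounds; i++) {
        flow_limit = adjust_flow_limit(flow_limit, n_flows, duration_ms);
    }
    return flow_limit;
}
```

With these made-up numbers, four consecutive 2500 ms rounds cut a limit of 200000 down to 12500, while climbing back above 200000 at 500 ms per round (with n_flows = 200000) takes 188 rounds.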

      This task is to revisit this logic, test it under high pressure, and see whether we can make OVS more robust, or at least find a good balance between revalidation time and upcalls.

      Since we're seeing deployments where ovs-vswitchd is restricted to a small number of CPUs (e.g., PAO), this becomes more relevant.
      EDIT (amorenoz): affinity restriction is no longer a big issue, since OpenShift now implements OVS dynamic affinity configuration.

       

              jmeng@redhat.com Jakob Meng
              amorenoz@redhat.com Adrian Moreno