Uploaded image for project: 'Fast Datapath Product'
  1. Fast Datapath Product
  2. FDP-509

ovn-controller slow update of set of address_sets

XMLWordPrintable

    • 8
    • False
    • Hide

      None

      Show
      None
    • False
    • FDP 24.D, FDP 24.E
    • Customer Escalated

      when processing large address_sets referenced in ACL in such fashion

      match               : "ip4.src == {$a6538357531468486505, $a7890555686633816021} && outport == @a11713828714540761946"
      

      (with multiple address_set, here with a6538357531468486505 containing 3754 addresses, a7890555686633816021 containing 2 addresses),
      when updating the big address_set, ovn-controller hangs and logs

      2024-03-13T21:21:40.606Z|6040765|inc_proc_eng|INFO|node: logical_flow_output, handler for input addr_sets took 1563ms
      2024-03-13T21:21:40.853Z|6040766|timeval|WARN|Unreasonably long 1816ms poll interval (1795ms user, 3ms system)
      2024-03-13T21:21:40.853Z|6040767|timeval|WARN|faults: 672 minor, 0 major
      2024-03-13T21:21:40.853Z|6040768|timeval|WARN|context switches: 0 voluntary, 41 involuntary
      

      in practice on a big ovn-kubernetes deployment with networkpolicies using labels that select a lot of pods, ovn-controller is always busy with 100% CPU.

      This only seems to happen when using a set of address_sets, just using ip4.src == {$a6538357531468486505} with a very big address_set, seems way faster.

      A reproducer:

       

      # rpm -q --whatprovides /usr/bin/ovn-controller
      ovn23.09-host-23.09.0-91.el9fdp.x86_64
      

       

      #!/bin/bash
      # a large list of addresses
      echo -n "[" > a
      for i in `seq 70` ; do
      for j in `seq 70` ; do
      echo -n "\"10.0.${i}.${j}\", " >> a
      done
      done
      echo -n " \"10.0.255.255\"]" >> a
      sed -e 's/.10.0.42.42..//' a > b
      
      set -x
      systemctl stop openvswitch
      systemctl stop ovn-controller
      systemctl stop ovn-northd
      
      rm /etc/openvswitch/conf.db
      rm /var/lib/ovn/ovnsb_db.db
      rm /var/lib/ovn/ovnnb_db.db
      
      ip netns del pods
      
      set -e
      
      systemctl start ovn-northd
      ovn-sbctl --inactivity-probe=0 set-connection ptcp:6642
      ovn-nbctl --wait=hv sync
      
      ovn-nbctl ls-add ls1 \
            -- create address_set name=address addresses='[]' \
            -- create address_set name=address2 addresses='["1.2.3.4", "5.6.7.8"]'
      
      for i in `seq 10 99` ; do
      ovn-nbctl lsp-add ls1 ls1p$i \
            -- pg-add pg$i ls1p$i \
            -- acl-add pg$i to-lport 1000 "ip4.src=={\$address, \$address2} && outport == @pg${i}" drop \
            -- lsp-set-addresses ls1p${i} "00:00:00:01:01:${i} 192.168.1.${i}" \
            -- set logical_switch_port ls1p${i} options:requested-chassis=$(cat /etc/openvswitch/system-id.conf)
      done
      
      systemctl start openvswitch
      systemctl start ovn-controller
      ovs-vsctl set open .  external_ids:ovn-remote=tcp:127.0.0.1:6642 \
                            external_ids:ovn-encap-type=geneve \
                            external_ids:ovn-encap-ip=10.224.123.2 \
                            external_ids:ovn-openflow-probe-interval=240
      
      node=ls1
      ovn-sbctl wait-until chassis `cat /etc/openvswitch/system-id.conf`
      ovs-vsctl wait-until port br-int
      ovn-nbctl --wait=hv sync
      
      ip netns add pods
      for i in `seq 10 99`; do
      ovs-vsctl  add-port br-int  pod$i -- set interface pod$i type=internal -- set interface pod$i external_ids:iface-id=ls1p$i
      A=$(ovn-nbctl --bare --columns addresses list logical_switch_port ls1p${i})
      ip link set pod$i netns pods
      mac=${A% *}
      ip=${A#* }
      ip netns exec pods ip link set pod$i address ${mac}
      ip netns exec pods ip link set pod$i up
      ip netns exec pods ip addr add ${ip} dev pod$i
      done
      time ovs-vsctl wait-until interface pod99 external_ids:ovn-installed=true
      
      
      for i in `seq 15` ; do
      ovn-nbctl set address_set address "addresses=`cat a`" && ovn-nbctl --wait=hv sync
      ovn-nbctl set address_set address "addresses=`cat b`" && time ovn-nbctl --wait=hv sync
      done
      

      expected result:
      update of address_sets are instant

      actual result:
      slow and those logs in ovn-controller.log:

      2024-03-14T04:46:14.546Z|00530|inc_proc_eng|INFO|node: logical_flow_output, handler for input addr_sets took 896ms
      2024-03-14T04:46:14.789Z|00531|timeval|WARN|Unreasonably long 1142ms poll interval (1133ms user, 2ms system)
      2024-03-14T04:46:14.789Z|00532|timeval|WARN|disk: 0 reads, 8 writes
      2024-03-14T04:46:14.789Z|00533|timeval|WARN|context switches: 0 voluntary, 2 involuntary
      2024-03-14T04:46:15.845Z|00534|memory_trim|INFO|Detected inactivity (last active 30065 ms ago): trimming memory
      2024-03-14T04:46:15.857Z|00535|timeval|WARN|Unreasonably long 1030ms poll interval (1013ms user, 11ms system)
      2024-03-14T04:46:15.857Z|00536|timeval|WARN|context switches: 0 voluntary, 2 involuntary
      2024-03-14T04:46:16.782Z|00537|inc_proc_eng|INFO|Dropped 1 log messages in last 1 seconds (most recently, 1 seconds ago) due to excessive rate
      2024-03-14T04:46:16.782Z|00538|inc_proc_eng|INFO|node: logical_flow_output, handler for input addr_sets took 901ms
      2024-03-14T04:46:17.012Z|00539|timeval|WARN|Unreasonably long 1133ms poll interval (1081ms user, 44ms system)
      

      Discovered while investigation CNI issue with ovn-kubernetes in case 03746492

              amusil@redhat.com Ales Musil
              frigault Francois Rigault
              Ehsan Elahi Ehsan Elahi
              Votes:
              1 Vote for this issue
              Watchers:
              8 Start watching this issue

                Created:
                Updated:
                Resolved: