FDP-399

ovn-controller hangs with a lot of meters


      Running OVN from branch-23.09 with a large number of ACLs results in a large number of meter objects:

      # ovn-sbctl list meter | grep -c _uuid
      90017
      # ovn-sbctl list meter_band | grep -c _uuid
      90017
      # ovn-sbctl list address_set | grep -c _uuid
      33038
      # ovn-sbctl list port_group | grep -c _uuid
      816
      

      It takes a long time (10s) for external_ids:ovn-installed=true to be set on an interface, even when no other operation runs in parallel, and ovn-controller uses 100% CPU during that time.

      The CPU usage comes from the complexity of the ofctrl_meter_bands_sync function, which is itself called in a loop:
      https://github.com/ovn-org/ovn/blob/68acb363cad9932f3cec14bc402c39bd343d024d/controller/ofctrl.c#L2739
      Each call iterates over every row of the meter table, doing a strcmp every time to locate the right row.
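
      To illustrate the cost, here is a minimal sketch in C, not the ovn-controller code and not necessarily how this should be fixed. It contrasts a strcmp scan over all rows with a lookup through a name-keyed hash index. The types and functions (struct meter_row, struct meter_index, lookup_linear, lookup_indexed, and so on) are hypothetical stand-ins, not OVN or OVSDB IDL APIs. Assuming one full scan per meter, 90017 meters would mean up to about 90017 x 90017 = 8.1 x 10^9 string comparisons in the worst case, which is consistent with __strcmp_avx2 dominating the profile.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a meter row; NOT the OVN IDL row type. */
struct meter_row {
    const char *name;
    struct meter_row *next_in_bucket;   /* Hash chain link. */
};

/* Pattern described in the report: scan every row, one strcmp per row.
 * O(N) per lookup, so O(N^2) when called once per meter. */
static const struct meter_row *
lookup_linear(const struct meter_row *rows, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (!strcmp(rows[i].name, name)) {
            return &rows[i];
        }
    }
    return NULL;
}

/* FNV-1a string hash (32-bit). */
static uint32_t
hash_name(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char) *s;
        h *= 16777619u;
    }
    return h;
}

/* Hypothetical name -> row index built once per sync pass. */
struct meter_index {
    struct meter_row **buckets;
    size_t n_buckets;                   /* Always a power of two. */
};

/* Builds the index in O(N).  Returns 0 on success, -1 on allocation failure. */
static int
meter_index_build(struct meter_index *idx, struct meter_row *rows, size_t n)
{
    size_t n_buckets = 16;
    while (n_buckets < n * 2) {
        n_buckets *= 2;
    }
    idx->buckets = calloc(n_buckets, sizeof *idx->buckets);
    if (!idx->buckets) {
        return -1;
    }
    idx->n_buckets = n_buckets;
    for (size_t i = 0; i < n; i++) {
        size_t b = hash_name(rows[i].name) & (n_buckets - 1);
        rows[i].next_in_bucket = idx->buckets[b];
        idx->buckets[b] = &rows[i];
    }
    return 0;
}

/* Expected O(1) lookup: strcmp only against rows in the same bucket. */
static const struct meter_row *
lookup_indexed(const struct meter_index *idx, const char *name)
{
    size_t b = hash_name(name) & (idx->n_buckets - 1);
    for (const struct meter_row *r = idx->buckets[b]; r;
         r = r->next_in_bucket) {
        if (!strcmp(r->name, name)) {
            return r;
        }
    }
    return NULL;
}

static void
meter_index_destroy(struct meter_index *idx)
{
    free(idx->buckets);
    idx->buckets = NULL;
    idx->n_buckets = 0;
}
```

      In this sketch, a caller would build the index once before the loop and call lookup_indexed for each meter, so the total work grows roughly linearly with the number of meters instead of quadratically.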

      "perf top" at the time:

        73.77%  libc.so.6       [.] __strcmp_avx2
        18.31%  ovn-controller  [.] ovsdb_idl_next_row
         6.63%  ovn-controller  [.] ofctrl_meter_bands_sync
      

      Taking a few core dumps during the high CPU usage also shows ovn-controller stuck in ofctrl_meter_bands_sync.

      After commenting out line 2739 above, the ovn-installed=true annotation is added in 0.05s.

      This is reproduced with ovn23.09-23.09.0-91.el9fdp.x86_64 and branch-23.09.
      The problem was initially seen on ovn-kubernetes and is reported in https://access.redhat.com/support/cases/#/case/03746492

      Due to the use of egressfirewall, address sets change every few seconds, so ovn-controller is seen constantly consuming CPU on the node.

              imaximet@redhat.com Ilya Maximets
              frigault Francois Rigault
              Ehsan Elahi Ehsan Elahi