  1. Network Observability
  2. NETOBSERV-1298

eBPF deduper enhancement: merge mode

    • Story
    • Resolution: Done
    • eBPF
    • Improvement
    • OCPSTRAT-286 - Improve eBPF agent performance
    • NetObserv - Sprint 243, NetObserv - Sprint 244, NetObserv - Sprint 245

      Our eBPF agent has a deduplication mechanism that can work in two different ways:

      • Mark flows identified as duplicates with a "Duplicate=true" field, so that they are easy to filter out at query time. This is the behaviour used with the NetObserv operator.
      • Eliminate duplicates entirely (they are not sent to FLP).
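
      As a rough illustration of these two modes, here is a minimal Go sketch. It is not the agent's actual code: the FlowKey, Record, Mode and Dedupe names are hypothetical, and it assumes that the first interface on which a flow is seen acts as its reference, so that the same flow reported from any other interface counts as a duplicate.

```go
package main

import "fmt"

// FlowKey is a hypothetical, simplified flow identity; the real agent key
// carries more fields.
type FlowKey struct {
	SrcIP, DstIP     string
	SrcPort, DstPort uint16
}

// Record is one flow as reported from a single interface.
type Record struct {
	Key       FlowKey
	Interface string
	Duplicate bool
}

// Mode selects how duplicates are handled (names are illustrative).
type Mode int

const (
	MarkDuplicates Mode = iota // keep duplicates, flag them with Duplicate=true
	DropDuplicates             // do not forward duplicates at all
)

// Dedupe treats the first interface on which a flow is seen as its reference
// interface (an assumption of this sketch). The same flow reported from any
// other interface is considered a duplicate.
func Dedupe(records []Record, mode Mode) []Record {
	firstSeen := make(map[FlowKey]string)
	out := make([]Record, 0, len(records))
	for _, r := range records {
		ref, known := firstSeen[r.Key]
		if !known {
			firstSeen[r.Key] = r.Interface
			ref = r.Interface
		}
		r.Duplicate = ref != r.Interface
		if r.Duplicate && mode == DropDuplicates {
			continue
		}
		out = append(out, r)
	}
	return out
}

func main() {
	k := FlowKey{SrcIP: "10.0.0.1", DstIP: "10.0.0.2", SrcPort: 40000, DstPort: 443}
	in := []Record{{Key: k, Interface: "genev"}, {Key: k, Interface: "abc1234"}}
	fmt.Println(Dedupe(in, MarkDuplicates))
	fmt.Println(Dedupe(in, DropDuplicates))
}
```

      In mark mode both records are forwarded and only the second carries Duplicate=true; in drop mode the second record never leaves the agent, and its interface information is lost with it.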

      In the operator, we chose the first mechanism because duplicates sometimes carry valuable information: in particular, the "Interface" and "Direction" fields differ across duplicates.

      We could introduce a third dedup mode which is "merging duplicates":

      Flows found to be duplicates are eliminated, but we introduce a new field, "TraversedInterfaces" (or simply "Interfaces", plural), that would hold an array of the interfaces with their direction, such as: ["genev/0", "abc1234/1"]. The FlowDirection and Interface fields would then be removed.
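
      To make the proposed record shape concrete, here is a minimal Go sketch of the merge. Everything in it is hypothetical (FlowKey, Observation, MergedFlow, Merge). It assumes that each flow is reported at most once per interface and direction within a batch, that direction is encoded as 0 for ingress and 1 for egress, and that counters are taken from the first sighting rather than summed, since duplicates describe the same traffic.

```go
package main

import "fmt"

// FlowKey is a hypothetical, simplified flow identity.
type FlowKey struct {
	SrcIP, DstIP     string
	SrcPort, DstPort uint16
}

// Observation is one sighting of a flow on one interface.
type Observation struct {
	Key       FlowKey
	Interface string
	Direction uint8 // assumed encoding: 0 = ingress, 1 = egress
	Bytes     uint64
}

// MergedFlow replaces the per-interface records. The single Interface and
// FlowDirection fields are gone; Interfaces holds "name/direction" entries.
type MergedFlow struct {
	Key        FlowKey
	Bytes      uint64
	Interfaces []string
}

// Merge folds duplicates into one record per flow, preserving input order.
// Counters come from the first sighting only; later sightings contribute
// only their interface entry.
func Merge(obs []Observation) []MergedFlow {
	index := make(map[FlowKey]int)
	var out []MergedFlow
	for _, o := range obs {
		tag := fmt.Sprintf("%s/%d", o.Interface, o.Direction)
		i, seen := index[o.Key]
		if !seen {
			index[o.Key] = len(out)
			out = append(out, MergedFlow{Key: o.Key, Bytes: o.Bytes, Interfaces: []string{tag}})
			continue
		}
		out[i].Interfaces = append(out[i].Interfaces, tag)
	}
	return out
}

func main() {
	k := FlowKey{SrcIP: "10.0.0.1", DstIP: "10.0.0.2", SrcPort: 40000, DstPort: 443}
	merged := Merge([]Observation{
		{Key: k, Interface: "genev", Direction: 0, Bytes: 100},
		{Key: k, Interface: "abc1234", Direction: 1, Bytes: 100},
	})
	fmt.Println(len(merged), merged[0].Bytes, merged[0].Interfaces)
}
```

      Two per-interface records become a single record whose Interfaces field lists both sightings, which is where the reduction in flow count would come from.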

      FLP and the plugin should be updated accordingly.

       

      I can see 2 benefits from that:

      1. Reduce the number of flows in the downstream pipeline and storage (roughly halving it), which also decreases the load on Loki and on queries.
      2. Simplify queries.

      But also some challenges:

      • The deduper cache in the agent is currently independent of the aggregated flows map. That's because a duplicate flow might arrive at the deduper after its original was already flushed out to FLP. In that situation, we cannot update the original flow since it's gone. We might find other tricks for that, e.g. creating a new flow with all counters set to 0. This needs more thought.
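
      The zero-counter idea could look roughly like the Go sketch below. This is only one possible approach under stated assumptions, not a settled design: the Merger type and its fields are hypothetical, the reference cache is never evicted (a real cache would expire entries), and downstream consumers would have to tolerate records that carry interfaces but no traffic.

```go
package main

import "fmt"

// FlowKey is a hypothetical, simplified flow identity.
type FlowKey struct{ Src, Dst string }

// Flow is a merged flow record; Interfaces holds "name/direction" entries.
type Flow struct {
	Key            FlowKey
	Bytes, Packets uint64
	Interfaces     []string
}

// Merger keeps the deduper cache (reference) separate from the aggregated
// flows awaiting flush (pending), mirroring the independence described above.
type Merger struct {
	reference map[FlowKey]string // first interface seen per flow; outlives flushes
	pending   map[FlowKey]*Flow  // aggregated flows not yet flushed
	late      []Flow             // zero-counter records for late duplicates
}

func NewMerger() *Merger {
	return &Merger{reference: map[FlowKey]string{}, pending: map[FlowKey]*Flow{}}
}

func appendUnique(tags []string, tag string) []string {
	for _, t := range tags {
		if t == tag {
			return tags
		}
	}
	return append(tags, tag)
}

// Observe records one sighting of a flow on the interface described by ifTag.
func (m *Merger) Observe(key FlowKey, ifTag string, bytes, packets uint64) {
	ref, known := m.reference[key]
	if !known {
		m.reference[key] = ifTag
		ref = ifTag
	}
	f, inMap := m.pending[key]
	switch {
	case ref == ifTag && inMap: // reference interface: accumulate counters
		f.Bytes += bytes
		f.Packets += packets
	case ref == ifTag: // reference interface, new aggregation window
		m.pending[key] = &Flow{Key: key, Bytes: bytes, Packets: packets, Interfaces: []string{ifTag}}
	case inMap: // duplicate, original still pending: only add the interface
		f.Interfaces = appendUnique(f.Interfaces, ifTag)
	default: // duplicate, original already flushed: emit a zero-counter record
		m.late = append(m.late, Flow{Key: key, Interfaces: []string{ifTag}})
	}
}

// Flush returns late records first, then pending flows (in map order), and
// resets both. The reference cache is deliberately kept.
func (m *Merger) Flush() []Flow {
	out := m.late
	for _, f := range m.pending {
		out = append(out, *f)
	}
	m.late = nil
	m.pending = map[FlowKey]*Flow{}
	return out
}

func main() {
	m := NewMerger()
	k := FlowKey{Src: "10.0.0.1", Dst: "10.0.0.2"}
	m.Observe(k, "genev/0", 100, 1)
	fmt.Println(m.Flush()) // original flow leaves before its duplicate arrives
	m.Observe(k, "abc1234/1", 100, 1)
	fmt.Println(m.Flush()) // late duplicate surfaces as a zero-counter record
}
```

      The trade-off is visible in the second flush: the late duplicate still produces a record, so the reduction in flow count is smaller than in the ideal case, which is one more reason to measure the actual gain.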

      Important: when implementing this, we will need to measure the performance gains that come from reducing the number of flows. If the gain is not as substantial as expected, there's probably no point in merging it.

            mmahmoud@redhat.com Mohamed Mahmoud
            jtakvori Joel Takvorian