  1. Network Observability
  2. NETOBSERV-975

Flows dropped due to Loki stream limit during large traffic spikes

    • Important

      During recent NetObserv 1.2 performance testing, we observed the following behavior:

      • During load spikes, some eBPF pods are OOMKilled and enter the CrashLoopBackOff state. Notably, the affected pods have consistently been those co-located on nodes running LokiStack resources, which have high memory usage.
        • This was observed with both small and medium-sized LokiStacks, and with both the default eBPF memory limit of 800Mi and an increased limit of 1000Mi.
      • Flows continue to be processed, but some are dropped during these spike periods.

      The operator recovers once the load spikes end: the eBPF pods restart successfully and flows are written again.

      Opening this bug to track the behavior and gather more data.
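
      To help gather that data, here is a minimal sketch of a helper that lists pods whose containers were last terminated with reason OOMKilled, together with the node each pod runs on, so they can be compared against the nodes hosting LokiStack resources. It only parses the JSON printed by `kubectl get pods -n <namespace> -o json`. The function name and the shape of the returned records are hypothetical and not part of NetObserv; the fields read (`status.containerStatuses[].lastState.terminated.reason`, `state.waiting.reason`, `spec.nodeName`) are standard Kubernetes Pod fields.

```python
import json


def find_oomkilled_pods(pod_list_json: str) -> list[dict]:
    """Return one record per container whose last termination was OOMKilled.

    `pod_list_json` is the raw output of `kubectl get pods -o json`
    (a Kubernetes PodList). The record layout is a hypothetical choice
    made for this sketch.
    """
    pod_list = json.loads(pod_list_json)
    hits = []
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        spec = pod.get("spec", {})
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for cs in statuses:
            # lastState.terminated describes the previous container instance.
            terminated = cs.get("lastState", {}).get("terminated") or {}
            # state.waiting is set while the container is backing off.
            waiting = cs.get("state", {}).get("waiting") or {}
            if terminated.get("reason") == "OOMKilled":
                hits.append({
                    "pod": meta.get("name"),
                    "node": spec.get("nodeName"),
                    "container": cs.get("name"),
                    "restarts": cs.get("restartCount", 0),
                    "crash_looping": waiting.get("reason") == "CrashLoopBackOff",
                })
    return hits
```

      The namespace to query depends on the deployment (the agent pods typically run in a separate privileged namespace; treat the exact name as an assumption to verify on the cluster). Grouping the returned records by `node` gives a quick view of whether the OOMKilled pods cluster on the LokiStack nodes.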

      Discussions relating to this bug:

            jpinsonn@redhat.com Julien Pinsonneau
            nweinber1 Nathan Weinberg