  1. Network Observability
  2. NETOBSERV-975

Flows dropped due to Loki stream limit during large traffic spikes

Details

    • False
    • None
    • False
    • Important

    Description

      In recent NetObserv 1.2 performance testing, we have witnessed the following behavior:

      • During load spikes, some eBPF pods are OOMKilled and enter the CrashLoopBackOff state. Notably, the affected pods have been the ones co-located on nodes with LokiStack resources, which have high memory usage.
        • This was observed with both small and medium-sized LokiStacks, and with both the default eBPF memory limit of 800Mi and an increased limit of 1000Mi.
      • Flows continue to be processed, but some are dropped during these spikes.

      The operator recovers once the load spike ends: the eBPF pods recover and flows are written again.

      Opening this bug to track the behavior and gather more data.
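
      As one possible way to gather that data, the sketch below scans a pod list in the JSON shape produced by `kubectl get pods -o json` and reports containers whose last termination reason was OOMKilled, grouped by node. It is a minimal illustration, not part of NetObserv: the function names `find_oomkilled` and `count_by_node` are hypothetical, and only standard Kubernetes Pod fields (`spec.nodeName`, `status.containerStatuses[].lastState.terminated.reason`, `restartCount`) are read.

```python
import json


def find_oomkilled(pod_list_json):
    """Return (pod, node, container, restarts) tuples for containers
    whose most recent termination reason was OOMKilled.

    Hypothetical helper: expects the JSON text of a Kubernetes PodList.
    """
    hits = []
    for pod in json.loads(pod_list_json).get("items", []):
        pod_name = pod["metadata"]["name"]
        node = (pod.get("spec") or {}).get("nodeName", "")
        statuses = (pod.get("status") or {}).get("containerStatuses") or []
        for cs in statuses:
            # lastState.terminated is absent if the container never restarted
            terminated = (cs.get("lastState") or {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                hits.append((pod_name, node, cs["name"], cs.get("restartCount", 0)))
    return hits


def count_by_node(hits):
    """Count OOMKilled containers per node, to compare against the
    nodes that also host LokiStack pods."""
    counts = {}
    for _pod, node, _container, _restarts in hits:
        counts[node] = counts.get(node, 0) + 1
    return counts
```

      To use it, save the pod list from the namespace where the eBPF pods run to a file, read the file into a string, and pass it to `find_oomkilled`. The namespace depends on the deployment and is not stated in this issue. Comparing the output of `count_by_node` with the nodes that host LokiStack pods would show whether the co-location pattern described above holds across runs.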

      Discussions relating to this bug:

      Attachments

        1. image-2023-04-07-15-19-35-241.png
          30 kB
          Nathan Weinberg
        2. image-2023-04-07-15-20-00-557.png
          126 kB
          Nathan Weinberg
        3. image-2023-04-07-15-20-42-156.png
          229 kB
          Nathan Weinberg

            People

              jpinsonn@redhat.com Julien Pinsonneau
              nweinber1 Nathan Weinberg
              Nathan Weinberg Nathan Weinberg
