  1. Network Observability
  2. NETOBSERV-1458

NetObserv drops flows due to Loki 429 Too Many Requests

    • NetObserv - Sprint 248
    • Important

      NOTE: This is a different 429 error than the one described in NETOBSERV-975

      Description of problem:

      When running our large-scale PerfScale scenario with NetObserv 1.5, we see a large number of flows dropped because Loki returns HTTP 429 (Too Many Requests) with the message "Maximum active stream limit exceeded".

      Steps to Reproduce:

      1. Deploy an OCP 4.14 cluster and scale to 120 nodes
      2. Install NetObserv 1.5, Loki Operator with a 1x.medium LokiStack, and AMQ Streams Operator
      3. Run the cluster-density-v2 workload with a variable of 480
      

      Actual results:

      Flows are dropped due to the following error (seen on various FLP pods):
      
      time=2024-01-23T19:31:21Z level=info component=client error=server returned HTTP status 429 Too Many Requests (429): Maximum active stream limit exceeded, reduce the number of active streams (reduce labels or reduce label values), or contact your Loki administrator to see if the limit can be increased, user: 'network' fields.level=warn fields.msg=error sending batch, will retry host=lokistack-gateway-http.netobserv.svc:8080 module=export/loki status=429
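To gauge how widespread the error is across FLP pods, the log lines can be tallied by host, status, and whether they carry the stream-limit message. The following is a minimal sketch, not part of NetObserv: the function name `summarize_loki_errors` is hypothetical, and it assumes FLP log lines keep the `key=value` shape shown above (in particular the `module=export/loki`, `host=` and trailing `status=` fields).

```python
import re
from collections import Counter

# Field patterns, assumed from the single FLP log line quoted in this issue.
STATUS_RE = re.compile(r"\bstatus=(\d{3})\b")
HOST_RE = re.compile(r"\bhost=(\S+)")
STREAM_LIMIT_MARKER = "Maximum active stream limit exceeded"


def summarize_loki_errors(lines):
    """Count Loki export errors by (host, HTTP status, is_stream_limit).

    `lines` is any iterable of log lines, e.g. an open file or sys.stdin.
    Lines that do not come from the Loki exporter are ignored.
    """
    counts = Counter()
    for line in lines:
        if "module=export/loki" not in line:
            continue  # not a Loki exporter message
        status = STATUS_RE.search(line)
        if status is None:
            continue  # exporter message without a status field
        host = HOST_RE.search(line)
        key = (
            host.group(1) if host else "unknown",
            int(status.group(1)),
            STREAM_LIMIT_MARKER in line,
        )
        counts[key] += 1
    return counts
```

Feeding it the saved logs of each FLP pod (for example `summarize_loki_errors(open(path))`) separates stream-limit 429s from any other 429s, which matters here because this issue is distinct from the 429 described in NETOBSERV-975.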

      Expected results:

      No flows should be dropped

      Additional Info:

      This was seen in performance runs 0dc5303c-301d-4d1a-8c4c-0d7ef100b5dc and 911c279c-5c58-49b9-82ac-a61508262c44. The environment details from the latter, as well as a must-gather, are below/attached. Additional data from those runs can be found here

      OCP: 4.14.0-0.nightly-2024-01-18-061723
      NetObserv operator: v1.5.0
      Loki: v5.8.2
      eBPF-agent: v1.5.0-76
      FLP: v1.5.0-76
      ConsolePlugin: v1.5.0-76 

      must-gather: https://drive.google.com/file/d/1kTxe4dElC_FJ5ipL_QINngNaNag_IuRU/view?usp=drive_link

            jtakvori Joel Takvorian
            nweinber1 Nathan Weinberg
            Nathan Weinberg Nathan Weinberg
