Knative Serving / SRVKS-45

[DOC] Autoscaler fails to scale up due to low queue size


    Sprint: Serverless Sprint 165

      Running performance tests with a high number of clients reveals issues with the autoscaler.
      Test:

      docker run fortio/fortio load -H "Host: observed-concurrencyjwocndoq.serving-tests.example.com" -t 30s -c 50 -qps 0 http://35.185.77.187/?timeout=1000


      This runs a load test for 30 seconds with 50 concurrent clients, each sending requests as quickly as possible (-qps 0 means unlimited rate).

      The problem is slow scale-up: the autoscaler receives biased concurrency measurements.
      Root cause: the queue depth is set to the same number as the desired container concurrency level: link . When many requests come in, the queue-proxy starts refusing requests with a 503 error code, and it does so as soon as the number of pending requests exceeds queueDepth+maxConcurrency .
      Requests that result in a 503 complete very quickly, so they raise the average concurrency only marginally even when there are many 503 responses. As a result, the autoscaler scales up very slowly, causing the majority of requests to be refused.

      One solution that I tried (and got no 503 responses back) is to keep the queueDepth relative to the concurrency level but make it much larger than it is now, e.g.

       queueDepth := containerConcurrency * 100
      

      It would also be nice to adjust the queueDepth on the fly according to the number of incoming requests, even those that result in a 503 error code.

              joaedwar@redhat.com Joan Edwards
              mgencur Martin Gencur