- Bug
- Resolution: Done
- Major
- None
- None
- 3
- Serverless Sprint 165
Running a performance test with a high number of clients reveals issues with the autoscaler.
Test:
docker run fortio/fortio load -H "Host: observed-concurrencyjwocndoq.serving-tests.example.com" -t 30s -c 50 -qps 0 http://35.185.77.187/?timeout=1000
This runs for 30 seconds with 50 concurrent clients, each sending requests as quickly as possible.
The problem is slow scale-up, because the autoscaler receives biased concurrency numbers.
Root cause: The queue depth is set to the same number as the desired container concurrency level: link . When many requests come in, the queue-proxy starts refusing requests with a 503 error code, and it refuses them immediately once the number of pending requests exceeds queueDepth+maxConcurrency .
Requests that result in a 503 complete very quickly, so they raise the average concurrency across all requests only slowly, even when there are many 503 responses. As a result the autoscaler scales up very slowly, causing the majority of requests to be refused.
One solution I tried (with which I got no 503 responses back) is to make the queueDepth proportional to the concurrency level but much larger than it is now, e.g.
queueDepth := containerConcurrency * 100
It would also be nice to adjust the queueDepth on the fly according to the number of incoming requests, even those that result in a 503 error code.
- is related to: SRVKS-66 [DOC] Review the docs about configuring the autoscaler to check if the difference between containerConcurrency and the /target annotation are documented properly (Closed)
- relates to: SRVKS-81 [DOC] Review existing autoscaler docs and use as basis for user facing docs in knative/docs repo (Closed)