Task
Resolution: Done
Critical
Background
By default, Serving sets only the CPU request for the queue proxy, and sets it to 25m.
Memory requests and limits for the queue proxy can be set for all services by enabling the following keys in config-deployment:
    # Sets the queue proxy's memory request.
    # If omitted, no value is specified and the system default is used.
    queue-sidecar-memory-request: "400Mi"
    # Sets the queue proxy's memory limit.
    # If omitted, no value is specified and the system default is used.
    queue-sidecar-memory-limit: "800Mi"
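For orientation, here is a minimal sketch of how those keys could look in the ConfigMap itself. It assumes the upstream default namespace knative-serving and omits all other keys; the values are the illustrative ones from above, not tested recommendations. On operator-managed installs the ConfigMap is typically reconciled by the operator, so a direct edit may be reverted.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-deployment
  namespace: knative-serving  # assumption: upstream default install namespace
data:
  # Memory request applied to the queue proxy of every revision.
  queue-sidecar-memory-request: "400Mi"
  # Memory limit applied to the queue proxy of every revision.
  queue-sidecar-memory-limit: "800Mi"
```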
You can also set this per service via the queue.sidecar.serving.knative.dev/resourcePercentage annotation, but that
does not work as expected: https://github.com/knative/serving/issues/7349#issuecomment-602125160.
The full algorithm for how resources are adjusted is here: https://github.com/openshift-knative/serving/blob/release-v1.7/pkg/reconciler/revision/resources/queue.go#L93
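As an illustration of the per-service route, the annotation is placed on the revision template. This is a sketch only: the service name, namespace, image, and resource values are hypothetical, and the percentage "20" is an arbitrary example.

```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: example-model            # hypothetical name
  namespace: example-project     # hypothetical namespace
spec:
  template:
    metadata:
      annotations:
        # Ask for queue proxy resources as a percentage of the user container's.
        queue.sidecar.serving.knative.dev/resourcePercentage: "20"
    spec:
      containers:
        - image: example.com/model-server:latest  # hypothetical image
          resources:
            requests:
              cpu: "1"
              memory: 2Gi
            limits:
              cpu: "2"
              memory: 4Gi
```

Given the issue linked above, the resulting queue proxy values should not be assumed to be a plain 20% of these numbers; inspecting the queue-proxy container in the created pod is the reliable check.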
RHODS competitor Arrikto has already tested defaults for ML workloads:
https://docs.arrikto.com/user/serving/performance/configure.html#queue-proxy-sidecar
See long thread here: https://redhat-internal.slack.com/archives/CD87JDUB0/p1678287529899759
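Not having defaults prevents users from using resource quotas (customers have asked about this): when a quota covers a resource such as requests.memory or limits.memory, every container in the pod, including the queue proxy, must specify that resource.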
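To make the quota problem concrete, here is a hypothetical ResourceQuota (name, namespace, and values are invented for illustration). With a quota like this in place, Kubernetes rejects any pod containing a container that lacks a memory request or CPU/memory limits, which is the case for a queue proxy that carries only a CPU request, unless something else (for example a LimitRange) injects defaults.

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: example-quota            # hypothetical name
  namespace: example-project     # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi     # requires every container to set a memory request
    limits.cpu: "8"          # requires every container to set a CPU limit
    limits.memory: 16Gi      # requires every container to set a memory limit
```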
As a side note, Service Mesh (SM) sets the following by default for injected sidecars:
    "resources": {
      "limits": {
        "cpu": "2",
        "memory": "1Gi"
      },
      "requests": {
        "cpu": "10m",
        "memory": "128Mi"
      }
    },
User Story
As a user, I would like to be able to configure kservice resources, including those of the queue proxy,
and understand how they relate.
Action items:
- restart the upstream discussion on whether we need to set defaults for common use cases
- document upstream and downstream how resources are configured
Is documented by: SRVKS-1040 [DOC] Document resources for ksvcs including queue proxy (Closed)