-
Task
-
Resolution: Done
In the RHODS service definition, and also in the user docs that explain the installation requirements, the current recommendation is to have 2 worker nodes with at least 8 vCPUs and 32 GB of memory per node (for example, AWS instance type m5.2xlarge or larger).
With RHODS 1.24, this configuration allows you to start two Small notebooks and deploy one model using a small model server. Attempting to start additional Small notebooks fails because of insufficient cluster resources:
Pod unschedulable
0/7 nodes are available: 1 Insufficient cpu, 1 Insufficient memory, 2 node(s) had untolerated taint {node-role.kubernetes.io/infra: }, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/7 nodes are available: 2 No preemption victims found for incoming pod, 5 Preemption is not helpful for scheduling.
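To make the scheduler message above easier to read, here is a minimal Python sketch that splits it into per-reason node counts. It assumes the message keeps the shape shown above (a "nodes are available:" prefix, comma-separated reasons, then a "preemption:" clause); that wording is not a stable interface, and the function name `summarize_unschedulable` is hypothetical.

```python
import re


def summarize_unschedulable(message: str) -> dict[str, int]:
    """Return {reason: node_count} for the first clause of the message.

    Assumes the message shape quoted in this ticket; other scheduler
    versions may word it differently.
    """
    # Keep only the part before the "preemption:" clause.
    head = message.split(" preemption:")[0]
    # Drop the "0/7 nodes are available:" prefix.
    _, _, reasons = head.partition("nodes are available:")
    summary: dict[str, int] = {}
    for part in reasons.split(","):
        # Each part looks like "<count> <reason>", optionally ending in ".".
        match = re.match(r"\s*(\d+)\s+(.*?)\.?\s*$", part)
        if match:
            summary[match.group(2)] = int(match.group(1))
    return summary


MESSAGE = (
    "0/7 nodes are available: 1 Insufficient cpu, 1 Insufficient memory, "
    "2 node(s) had untolerated taint {node-role.kubernetes.io/infra: }, "
    "3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. "
    "preemption: 0/7 nodes are available: 2 No preemption victims found "
    "for incoming pod, 5 Preemption is not helpful for scheduling."
)

for reason, count in summarize_unschedulable(MESSAGE).items():
    print(f"{count} node(s): {reason}")
```

Read this way, the five infra and master nodes are excluded by taints, so only the two worker nodes are candidates, and each of them reports a CPU or memory shortfall.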
With RHODS 1.25, the inclusion of the data-science-pipelines-operator leaves even fewer resources available, so only two notebooks can be started simultaneously (and the model cannot be deployed).
I think the service definition and the user docs should be updated to explain that 2 worker nodes with at least 8 vCPUs and 32 GB of memory each are the bare minimum for installation, and that actual usage of RHODS requires additional cluster resources.
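The gap between "enough to install" and "enough to use" can be illustrated with a rough capacity estimate. The Python sketch below is only an illustration: the node size (8 vCPUs, 32 GB) comes from this ticket, but the amount already requested by platform and RHODS components and the per-notebook requests are hypothetical placeholders, not measured values. Real numbers would have to be read from the cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    cpu: float        # vCPUs
    memory_gb: float  # GB of memory


def pods_that_fit(node: Resources, reserved: Resources, pod: Resources) -> int:
    """Estimate how many identical pods fit on one node.

    `reserved` is what other workloads on the node already request.
    The scarcer of CPU and memory decides the result.
    """
    free_cpu = node.cpu - reserved.cpu
    free_mem = node.memory_gb - reserved.memory_gb
    if free_cpu <= 0 or free_mem <= 0:
        return 0
    return int(min(free_cpu // pod.cpu, free_mem // pod.memory_gb))


# Node size recommended in the docs (from this ticket).
WORKER = Resources(cpu=8, memory_gb=32)
WORKER_COUNT = 2

# HYPOTHETICAL placeholders: replace with values observed on a real cluster.
ALREADY_REQUESTED = Resources(cpu=6.5, memory_gb=20)
NOTEBOOK_REQUEST = Resources(cpu=1, memory_gb=8)

per_node = pods_that_fit(WORKER, ALREADY_REQUESTED, NOTEBOOK_REQUEST)
print(f"Estimated notebooks per node: {per_node}")
print(f"Estimated notebooks on {WORKER_COUNT} workers: {per_node * WORKER_COUNT}")
```

The estimate ignores scheduling details such as affinity rules and pods that could be spread unevenly, so it is only a way to sanity-check a sizing recommendation, not a replacement for testing on a real cluster.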
Reported by: jorge-rhods
- is related to
-
RHODS-7944 Remove kfdef for DSPO from odh-deployer
- Closed
- relates to
-
RHODS-7943 Reduce DSPO resource requests
- Closed
-
RHODS-12317 Add ability to modify replica counts for individual components
- Backlog