-
Story
-
Resolution: Done
-
Major
-
RHODS_1.4.0_GA
-
2
-
False
-
False
-
Release Notes
-
No
-
1.3.0-6
-
No
-
Pods belonging to the same OpenShift Data Science service now try to avoid being scheduled on the same node, which makes OpenShift Data Science components more resilient to node failure.
-
Enhancement
-
No
-
Yes
-
None
-
IDH Sprint 12, IDH Sprint 13
We need to add pod anti-affinity rules for all the RHODS application pods. At a cursory glance, the list comprises:
- JupyterHub server
- Traefik proxy
- RHODS dashboard
We can (and should) go upstream first on this.
Without this change, OpenShift may schedule multiple pods for the same service (e.g. several ODH dashboard, Traefik, or JupyterHub server pods) on the same underlying OpenShift node. With this change, OpenShift prefers to schedule them on separate nodes but falls back to running them on the same node when separate nodes are not available. This introduces no functional change; it only makes the RHODS components more resilient to OpenShift node failure or downtime.
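
The "prefer separate nodes, but fall back" behavior described above matches the soft form of Kubernetes pod anti-affinity (`preferredDuringSchedulingIgnoredDuringExecution`) with the node hostname as the topology key. The following is a minimal sketch of what such a stanza could look like, built as a plain Python dictionary so it can be serialized into a manifest. The helper names, the example Deployment, and the `app: rhods-dashboard` label are assumptions for illustration only; they are not taken from the actual RHODS or ODH manifests.

```python
import copy
import json


def preferred_anti_affinity(match_labels, weight=100):
    """Build a soft (preferred) pod anti-affinity stanza.

    The scheduler tries to avoid nodes that already run a pod matching
    `match_labels`, but still places the pod there if no other node fits.
    """
    # Kubernetes only accepts weights between 1 and 100 for preferred terms.
    if not 1 <= weight <= 100:
        raise ValueError("weight must be in the range 1-100")
    return {
        "podAntiAffinity": {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {
                    "weight": weight,
                    "podAffinityTerm": {
                        "labelSelector": {"matchLabels": dict(match_labels)},
                        # One topology domain per node, so replicas spread across nodes.
                        "topologyKey": "kubernetes.io/hostname",
                    },
                }
            ]
        }
    }


def add_anti_affinity(deployment, weight=100):
    """Return a copy of a Deployment dict whose pods avoid their own siblings."""
    patched = copy.deepcopy(deployment)
    pod_template = patched["spec"]["template"]
    # Select on the pod template's own labels (hypothetical choice for this sketch).
    labels = pod_template["metadata"]["labels"]
    pod_template["spec"]["affinity"] = preferred_anti_affinity(labels, weight)
    return patched


if __name__ == "__main__":
    # Hypothetical Deployment; name and labels are placeholders.
    dashboard = {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": "rhods-dashboard"},
        "spec": {
            "replicas": 2,
            "template": {
                "metadata": {"labels": {"app": "rhods-dashboard"}},
                "spec": {"containers": []},
            },
        },
    }
    print(json.dumps(add_anti_affinity(dashboard), indent=2))
```

Using the preferred rather than the required (`requiredDuringSchedulingIgnoredDuringExecution`) form is what keeps the pods schedulable on clusters with fewer nodes than replicas, which is why no functional change is expected.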