Type: Feature
Resolution: Unresolved
Priority: Critical
Category: Strategic Portfolio Work
Progress: 25% To Do, 75% In Progress, 0% Done
Feature Summary:
The LeaderWorkerSet (LWS) API is designed for deploying and managing groups of pods as a unified replication unit, known as a "super pod." This capability is especially suited for AI/ML inference workloads, where large language models (LLMs) and multi-host inference workflows require sharded models across multiple devices and nodes. The LWS API allows OpenShift to manage distributed inference workloads, where a single leader pod coordinates multiple worker pods, enabling streamlined orchestration for complex AI tasks with high compute and memory demands.
Use Case:
For AI workloads that require distributed inference—such as LLMs or deep learning models with sharding across devices—LWS provides a structured way to orchestrate model replicas with both leaders and workers in a defined topology. This feature enables OpenShift users to deploy sharded AI workloads where models are divided across multiple nodes, providing the flexibility, scalability, and fault tolerance necessary to process large-scale inference requests efficiently.
https://github.com/kubernetes-sigs/lws
https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/llamacpp
https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm/GPU
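The leader/worker topology described above can be sketched as a minimal LeaderWorkerSet manifest (a sketch based on the upstream API; the name, group sizes, and image references are illustrative placeholders, not a tested configuration):

```yaml
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  name: sharded-inference        # placeholder name
spec:
  replicas: 2                    # two replicated groups ("super pods")
  leaderWorkerTemplate:
    size: 4                      # 1 leader + 3 workers per group
    leaderTemplate:
      spec:
        containers:
        - name: leader
          image: example.com/inference-leader:latest   # placeholder image
    workerTemplate:
      spec:
        containers:
        - name: worker
          image: example.com/inference-worker:latest   # placeholder image
```

Each replica is scheduled and scaled as a unit: the leader pod coordinates the group, and the workers hold the model shards, matching the vLLM and llama.cpp examples linked above.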
Requirements for the operator:
- 1) Disconnected environment support
- 2) FIPS compliance
- 3) Multi-architecture support -> Arm
- 4) HCP -> ability to run the operator on infra/worker nodes
- 5) Built and tested via Konflux
- 6) Ability to deploy the operator in a non-openshift namespace
Hypershift ROSA/ARO/OSD requirements -> for all operators:
- operator can run on infra/worker nodes
- does not modify the MachineConfig
- can be installed in non *openshift namespaces
- is built and tested via Konflux