  1. OpenShift Container Platform (OCP) Strategy
  2. OCPSTRAT-1740

Enabling AI Workloads with LeaderWorkerSet (LWS) API in OpenShift


    • BU Product Work
    • 100% To Do, 0% In Progress, 0% Done

      Feature Summary:
      The LeaderWorkerSet (LWS) API deploys and manages a group of pods as a single unit of replication, sometimes called a "super pod." It is especially suited to AI/ML inference workloads in which a large language model (LLM) must be sharded across multiple devices and nodes (multi-host inference). With LWS, OpenShift can manage these distributed inference workloads: each group has a single leader pod that coordinates multiple worker pods, and the group is created and scaled as a unit, which streamlines orchestration for AI tasks with high compute and memory demands.
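      As a rough illustration of the leader/worker grouping described above, the sketch below builds a LeaderWorkerSet manifest as a Python dictionary and prints it as JSON. The field names follow the upstream kubernetes-sigs/lws API (`leaderworkerset.x-k8s.io/v1`) as we understand it and should be checked against the release in use. The helper `build_lws_manifest`, the image name, the `role` label, and the GPU counts are hypothetical placeholders, not part of this feature request.

```python
import json


def build_lws_manifest(name, image, replicas, group_size, gpus_per_pod):
    """Return a LeaderWorkerSet manifest as a plain dict.

    replicas   -- number of leader/worker groups ("super pods")
    group_size -- pods per group, counting the leader (1 leader + N-1 workers)
    """

    def pod_template(role):
        # Leader and workers share one container spec here for brevity;
        # real deployments usually give them different commands.
        return {
            "metadata": {"labels": {"role": role}},  # hypothetical label
            "spec": {
                "containers": [
                    {
                        "name": f"inference-{role}",
                        "image": image,
                        "resources": {
                            "limits": {"nvidia.com/gpu": str(gpus_per_pod)}
                        },
                    }
                ]
            },
        }

    return {
        "apiVersion": "leaderworkerset.x-k8s.io/v1",
        "kind": "LeaderWorkerSet",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "leaderWorkerTemplate": {
                "size": group_size,
                "leaderTemplate": pod_template("leader"),
                "workerTemplate": pod_template("worker"),
            },
        },
    }


# Placeholder values: 2 model replicas, each spanning 4 pods with 8 GPUs.
manifest = build_lws_manifest(
    name="llm-inference",
    image="example.com/inference-server:latest",  # hypothetical image
    replicas=2,
    group_size=4,
    gpus_per_pod=8,
)
total_pods = manifest["spec"]["replicas"] * manifest["spec"]["leaderWorkerTemplate"]["size"]

print(json.dumps(manifest, indent=2))
```

      The printed JSON could be saved to a file and applied with `oc apply -f`, assuming the LWS controller and its CRD are installed on the cluster.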

      Use Case:
      For AI workloads that require distributed inference, such as LLMs or other deep learning models sharded across devices, LWS provides a structured way to orchestrate model replicas, each made up of a leader and its workers in a defined topology. OpenShift users can deploy AI workloads whose models are divided across multiple nodes, with the flexibility, scalability, and fault tolerance needed to serve large-scale inference requests efficiently.
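      To make the sharding arithmetic concrete, the following back-of-the-envelope sketch estimates how many pods one model replica (one LWS group) might need. It assumes one pod per node and counts only model weights, ignoring KV cache, activations, and runtime overhead, so real sizing will be larger. The function `plan_group` and all numbers are hypothetical.

```python
import math


def plan_group(model_gib, gpu_gib, gpus_per_node):
    """Estimate the LWS group size for one sharded model replica.

    Assumes one pod per node and that weights are split evenly across GPUs.
    """
    gpus_needed = math.ceil(model_gib / gpu_gib)
    nodes_needed = math.ceil(gpus_needed / gpus_per_node)
    return {
        "gpus_per_group": gpus_needed,
        "group_size": nodes_needed,  # leader + workers
        "workers_per_group": nodes_needed - 1,
    }


# Hypothetical example: 1000 GiB of weights, 80 GiB GPUs, 8 GPUs per node.
plan = plan_group(model_gib=1000, gpu_gib=80, gpus_per_node=8)
replicas = 3  # hypothetical number of model replicas
total_pods = replicas * plan["group_size"]

print(plan, total_pods)
```

      Under these assumptions the model does not fit on a single node, which is the situation where a leader plus at least one worker per replica becomes necessary.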

      https://github.com/kubernetes-sigs/lws 

      https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/llamacpp 

      https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm/GPU 

            gausingh@redhat.com Gaurav Singh
