  OpenShift Container Platform (OCP) Strategy
  OCPSTRAT-1887

Tech Preview: Dynamic Accelerator Slicer Operator (fka: InstaSlice)

    • OCPSTRAT-1692 AI Workloads for OpenShift

      Feature Overview (aka. Goal Summary)  

      As an OpenShift administrator looking to run AI workloads on the platform, I need to use GPUs efficiently because these resources are expensive. While NVIDIA GPUs can be pre-sliced (e.g. with MIG) for multiple workloads, this static approach can waste resources when the slicing does not match actual workload demands.

      Therefore, I want to dynamically slice the GPU based on the specific requirements of each workload, ensuring optimal utilization and minimizing resource waste.
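
      For illustration, a workload would express its GPU needs through ordinary resource requests, and the operator would carve out a matching MIG slice on demand rather than relying on a pre-sliced layout. Below is a minimal sketch in Go, assuming the slice is surfaced as an extended resource; the resource name nvidia.com/mig-1g.5gb follows NVIDIA's MIG naming and is used purely as an example (the exact resource name exposed by this operator may differ), and the namespace and image are placeholders.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// buildInferencePod sketches a workload that asks for a single MIG slice.
// With dynamic slicing, a slice of this shape would only need to exist
// while a pod like this is actually running.
func buildInferencePod() *corev1.Pod {
	return &corev1.Pod{
		ObjectMeta: metav1.ObjectMeta{
			Name:      "gpu-inference",
			Namespace: "ai-workloads", // hypothetical namespace
		},
		Spec: corev1.PodSpec{
			RestartPolicy: corev1.RestartPolicyNever,
			Containers: []corev1.Container{{
				Name:  "inference",
				Image: "quay.io/example/inference:latest", // placeholder image
				Resources: corev1.ResourceRequirements{
					Limits: corev1.ResourceList{
						// Illustrative extended-resource name; the actual
						// resource name surfaced by the operator may differ.
						corev1.ResourceName("nvidia.com/mig-1g.5gb"): resource.MustParse("1"),
					},
				},
			}},
		},
	}
}

func main() {
	pod := buildInferencePod()
	fmt.Printf("pod %s requests %v\n", pod.Name, pod.Spec.Containers[0].Resources.Limits)
}
```

      The intent described above is that the slice exists only while a workload of this shape actually needs it, instead of being carved out ahead of time.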


      Goal 

      Tech Preview (TP) in OpenShift 4.19

      Acceptance criteria 

      • The operator can run on infra or worker nodes
      • The operator should not modify the machine config (MachineConfig)
      • Can be installed in a namespace other than the openshift-* namespaces (see the install sketch after this list)
      • Is built and tested via Konflux
      • Is FIPS compliant
      • Works in disconnected environments
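
      To illustrate the namespace criterion above: one common way to install an operator into a custom, non openshift-* namespace is an OLM OperatorGroup plus Subscription. The Go sketch below builds those two objects with the operator-framework API types; the namespace, package name, channel, and catalog source are assumptions used only for illustration, not confirmed values for this operator.

```go
package main

import (
	"fmt"

	operatorsv1 "github.com/operator-framework/api/pkg/operators/v1"
	operatorsv1alpha1 "github.com/operator-framework/api/pkg/operators/v1alpha1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

const ns = "das-operator" // hypothetical non openshift-* namespace

// operatorGroup scopes OLM installation to the custom namespace.
func operatorGroup() *operatorsv1.OperatorGroup {
	return &operatorsv1.OperatorGroup{
		ObjectMeta: metav1.ObjectMeta{Name: "das-operator-group", Namespace: ns},
		Spec: operatorsv1.OperatorGroupSpec{
			TargetNamespaces: []string{ns},
		},
	}
}

// subscription installs the operator from a catalog; the package, channel,
// and catalog names below are placeholders, not confirmed values.
func subscription() *operatorsv1alpha1.Subscription {
	return &operatorsv1alpha1.Subscription{
		ObjectMeta: metav1.ObjectMeta{Name: "dynamic-accelerator-slicer", Namespace: ns},
		Spec: &operatorsv1alpha1.SubscriptionSpec{
			Package:                "dynamic-accelerator-slicer", // assumed package name
			Channel:                "tech-preview",               // assumed channel
			CatalogSource:          "redhat-operators",
			CatalogSourceNamespace: "openshift-marketplace",
			InstallPlanApproval:    operatorsv1alpha1.ApprovalAutomatic,
		},
	}
}

func main() {
	og, sub := operatorGroup(), subscription()
	fmt.Printf("OperatorGroup %s and Subscription %s target namespace %s\n",
		og.Name, sub.Name, ns)
}
```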

      Non-Goals:

      • Share MIG slices among multiple containers
      • Achieve scheduling latency below 5 s (needs help from the RH team)
      • Single-node OpenShift (SNO)/MicroShift testing is out of scope

              Gaurav Singh (gausingh@redhat.com)
              Harshal Patil, Sai Ramesh Vanka
              Harshal Patil
              Aruna Naik
              Daniel Macpherson
              Mrunal Patel
