Performance and Scale for AI Platforms / PSAP-523

Prepare slide deck for PerfConf Fall '21 on A100 work


    • Type: Task
    • Resolution: Done
    • July Release for PSAP
    • PSAP Sprint 210

      Abstract:

      NVIDIA Ampere GPUs (A100 and A30) support Multi-Instance GPU (MIG): the GPU can be dynamically partitioned into multiple GPU instances that run in isolation from one another, with guaranteed QoS.

      In the first part of this presentation, we describe our work with NVIDIA to support MIG reconfiguration in the GPU Operator. Reconfiguration is triggered simply by updating a node label.

      In the second part, we present an AI/ML benchmark of the GPU that measures the compute performance of each MIG instance size. We also validate the isolation between instances by running multiple workloads in parallel.
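
      The label-driven reconfiguration mentioned in the abstract could look roughly like the sketch below. It is a minimal illustration, not the implementation presented in the talk. The label key `nvidia.com/mig.config` and the profile name `all-1g.5gb` are assumptions taken from NVIDIA's GPU Operator documentation rather than from this ticket, and the node name `worker-0` and both helper functions are hypothetical. The code only builds the patch body and the equivalent `kubectl` argument list; it does not contact a cluster.

```python
import json

# Assumption: label key watched by the GPU Operator's MIG manager, per NVIDIA's
# documentation. It is not stated in this ticket.
MIG_CONFIG_LABEL = "nvidia.com/mig.config"


def mig_label_patch(profile: str) -> dict:
    """Build a merge-patch body that sets the MIG configuration label on a node."""
    if not profile:
        raise ValueError("profile must be a non-empty string")
    return {"metadata": {"labels": {MIG_CONFIG_LABEL: profile}}}


def kubectl_label_command(node: str, profile: str) -> list[str]:
    """Return the equivalent kubectl invocation as an argument list (not executed)."""
    return [
        "kubectl", "label", "node", node,
        f"{MIG_CONFIG_LABEL}={profile}",
        "--overwrite",  # replace the label if it is already set
    ]


if __name__ == "__main__":
    # "worker-0" and "all-1g.5gb" are illustrative values only.
    print(json.dumps(mig_label_patch("all-1g.5gb")))
    print(" ".join(kubectl_label_command("worker-0", "all-1g.5gb")))
```

      Keeping the patch body separate from the transport (kubectl, or a Kubernetes client library) makes the label change easy to check without access to a GPU node.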

              kpouget2 Kevin Pouget
