  1. Red Hat OpenShift Data Science
  2. RHODS-6950

[DSG] Unable to scale down GPUs in a workbench when all GPUs are in use

    • Bug
    • Resolution: Done
    • Normal
    • None
    • RHODS_1.22.0_GA
    • Integrations, UI
    • False
    • None
    • False
    • Release Notes
    • Testable
    • No
    • No
      == Unable to scale down a workbench's GPUs when all GPUs in the cluster are being used
      It is not possible to scale down a workbench's GPUs if all GPUs in the cluster are being used. This issue applies whether the GPUs are being used by a single workbench or by multiple workbenches.

      *Workaround*: To work around this issue, perform the following steps:

      . Stop all running workbenches that are using GPUs.
      . Wait until the relevant GPUs are available again.
      . Edit the workbench and scale down the GPU instances.
    • Known Issue
    • Done
    • No
    • Pending
    • None
    • Low

      Description of problem:

      It is not possible to scale down the GPUs of a workbench if all the GPU instances are occupied.

      This happens regardless of whether the GPUs are occupied by the same workbench or by other workbenches.

      Prerequisites (if any, like setup, operators/versions):

      Create a Data Science project.

      Deploy a GPU node (a node with 1 GPU was used to reproduce this bug).

      Steps to Reproduce:

      1. Create and start a workbench (WB) with 1 GPU.
      2. Stop it.
      3. Create a new WB with 1 GPU and launch it (you may need to wait a bit until the GPU becomes available again after step 2).
      4. Edit the first WB to scale its GPUs from 1 to 0, then launch it.

              or

      1. Create and start a WB with 1 GPU.
      2. Edit the WB to scale down the GPUs.

      * The issue is likely to have a bigger impact when there are multiple workbenches with multiple GPUs.
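
The two reproduction paths above can be sketched as a toy model. This is only an illustration of the behavior observed in this report, not the dashboard's actual logic: the `Workbench` class and the `free_gpus` and `can_scale_down` helpers are hypothetical names, and the rule "editing is blocked when no GPU is free" is an assumption drawn from the symptoms described here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Workbench:
    # Hypothetical stand-in for a workbench; not a real RHODS type.
    name: str
    gpus: int              # number of GPUs the workbench requests
    running: bool = False


def free_gpus(total_gpus: int, workbenches: list[Workbench]) -> int:
    """GPUs that are not held by any running workbench."""
    used = sum(wb.gpus for wb in workbenches if wb.running)
    return total_gpus - used


def can_scale_down(total_gpus: int, workbenches: list[Workbench]) -> bool:
    """Assumed rule matching the report: scaling down is blocked when no GPU is free."""
    return free_gpus(total_gpus, workbenches) > 0


# Path 1: the first workbench is stopped, a second one holds the only GPU.
wb1 = Workbench("wb-1", gpus=1, running=False)
wb2 = Workbench("wb-2", gpus=1, running=True)
path_1_blocked = not can_scale_down(1, [wb1, wb2])

# Path 2: a single running workbench holds the only GPU itself.
wb3 = Workbench("wb-3", gpus=1, running=True)
path_2_blocked = not can_scale_down(1, [wb3])
```

In both paths the model reports the edit as blocked, which matches the actual results below. Once the GPU holder is stopped, the same check passes.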

      Actual results:

      It is not possible to scale the workbench's GPUs down to 0 in order to launch it.

      Expected results:

      It is possible to scale a workbench's GPUs down to 0.

      Reproducibility (Always/Intermittent/Only Once):

      Always

      Build Details:

      RHODS v1.22.0-2

      Workaround:

      • Stop all running workbenches that are using GPUs.
      • Wait until the GPUs are reported as available again.
      • Edit the workbench and scale down the GPU instances.
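
The ordering of the workaround steps above can be sketched as a small planning helper. This is a minimal sketch under stated assumptions: `workaround_plan` and its parameters are hypothetical names, it only produces a list of human-readable actions, and it does not call any RHODS or OpenShift API.

```python
from __future__ import annotations


def workaround_plan(gpu_requests: dict[str, int], running: set[str],
                    target: str, new_gpu_count: int) -> list[str]:
    """Return the ordered actions for scaling `target` down to `new_gpu_count`.

    gpu_requests maps workbench name -> GPUs requested (hypothetical input).
    running is the set of workbench names that are currently running.
    """
    if new_gpu_count < 0 or new_gpu_count > gpu_requests[target]:
        raise ValueError("new_gpu_count must be between 0 and the current request")

    plan = []
    # Step 1: stop every running workbench that is using at least one GPU.
    for name in sorted(running):
        if gpu_requests.get(name, 0) > 0:
            plan.append(f"stop {name}")
    # Step 2: wait for the released GPUs to show up as available again.
    plan.append("wait for GPUs to become available")
    # Step 3: edit the target workbench and lower its GPU count.
    plan.append(f"edit {target}: set GPUs to {new_gpu_count}")
    return plan


plan = workaround_plan(
    {"wb-1": 1, "wb-2": 1, "wb-cpu": 0},
    running={"wb-2", "wb-cpu"},
    target="wb-1",
    new_gpu_count=0,
)
```

Workbenches that do not use GPUs (such as the hypothetical `wb-cpu`) are left running, since only GPU holders need to be stopped.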

      Additional info:

              rh-ee-fealonso Federico Alonso
              rhn-support-bdattoma Berto D'Attoma