  1. Red Hat OpenShift Data Science
  2. RHODS-4942

Update GPU node taint info in docs


    • Type: Story
    • Resolution: Done
    • Priority: Normal
    • RHODS_1.18.0_GA
    • None
    • Documentation
    • None

      If you taint your GPU machine pool with the nvidia.com/gpu taint, GPU support continues to work and your notebooks can still be scheduled on GPU nodes.
      This is already the documented recommendation:
      https://access.redhat.com/documentation/en-us/red_hat_openshift_data_science/1/html/m[…]ces/enabling-gpu-support-in-openshift-data-science_user-mgmt

      However, the scale-down/scale-up step described in the documentation is no longer required. Taints are now applied to, and removed from, running machines automatically and almost immediately.
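
      To confirm that a taint has reached every node without scaling the pool, one option is to inspect the node list directly. The following is a minimal sketch, not a documented procedure. It assumes the input has the shape of a Kubernetes NodeList (for example, the output of `oc get nodes -o json`), and the node names and the helper name `nodes_missing_taint` are hypothetical.

```python
# Sketch: find nodes that do not yet carry the GPU taint.
# Assumption: node_list has the shape of a Kubernetes NodeList object.
GPU_TAINT_KEY = "nvidia.com/gpu"
GPU_TAINT_EFFECT = "NoSchedule"


def nodes_missing_taint(node_list, key=GPU_TAINT_KEY, effect=GPU_TAINT_EFFECT):
    """Return the names of nodes that do not carry the given taint."""
    missing = []
    for node in node_list.get("items", []):
        # spec.taints is absent (or null) on untainted nodes.
        taints = node.get("spec", {}).get("taints") or []
        has_taint = any(
            t.get("key") == key and t.get("effect") == effect for t in taints
        )
        if not has_taint:
            missing.append(node["metadata"]["name"])
    return missing


# Hypothetical sample data: one tainted node, one untainted node.
SAMPLE_NODES = {
    "items": [
        {
            "metadata": {"name": "gpu-node-a"},
            "spec": {"taints": [{"key": "nvidia.com/gpu", "effect": "NoSchedule"}]},
        },
        {"metadata": {"name": "gpu-node-b"}, "spec": {}},
    ]
}

print(nodes_missing_taint(SAMPLE_NODES))
```

      An empty result would indicate that every listed node carries the taint. In practice the node list would need to be limited to the GPU machine pool, and how to select those nodes depends on the cluster.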

      Most of the second paragraph of the introduction to this module can now be removed:

      Red Hat recommends that you use a separate machine pool for GPU nodes that have the nvidia.com/gpu NoSchedule taint. If you edit an existing machine pool to add this taint, you must first scale the machine pool down to zero nodes, and then increase the machine pool to the number of nodes that you require. This ensures that the new taint is applied to all nodes in the machine pool. To ensure consistent behavior across all nodes in the machine pool, Red Hat recommends that you increase the scale of your machine nodes promptly. As scaling nodes to zero has a disruptive effect on your deployment, Red Hat recommends that you perform this action as soon as possible, while considering your service usage patterns when selecting an appropriate time.

            rhn-support-rwolfgan Riley Wolfgang (Inactive)
            rhn-ecs-lbailey Laura Bailey
            Luca Giorgi