Red Hat OpenShift AI Requests for Enhancement / RHOAIRFE-51

[RFE] Revisiting affinities in AcceleratorProfiles

    • Type: Task
    • Resolution: Unresolved
    • Priority: Minor
    • Dashboard

      When we developed AcceleratorProfiles, we decided not to worry about affinities. Since then, a use case has been brought to me that makes enough sense to at least talk about revisiting that decision.

       

      The taints/tolerations workflow works fine as-is for accelerators, as long as every accelerator node of a given type carries the same taint. That requirement is perfectly fine in itself, but it occurred to me that one could also separate generations of an accelerator underneath a single master taint for that accelerator.

       

      For example, one could taint all Gaudi1 and Gaudi2 nodes with a single Gaudi taint, and then use node affinities to select between generations.
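
      To make the Gaudi example concrete, here is a rough sketch of the scheduling fields a workload might end up with: one toleration for the shared taint, plus a required node affinity that picks the generation. The taint key `habana.ai/gaudi`, the label key `example.com/gaudi-generation`, and the helper `scheduling_for` are all made up for illustration; none of this is the AcceleratorProfile schema or an existing dashboard function. Only the Kubernetes pod spec field names are real.

```python
import json

# Hypothetical names, chosen for illustration only.
GAUDI_TAINT_KEY = "habana.ai/gaudi"                # assumed shared taint key
GENERATION_LABEL = "example.com/gaudi-generation"  # assumed node label key


def scheduling_for(generation: str) -> dict:
    """Build pod-spec scheduling fields for one accelerator generation.

    The toleration admits the pod onto any node carrying the shared taint;
    the node affinity then restricts it to nodes labelled with the
    requested generation.
    """
    return {
        "tolerations": [
            {
                "key": GAUDI_TAINT_KEY,
                "operator": "Exists",
                "effect": "NoSchedule",
            }
        ],
        "affinity": {
            "nodeAffinity": {
                "requiredDuringSchedulingIgnoredDuringExecution": {
                    "nodeSelectorTerms": [
                        {
                            "matchExpressions": [
                                {
                                    "key": GENERATION_LABEL,
                                    "operator": "In",
                                    "values": [generation],
                                }
                            ]
                        }
                    ]
                }
            }
        },
    }


if __name__ == "__main__":
    # Print the fragment that would be merged into a pod spec.
    print(json.dumps(scheduling_for("gaudi2"), indent=2))
```

      With something like this, two profiles could share the same toleration and differ only in the affinity value ("gaudi1" vs. "gaudi2").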

       

      It's worth at least a quick talk about whether this is worth looking into. The other main benefit that comes to mind is that it would let users reuse the existing NFD (Node Feature Discovery) labels in their affinities, which could be useful.

            jdemoss@redhat.com Jeff DeMoss
            spryor@redhat.com Sean Pryor
            Andrew Ballantyne, Gage Krumbach
            RHOAI Dashboard