Multiple Architecture Enablement / MULTIARCH-4970

Cluster-wide architecture preferred/weighted affinity



      Epic Goal

      • Add a new API field that sets the preferredAffinity alongside the requiredAffinity, so that users can fine-tune how workloads supporting multiple architectures are distributed in a mixed-architecture cluster.

      Why is this important?

      • Users will be able to favor some architectures over others when workloads are scheduled.
      • In the amd64 + arm64 case, this supports a cost-effective deployment by prioritizing arm64 worker nodes and using amd64 nodes primarily for workloads that cannot run on arm64.

      Scenarios
      1. [cost-reduction with arm64] Arm64 CP + Amd64 Workers + Arm64 Workers: minimize the use of amd64 workers by using them primarily for workloads that can't run on arm64.
      2. [P AI accelerator]: reduce the load on P workers when AI workloads need the accelerator, and avoid wasted capacity when no AI jobs are running: P workers should not sit idle merely because other workloads are kept off them by taints/tolerations or requiredAffinity.
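
      Scenario 1 can be illustrated with a simplified model of weighted node selection. This is only a sketch under stated assumptions: the node names, the `pick_node` helper, and the weight values are hypothetical, and the real Kubernetes scheduler combines the node-affinity score with the scores of other plugins (resources, spreading, and so on) rather than using it alone.

```python
from typing import Dict, Optional, Set


def pick_node(
    nodes: Dict[str, str],       # node name -> architecture (hypothetical inventory)
    supported: Set[str],         # architectures the workload image supports
    preferred: Dict[str, int],   # architecture -> preference weight
) -> Optional[str]:
    """Pick the feasible node with the highest architecture preference weight.

    The `supported` filter plays the role of the required affinity
    (a hard constraint); `preferred` plays the role of the weighted
    preferred affinity (a soft constraint).
    """
    feasible = {name: arch for name, arch in nodes.items() if arch in supported}
    if not feasible:
        return None
    # Iterate in sorted order so that ties are broken deterministically.
    return max(sorted(feasible), key=lambda name: preferred.get(feasible[name], 0))


nodes = {"worker-amd64-0": "amd64", "worker-arm64-0": "arm64"}
weights = {"arm64": 50}  # assumed cluster-wide preference for arm64

# A multi-arch workload lands on arm64; an amd64-only workload still runs on amd64.
multi_arch_choice = pick_node(nodes, {"amd64", "arm64"}, weights)
amd64_only_choice = pick_node(nodes, {"amd64"}, weights)
```

      In this model the amd64 workers are used only when the workload cannot run on arm64, which is the behavior the scenario asks for.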

      Acceptance Criteria

      • A new API is added to automatically set the preferredAffinity cluster-wide with weights chosen by the user
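
      As a rough illustration of what such an API would have to produce, the sketch below turns user-chosen weights into the `preferredDuringSchedulingIgnoredDuringExecution` terms of a pod's node affinity, keyed on the well-known `kubernetes.io/arch` node label. The helper name `preferred_arch_terms` and the shape of its input are assumptions made for this example; the epic does not define the API yet. The 1-100 weight range is the one Kubernetes enforces for preferred scheduling terms.

```python
from typing import Dict, List

ARCH_LABEL = "kubernetes.io/arch"  # well-known Kubernetes node label


def preferred_arch_terms(weights: Dict[str, int]) -> List[dict]:
    """Build one preferred scheduling term per architecture.

    `weights` maps an architecture name (e.g. "arm64") to a weight.
    This input format is hypothetical, not the epic's final API.
    """
    terms = []
    for arch, weight in sorted(weights.items()):
        # Kubernetes only accepts weights in the range 1-100.
        if not 1 <= weight <= 100:
            raise ValueError(f"weight for {arch!r} must be in 1..100, got {weight}")
        terms.append(
            {
                "weight": weight,
                "preference": {
                    "matchExpressions": [
                        {"key": ARCH_LABEL, "operator": "In", "values": [arch]}
                    ]
                },
            }
        )
    return terms


# Example: strongly prefer arm64, weakly prefer amd64.
affinity = {
    "nodeAffinity": {
        "preferredDuringSchedulingIgnoredDuringExecution": preferred_arch_terms(
            {"arm64": 50, "amd64": 10}
        )
    }
}
```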

      Dependencies (internal and external)
      1. …

      Previous Work (Optional):
      1. …

      Open questions:
      1. Does the current schedulingGates-based implementation allow amending the preferredAffinity?
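
      To make the question concrete, here is a minimal sketch of what "amending" could mean: appending architecture terms to whatever preferred terms the pod already carries, without overwriting them. The function name and the dict-based pod spec are assumptions for illustration only, and the sketch shows only the data manipulation; whether such an update is accepted for a gated pod is the open question itself.

```python
import copy
from typing import List

PREFERRED_KEY = "preferredDuringSchedulingIgnoredDuringExecution"


def amend_preferred_affinity(pod_spec: dict, extra_terms: List[dict]) -> dict:
    """Return a copy of `pod_spec` with `extra_terms` appended to its
    preferred node-affinity terms. Existing terms are kept, and terms
    that are already present are not duplicated.
    """
    patched = copy.deepcopy(pod_spec)
    node_affinity = patched.setdefault("affinity", {}).setdefault("nodeAffinity", {})
    existing = node_affinity.setdefault(PREFERRED_KEY, [])
    for term in extra_terms:
        if term not in existing:
            existing.append(copy.deepcopy(term))
    return patched


arm64_term = {
    "weight": 50,
    "preference": {
        "matchExpressions": [
            {"key": "kubernetes.io/arch", "operator": "In", "values": ["arm64"]}
        ]
    },
}
# A pod that already has a user-defined preferred term (hypothetical label).
user_term = {
    "weight": 5,
    "preference": {
        "matchExpressions": [
            {"key": "example.com/disk", "operator": "In", "values": ["ssd"]}
        ]
    },
}
pod_spec = {"affinity": {"nodeAffinity": {PREFERRED_KEY: [user_term]}}}
patched = amend_preferred_affinity(pod_spec, [arm64_term])
```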

      Done Checklist

      • CI - For new features (non-enablement), existing Multi-Arch CI jobs are not broken by the Epic
      • Release Enablement: <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - If the Epic is adding a new stream, downstream build attached to advisory: <link to errata>
      • QE - Test plans in Test Plan tracking software (e.g. Polarion, RQM, etc.): <link or reference to the Test Plan>
      • QE - Automated tests merged: <link or reference to automated tests>
      • QE - QE to verify documentation when testing
      • DOC - Downstream documentation merged: <link to meaningful PR>
      • All the stories, tasks, sub-tasks and bugs that belong to this epic need to have been completed and indicated by a status of 'Done'.

            tzivkovi@redhat.com Tori Zivkovic
            rhn-support-adistefa Alessandro Di Stefano