  1. Multiple Architecture Enablement
  2. MULTIARCH-4970

Cluster-wide architecture preferred/weighted affinity

    • cluster-wide architecture preferred affinity
    • Product / Portfolio Work
    • OCPSTRAT-1888: Multi-arch Tuning Operator: Cluster-wide architecture preferred/weighted affinity
    • 0% To Do, 0% In Progress, 100% Done
      • [5 Mar] <GREEN> Green
        • Last PRs merged. Pending documentation and Technical Enablement slides.
      • [26 Feb] <YELLOW> Yellow
        • 1 PR pending. Waiting on e2e tests.
        • Aim is to have this done by end of week.
      • [12 Feb] <GREEN> Green
        • Last 2 PRs are pending.
      • [29 Jan] <GREEN> Green
        • Pending 1 PR merge, 1 dev complete, 1 still in progress.
      • [22 Jan] <GREEN> Green
    • L

      Epic Goal

      • Add a new field to the API that allows setting the preferredAffinity alongside the requiredAffinity, so that users can fine-tune how workloads that support multiple architectures are distributed in a mixed-architecture cluster.
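
For illustration, the sketch below shows what a weighted architecture preference looks like at the pod level, using the standard Kubernetes node-affinity schema and the well-known `kubernetes.io/arch` node label. The helper name `preferred_arch_terms` and the architecture-to-weight mapping are hypothetical; the epic does not specify the shape of the new API field, so this only shows the kind of terms such a field would end up producing.

```python
# Hypothetical helper: the function name and input mapping are illustrative only
# and are not taken from the operator's API.
ARCH_LABEL = "kubernetes.io/arch"  # well-known node label carrying the node architecture


def preferred_arch_terms(weights):
    """Turn {architecture: weight} into preferred node-affinity terms.

    The result has the shape expected under
    spec.affinity.nodeAffinity.preferredDuringSchedulingIgnoredDuringExecution.
    """
    terms = []
    # Highest weight first, purely for readability of the generated list.
    for arch, weight in sorted(weights.items(), key=lambda item: -item[1]):
        if not 1 <= weight <= 100:
            # Kubernetes only accepts weights in the range 1-100.
            raise ValueError(f"weight for {arch} must be in 1..100, got {weight}")
        terms.append({
            "weight": weight,
            "preference": {
                "matchExpressions": [
                    {"key": ARCH_LABEL, "operator": "In", "values": [arch]},
                ],
            },
        })
    return terms


# Example weights (made up): prefer arm64 strongly, amd64 weakly.
terms = preferred_arch_terms({"arm64": 50, "amd64": 10})
```

Because these are preferred terms, a pod can still land on a lower-weighted architecture when the preferred nodes are full; only the required affinity restricts where it may run.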

      Why is this important?

      • Users will be able to prefer scheduling workloads on some architectures over others.
      • In the x86 + arm64 case, this will support a cost-effective deployment by prioritizing arm64 worker nodes and using amd64 nodes primarily for workloads that cannot support arm64.

      Scenarios
      1. [cost-reduction with arm64] Arm64 CP + Amd64 Workers + Arm64 Workers: minimize the use of amd64 workers by using them primarily for workloads that can't run on arm64.
      2. [P AI accelerator]: reduce the load on P workers when AI workloads need to use the accelerator, and avoid wasted resources by preventing P workers from sitting unused when no AI jobs are running because other workloads are kept away by taints/tolerations or requiredAffinity.
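
A rough sketch of how scenario 1 plays out in the scheduler: nodes are first filtered by the architectures a workload supports (the required affinity), and each remaining node then receives the sum of the weights of the preferred terms it matches. The node names, weights, and helper functions below are assumptions for illustration, only the `In` operator is modelled, and the real scheduler combines this score with other scoring plugins.

```python
ARCH_LABEL = "kubernetes.io/arch"


def arch_term(arch, weight):
    """Build one preferred node-affinity term for a single architecture."""
    return {
        "weight": weight,
        "preference": {
            "matchExpressions": [
                {"key": ARCH_LABEL, "operator": "In", "values": [arch]},
            ],
        },
    }


def term_matches(node_labels, term):
    """A term matches only if all of its expressions match (only `In` is handled)."""
    return all(
        expr["operator"] == "In" and node_labels.get(expr["key"]) in expr["values"]
        for expr in term["preference"]["matchExpressions"]
    )


def affinity_score(node_labels, preferred_terms):
    """Sum the weights of every preferred term the node satisfies."""
    return sum(t["weight"] for t in preferred_terms if term_matches(node_labels, t))


def rank(nodes, supported_arches, preferred_terms):
    """Filter by supported architectures, then order by preferred-affinity score."""
    candidates = {
        name: labels
        for name, labels in nodes.items()
        if labels.get(ARCH_LABEL) in supported_arches
    }
    return sorted(
        candidates,
        key=lambda name: affinity_score(candidates[name], preferred_terms),
        reverse=True,
    )


# Made-up cluster and weights: arm64 is preferred over amd64.
nodes = {
    "worker-amd64": {ARCH_LABEL: "amd64"},
    "worker-arm64": {ARCH_LABEL: "arm64"},
}
preferred = [arch_term("arm64", 50), arch_term("amd64", 10)]

multi_arch_ranking = rank(nodes, {"arm64", "amd64"}, preferred)
amd64_only_ranking = rank(nodes, {"amd64"}, preferred)
```

In this toy cluster a workload that supports both architectures ranks the arm64 worker first, while an amd64-only workload is still placed on the amd64 worker, which matches the cost-reduction intent of keeping amd64 nodes for workloads that cannot run on arm64.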

      Acceptance Criteria

      • A new API is added to automatically set the preferredAffinity cluster-wide with weights chosen by the user
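
As a minimal sketch of what "automatically set cluster-wide" could mean for a single pod, the function below appends user-weighted architecture terms to a pod manifest represented as a plain dictionary. The function name, the sample pod, and the choice to append to existing preferred terms rather than replace them are all assumptions; the epic does not describe how the operator merges with affinities already present on a pod.

```python
import copy

PREFERRED = "preferredDuringSchedulingIgnoredDuringExecution"
REQUIRED = "requiredDuringSchedulingIgnoredDuringExecution"


def with_preferred_terms(pod, terms):
    """Return a copy of `pod` with `terms` appended to its preferred node affinity.

    Assumption: existing required and preferred terms are kept untouched.
    """
    patched = copy.deepcopy(pod)
    node_affinity = (
        patched.setdefault("spec", {})
        .setdefault("affinity", {})
        .setdefault("nodeAffinity", {})
    )
    node_affinity.setdefault(PREFERRED, []).extend(copy.deepcopy(terms))
    return patched


# Made-up cluster-wide preference: favour arm64 with weight 50.
cluster_terms = [{
    "weight": 50,
    "preference": {
        "matchExpressions": [
            {"key": "kubernetes.io/arch", "operator": "In", "values": ["arm64"]},
        ],
    },
}]

# Sample pod that already carries a required architecture affinity.
pod = {
    "metadata": {"name": "example"},
    "spec": {
        "affinity": {
            "nodeAffinity": {
                REQUIRED: {
                    "nodeSelectorTerms": [{
                        "matchExpressions": [
                            {"key": "kubernetes.io/arch", "operator": "In",
                             "values": ["amd64", "arm64"]},
                        ],
                    }],
                },
            },
        },
    },
}

patched_pod = with_preferred_terms(pod, cluster_terms)
```

The input manifest is deep-copied so the caller's object is never mutated, which keeps the sketch safe to reuse across many pods.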

      Dependencies (internal and external)
      1. …

      Previous Work (Optional):
      1. …

      Open questions:
      1. Does the current schedulingGates-based implementation allow amending the preferredAffinity?

      Done Checklist

      • CI - For new features (non-enablement), existing Multi-Arch CI jobs are not broken by the Epic
      • Release Enablement: <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - If the Epic is adding a new stream, downstream build attached to advisory: <link to errata>
      • QE - Test plans in Test Plan tracking software (e.g. Polarion, RQM, etc.): <link or reference to the Test Plan>
      • QE - Automated tests merged: <link or reference to automated tests>
      • QE - QE to verify documentation when testing
      • DOC - Downstream documentation merged: <link to meaningful PR>
      • All the stories, tasks, sub-tasks and bugs that belong to this epic need to have been completed and indicated by a status of 'Done'.

              tzivkovi@redhat.com Tori Zivkovic
              rhn-support-adistefa Alessandro Di Stefano