  1. OpenShift Container Platform (OCP) Strategy
  2. OCPSTRAT-1888

Multi-arch Tuning Operator: Cluster-wide architecture preferred/weighted affinity


    • BU Product Work
    • 3
    • False
    • False
    • 50% To Do, 50% In Progress, 0% Done
    • 0
    • Program Call

      Feature Overview (aka. Goal Summary)  

      Users will be able to fine-tune how workloads that support multiple architectures are distributed in a mixed-architecture cluster, so that workloads are preferentially scheduled on some architectures over others.

      Goals (aka. expected user outcomes)

      Add a new field to the API that allows setting the preferredAffinity along with the requiredAffinity:

      • Users will be able to prefer the allocation of workloads on specific architectures more than others.
      • In the x86 + arm64 case, this will support a cost-effective deployment by prioritizing arm64 worker nodes and using amd64 nodes primarily for workloads that cannot support arm64.

      Requirements (aka. Acceptance Criteria):

      • A new API is added to automatically set the preferredAffinity cluster-wide with weights chosen by the user
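
To make the requirement concrete, here is a minimal sketch of how user-chosen, per-architecture weights could be turned into a pod's node affinity. It is an illustration under stated assumptions, not the operator's implementation: the helper name `build_node_affinity` and its `supported_archs` / `arch_weights` parameters are hypothetical. Only the field names (`requiredDuringSchedulingIgnoredDuringExecution`, `preferredDuringSchedulingIgnoredDuringExecution`, `weight`, `preference`, `matchExpressions`) and the `kubernetes.io/arch` node label come from the standard Kubernetes node affinity API.

```python
# Hypothetical helper for illustration only; not the operator's actual API.
ARCH_LABEL = "kubernetes.io/arch"  # well-known Kubernetes node label


def build_node_affinity(supported_archs, arch_weights):
    """Return a Kubernetes nodeAffinity dict with required and preferred terms.

    supported_archs: set of architectures the workload's images support.
    arch_weights: cluster-wide mapping of architecture -> weight (1..100).
    """
    # Kubernetes only accepts preferred-term weights in the range 1..100.
    for arch, weight in arch_weights.items():
        if not 1 <= weight <= 100:
            raise ValueError(f"weight for {arch} must be in 1..100, got {weight}")

    # Hard constraint: only nodes whose architecture the workload supports.
    required = {
        "nodeSelectorTerms": [
            {
                "matchExpressions": [
                    {
                        "key": ARCH_LABEL,
                        "operator": "In",
                        "values": sorted(supported_archs),
                    }
                ]
            }
        ]
    }

    # Soft constraint: one weighted term per architecture the workload supports.
    # Skipping unsupported architectures is an assumption made for this sketch.
    preferred = [
        {
            "weight": weight,
            "preference": {
                "matchExpressions": [
                    {"key": ARCH_LABEL, "operator": "In", "values": [arch]}
                ]
            },
        }
        for arch, weight in sorted(arch_weights.items())
        if arch in supported_archs
    ]

    affinity = {"requiredDuringSchedulingIgnoredDuringExecution": required}
    if preferred:
        affinity["preferredDuringSchedulingIgnoredDuringExecution"] = preferred
    return affinity
```

For example, `build_node_affinity({"amd64", "arm64"}, {"arm64": 80, "amd64": 20})` yields a required term that allows both architectures plus two preferred terms, with arm64 carrying the higher weight. A workload that supports only amd64 would get the required term and a single amd64 preferred term.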

       

      Anyone reviewing this Feature needs to know which deployment configurations the Feature will apply to (or not) once it has been completed.  Describe specific needs (or indicate N/A) for each of the following deployment scenarios. For specific configurations that are out of scope for a given release, ensure you also provide the OCPSTRAT for the configuration to be supported in the future.

      Deployment considerations | List applicable specific needs (N/A = not applicable)
      Self-managed, managed, or both | Y
      Classic (standalone cluster) | Y
      Hosted control planes | Y
      Multi node, Compact (three node), or Single node (SNO), or all | All
      Connected / Restricted Network | Y
      Architectures, e.g. x86_64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | Initially x86 and Arm
      Operator compatibility | N/A
      Backport needed (list applicable versions) | N
      UI need (e.g. OpenShift Console, dynamic plugin, OCM) | N
      Other (please specify) | N/A

      Use Cases (Optional):

      1. [cost-reduction with arm64] Arm64 CP + Amd64 Workers + Arm64 Workers: minimize the use of amd64 workers by using them primarily for workloads that can't run on arm64.
      2. [P AI accelerator]: reduce the load on P workers when AI workloads need to use the accelerator, while avoiding wasted resources: with taints/tolerations or requiredAffinity, P workers would sit unused whenever no AI jobs are running, because other workloads are kept off them.
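
The use cases above rely on the difference between a hard constraint and a weighted preference. The following is a deliberately simplified model of that behavior, not the real scheduler: the function `pick_node` and the node names are hypothetical, and the actual kube-scheduler combines node affinity with other scoring plugins (resource availability, spreading, and so on), so a preference is never a guarantee.

```python
# Simplified, hypothetical model of required vs. preferred architecture affinity.
def pick_node(nodes, supported_archs, arch_weights):
    """Pick a node name for a workload, or None if nothing is schedulable.

    nodes: mapping of node name -> architecture (e.g. {"worker-a": "amd64"}).
    supported_archs: architectures the workload can run on (hard constraint).
    arch_weights: mapping of architecture -> preference weight (soft constraint).
    """
    # Filter step: drop nodes whose architecture the workload cannot run on.
    candidates = sorted(
        name for name, arch in nodes.items() if arch in supported_archs
    )
    if not candidates:
        return None
    # Score step: the highest weight wins; ties go to the first name in sort order.
    return max(candidates, key=lambda name: arch_weights.get(nodes[name], 0))


nodes = {"worker-a": "amd64", "worker-b": "arm64"}
weights = {"arm64": 80, "amd64": 20}

# A multi-arch workload lands on the preferred arm64 node.
multi_arch_choice = pick_node(nodes, {"amd64", "arm64"}, weights)
# An amd64-only workload still schedules, on the amd64 node.
amd64_only_choice = pick_node(nodes, {"amd64"}, weights)
# If the arm64 node is gone, the multi-arch workload falls back to amd64.
fallback_choice = pick_node({"worker-a": "amd64"}, {"amd64", "arm64"}, weights)
```

In this model the multi-arch workload is placed on `worker-b`, the amd64-only workload on `worker-a`, and the fallback case on `worker-a`. The fallback is the property the second use case depends on: preferred nodes are favored, but the other nodes remain usable instead of being fenced off.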

      Questions to Answer (Optional):

      Include a list of refinement / architectural questions that may need to be answered before coding can begin.  Initial completion during Refinement status.

      <your text here>

      Out of Scope

      High-level list of items that are out of scope.  Initial completion during Refinement status.

      <your text here>

      Background

      Provide any additional context needed to frame the feature.  Initial completion during Refinement status.

      <your text here>

      Customer Considerations

      Provide any additional customer-specific considerations that must be made when designing and delivering the Feature.  Initial completion during Refinement status.

      <your text here>

      Documentation Considerations

      Provide information that needs to be considered and planned so that documentation will meet customer needs.  If the feature extends existing functionality, provide a link to its current documentation. Initial completion during Refinement status.

      <your text here>

      Interoperability Considerations

      Which other projects, including ROSA/OSD/ARO, and versions in our portfolio does this feature impact?  What interoperability test scenarios should be factored by the layered products?  Initial completion during Refinement status.

      <your text here>

              rhn-support-dhardie (Duncan Hardie)
              Srikanth R
              Prashanth Sundararaman
              Alessandro Di Stefano
              Duncan Hardie
              Jon Thomas