Ansible Automation Platform RFEs / AAPRFE-2616

[RFE] Job Prioritization and Organization-Level Concurrency Limits for Shared Container Groups


    • Type: Feature Request
    • Resolution: Unresolved
    • 2.5, 2.6
    • controller

      1. What is the nature and description of the request?
      The customer is requesting an enhancement to the AAP job scheduler that allows more granular control over job concurrency and prioritization than the current First-In-First-Out (FIFO) model provides.

      Currently, to limit concurrency per Organization, Engineering recommends (AAPRFE-423) creating a dedicated Instance/Container Group for each Organization. However, the customer requires a mechanism to control concurrency and prioritize workloads (e.g., by Organization or Job Template) within a shared, global Container Group or Instance Group. They need a sequencing model based on job attributes, rather than a strict FIFO queue, to manage shared execution resources effectively.

      2. Why does the customer need this? (List the business requirements here)

      • Administrative Efficiency: The current workaround of creating separate Container Groups for every Organization to manage concurrency is viewed as "extremely complex to control and manage." It forces the customer to slice up global resources rigidly, making it difficult to ensure overall concurrency does not exceed available worker node resources while still serving all Organizations.
      • Business Prioritization: The customer lacks an effective way to "throttle" workloads based on business value. They need to ensure that high-priority Organizations or critical Job Templates take precedence over lower-priority tasks within the shared resource pool, rather than being blocked by a backlog of low-priority jobs in a FIFO queue.

      3. How would you like to achieve this? (List the functional requirements here)

      • Organization-Level Throttling in Shared Groups: The ability to set a default "Max Concurrency" for Organizations that applies to their usage of shared Instance/Container groups, with the ability to override this per Organization.
      • Attribute-Based Prioritization: Implementation of a queuing logic that allows jobs to be prioritized based on attributes (e.g., Organization Priority, Job Template Priority) rather than just submission time.
      • Global Resource Awareness: A mechanism that allows the "Global" container group to intelligently throttle incoming jobs based on the defined priorities of the source Organization, ensuring high-priority workloads are executed first when resources become available.
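
      The three requirements above can be illustrated with a minimal sketch. It is not AAP code and does not reflect how the controller's task manager is implemented; the class `SharedGroupScheduler`, its parameters (`group_capacity`, `default_max_concurrency`, `org_overrides`), and the organization and job names are all hypothetical. It assumes each job carries a single integer priority and that ties are broken by submission order.

```python
import heapq
from collections import Counter
from itertools import count


class SharedGroupScheduler:
    """Hypothetical priority queue for one shared group (illustration only)."""

    def __init__(self, group_capacity, default_max_concurrency, org_overrides=None):
        self.group_capacity = group_capacity                    # total concurrent jobs in the shared group
        self.default_max_concurrency = default_max_concurrency  # default per-Organization cap
        self.org_overrides = dict(org_overrides or {})          # per-Organization cap overrides
        self._queue = []        # heap of (-priority, submission_seq, job_name, org)
        self._seq = count()     # preserves FIFO order among equal priorities
        self.running = Counter()  # running job count per Organization

    def org_limit(self, org):
        # Use the override when one is set, otherwise the default cap.
        return self.org_overrides.get(org, self.default_max_concurrency)

    def submit(self, job_name, org, priority=0):
        # Higher priority sorts first, so the priority is negated for the min-heap.
        heapq.heappush(self._queue, (-priority, next(self._seq), job_name, org))

    def dispatch(self):
        """Start as many pending jobs as group capacity and per-org caps allow."""
        started, deferred = [], []
        free = self.group_capacity - sum(self.running.values())
        while self._queue and free > 0:
            entry = heapq.heappop(self._queue)
            _, _, job_name, org = entry
            if self.running[org] < self.org_limit(org):
                self.running[org] += 1
                free -= 1
                started.append(job_name)
            else:
                deferred.append(entry)  # org is at its cap; keep the job queued
        for entry in deferred:
            heapq.heappush(self._queue, entry)
        return started

    def finish(self, org):
        # Release one slot for the Organization when a job completes.
        if self.running[org] > 0:
            self.running[org] -= 1


scheduler = SharedGroupScheduler(
    group_capacity=3,
    default_max_concurrency=1,
    org_overrides={"critical-org": 2},
)
scheduler.submit("batch-1", "batch-org", priority=1)
scheduler.submit("batch-2", "batch-org", priority=1)
scheduler.submit("patch-1", "critical-org", priority=10)
scheduler.submit("patch-2", "critical-org", priority=10)
started = scheduler.dispatch()
print(started)  # ['patch-1', 'patch-2', 'batch-1']
```

      In this sketch the two high-priority jobs start first even though they were submitted last, and `batch-2` stays queued until `batch-org` drops below its cap of 1, even if the shared group has free capacity.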

      4. List any affected known dependencies: Doc, UI, etc.

      • UI: Updates to Organization and Job Template views to allow the setting of Priority levels or Concurrency limits.
      • Scheduler: Updates to the task manager/scheduler logic to move from FIFO to Priority-Weighted queuing.
      • Docs: Documentation on how priority weights interact with available capacity.

      5. GitHub link, if any
      N/A

              Unassigned
              rh-ee-mtipton Michael Tipton