AI Platform Core Components / AIPCC-7197

Active-Active HA GitLab Runner Deployment in OCP

    • Type: Epic
    • Resolution: Unresolved
    • Priority: Normal
    • AIPCC Productization
    • To Do
    • 71% To Do, 0% In Progress, 29% Done

      We currently use the aipcc GitLab runner for general-purpose CI jobs such as linting, formatting, and small validation steps. Each of these lightweight jobs triggers the provisioning of a dedicated t3.medium EC2 instance. This results in unnecessary resource consumption, slower startup times, and increased operational cost. Spinning up a full VM per job is excessive for workloads that typically finish in a short time.

      To improve efficiency and resiliency, we need to deploy a highly available GitLab Runner solution on Kubernetes (or alternatively, a single EC2 instance capable of running multiple jobs concurrently). This will allow multiple small CI tasks to execute without provisioning new infrastructure for each run, ensuring faster execution and significantly reduced cloud costs.

      The new runner architecture must support:

      • High availability and automatic failover (active-active) across clusters.
      • Shared registration under the same GitLab group.
      • Secure token management via HashiCorp Vault.

      The goal is to deliver a self-managed, cost-efficient, and reliable CI execution environment while improving pipeline performance and reducing AWS expenses.
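
      As a rough illustration of the requirements above, the sketch below renders a minimal GitLab Runner config.toml in which one runner manager handles several jobs concurrently on the Kubernetes executor. The same function is called once per cluster, so both runners use the same settings and the same GitLab URL. Everything specific here is an assumption and not taken from this issue: the function name, the runner names, the URL, the namespace, the image, the concurrency value, and the idea that the token retrieved from Vault reaches the process through a RUNNER_TOKEN environment variable.

```python
import os


def render_runner_config(name, gitlab_url, token, concurrent=10,
                         namespace="gitlab-runner", image="alpine:latest"):
    """Render a minimal config.toml for a runner using the Kubernetes executor.

    All default values are illustrative placeholders, not project settings.
    """
    if not token:
        # Refuse to render a config without a token (hypothetical policy).
        raise ValueError("runner token is empty; expected it from the secret store")
    lines = [
        # Upper bound on jobs this runner manager executes at the same time.
        f"concurrent = {concurrent}",
        "check_interval = 3",
        "",
        "[[runners]]",
        f'  name = "{name}"',
        f'  url = "{gitlab_url}"',
        f'  token = "{token}"',
        '  executor = "kubernetes"',
        "  [runners.kubernetes]",
        f'    namespace = "{namespace}"',
        f'    image = "{image}"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumption: the token fetched from Vault is exposed as RUNNER_TOKEN.
    token = os.environ.get("RUNNER_TOKEN", "placeholder-token")
    # One runner manager per cluster, both pointing at the same GitLab instance.
    for cluster in ("cluster-a", "cluster-b"):
        print(render_runner_config(f"aipcc-runner-{cluster}",
                                   "https://gitlab.example.com", token))
```

      With one such runner active in each cluster, jobs are picked up by whichever runner requests them, so losing one cluster should leave the other still serving jobs. Whether both clusters share one runner token or register separately under the group is left open here.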

              rhit_jmorenas Jose Angel Morena
              Klara's Team