  1. AI Platform Core Components
  2. AIPCC-11399

Add HA replicas for API and UI before resiliency testing


    • Type: Task
    • Resolution: Unresolved
    • Priority: Undefined
    • AIPCC Productization

      https://source.redhat.com/departments/it/datacenter_infrastructure/itcloudservices/itocp/it_paas_blog/itupiad2_resilience_quarterly_test_cy26_q1_will_take_place_on_march_9th_march_13th_2026

      Resiliency testing on prod-stable-spoke1-dc-iad2 runs March 9-13, including an AZ loss test on March 13. Our pods will be directly impacted. Scale the dashboard-api and dashboard-ui deployments to 2 replicas and add pod anti-affinity rules so that the replicas are scheduled on different nodes/AZs. The goal is for at least one replica of each deployment to survive an AZ shutdown.
      Changes needed:
          - Set replicas: 2 in both API and UI deployment manifests
          - Add podAntiAffinity with preferredDuringSchedulingIgnoredDuringExecution using topology.kubernetes.io/zone as the topology key
          - Verify resource quotas allow 2 replicas per deployment
          - Test that both replicas start and receive traffic
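
The first two changes above could look roughly like the following Deployment fragment. This is a minimal sketch, not the actual manifest: the `app: dashboard-api` label and the `weight: 100` value are assumptions chosen for illustration, and the container spec is omitted. The same shape would apply to dashboard-ui with its own label.

```yaml
# Sketch of a Deployment fragment for dashboard-api (container spec omitted).
# ASSUMPTION: the pods carry the label "app: dashboard-api"; use the real
# selector labels from the existing manifest instead.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dashboard-api
spec:
  replicas: 2                      # was a single replica before this change
  selector:
    matchLabels:
      app: dashboard-api
  template:
    metadata:
      labels:
        app: dashboard-api
    spec:
      affinity:
        podAntiAffinity:
          # Soft rule: the scheduler prefers, but does not require,
          # placing the two replicas in different zones.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100          # ASSUMPTION: highest preference (valid range 1-100)
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: dashboard-api
                topologyKey: topology.kubernetes.io/zone
```

Because the rule is `preferred` rather than `required`, the scheduler may still place both replicas in the same zone if no other zone has capacity, so the actual placement should be checked after rollout.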

      Acceptance criteria:
          - API and UI each running with 2 replicas in different AZs
          - Service continues to respond when one replica is terminated
          - Deployed before March 9

              Assignee: Unassigned
              Reporter: Jose Angel Morena (rhit_jmorenas)
              Klara's Team