-
Task
-
Resolution: Unresolved
Resiliency testing on prod-stable-spoke1-dc-iad2 runs March 9-13 and includes an AZ loss test on March 13. Our pods will be directly impacted. Scale the dashboard-api and dashboard-ui deployments to 2 replicas each and add pod anti-affinity rules so that the replicas are scheduled on different nodes/AZs. With the replicas in separate AZs, at least one survives an AZ shutdown.
Changes needed:
- Set replicas: 2 in both API and UI deployment manifests
- Add podAntiAffinity with preferredDuringSchedulingIgnoredDuringExecution using topology.kubernetes.io/zone as the topology key
- Verify resource quotas allow 2 replicas per deployment
- Test that both replicas start and receive traffic
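The manifest changes above could look roughly like the fragment below. This is a sketch, not the actual manifest: it shows only the fields to merge into the existing dashboard-api Deployment, and the `app: dashboard-api` label is an assumption that should be replaced with whatever labels the pod template already carries. The weight of 100 is also an illustrative choice.

```yaml
# Sketch only: fields to merge into the existing dashboard-api Deployment.
# The rest of the manifest (selector, containers, etc.) stays unchanged.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dashboard-api
spec:
  replicas: 2
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Soft rule: the scheduler prefers, but does not require,
          # placing matching pods in different zones.
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100  # illustrative; valid range is 1-100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: dashboard-api  # assumed label; match the pod template's labels
                topologyKey: topology.kubernetes.io/zone
```

The same fields would apply to dashboard-ui with its own label. Because preferredDuringSchedulingIgnoredDuringExecution is a preference rather than a hard requirement, the scheduler may still place both replicas in one zone if capacity is short elsewhere, so the actual placement should be checked after rollout.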
Acceptance criteria:
- API and UI each running with 2 replicas in different AZs
- Service continues to respond when one replica is terminated
- Deployed before March 9