Type: Task
Resolution: Unresolved
Priority: Major
MongoDB HA Migration – Replace Single Pod with ReplicaSet
Context
MongoDB currently runs as a single Deployment pod, which creates a single point of failure.
During the upcoming ITUPIAD2 resilience test (March 9–13, 2026), an AZ loss scenario will be simulated. If MongoDB remains single-instance, the application will lose its database during the test.
To ensure service continuity, MongoDB must be migrated to a 3-member ReplicaSet deployed via StatefulSet, allowing automatic failover if a zone becomes unavailable.
Goal
Replace the existing single MongoDB Deployment with a 3-member ReplicaSet StatefulSet that:
- survives AZ loss
- provides automatic primary election
- maintains data persistence
- is reachable through a ReplicaSet connection string
Implementation Plan
1. Headless Service
Create a headless service to provide stable DNS identities for each replica.
Example pod DNS names:
{{mongodb-0.mongodb-svc
mongodb-1.mongodb-svc
mongodb-2.mongodb-svc}}
This allows MongoDB members to reliably discover each other.
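A minimal sketch of such a headless service (the {{app: mongodb}} selector label is an assumption; match it to the actual pod labels):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: mongodb-svc
spec:
  clusterIP: None          # headless: DNS resolves to individual pod IPs
  selector:
    app: mongodb           # assumed pod label
  ports:
    - name: mongodb
      port: 27017
      targetPort: 27017
```

Because {{clusterIP}} is {{None}}, each pod gets its own stable DNS record under the service domain instead of a load-balanced virtual IP.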
2. StatefulSet
Replace the current Deployment with a StatefulSet.
Key characteristics:
- Replicas: 3
- Stable network identity
- Persistent storage per member
- Ordered startup
Each pod will get its own persistent volume via volumeClaimTemplates.
Example identity:
{{mongodb-0
mongodb-1
mongodb-2}}
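A sketch of the StatefulSet described above; the image tag, storage size, and labels are assumptions to be adjusted:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongodb
spec:
  serviceName: mongodb-svc       # must match the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: mongodb               # assumed pod label
  template:
    metadata:
      labels:
        app: mongodb
    spec:
      containers:
        - name: mongodb
          image: mongo:7.0       # assumed image/version
          args: ["--replSet", "rs0", "--bind_ip_all"]
          ports:
            - containerPort: 27017
          volumeMounts:
            - name: mongodb-data
              mountPath: /data/db
  volumeClaimTemplates:          # yields mongodb-data-mongodb-0, -1, -2
    - metadata:
        name: mongodb-data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi        # assumed size
```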
3. ReplicaSet Initialization
A ConfigMap will contain a bootstrap script executed during first startup.
The script will:
- Detect if the replica set already exists.
- If not, run:
{{rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongodb-0.mongodb-svc:27017" },
    { _id: 1, host: "mongodb-1.mongodb-svc:27017" },
    { _id: 2, host: "mongodb-2.mongodb-svc:27017" }
  ]
})}}
Initialization should run only once.
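One way the ConfigMap and its idempotency check might be sketched; the ConfigMap name, the mongosh-based probing, and the choice to initiate only from {{mongodb-0}} are assumptions, not a finished implementation:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mongodb-init             # assumed name
data:
  init-replicaset.sh: |
    #!/bin/bash
    # Only mongodb-0 attempts initiation; the other members simply join.
    [ "$(hostname)" = "mongodb-0" ] || exit 0
    # Wait until the local mongod answers.
    until mongosh --quiet --eval "db.adminCommand('ping').ok" >/dev/null 2>&1; do
      sleep 2
    done
    # rs.status() fails until the replica set is initiated,
    # so rs.initiate() effectively runs only once.
    if ! mongosh --quiet --eval "rs.status()" >/dev/null 2>&1; then
      mongosh --quiet --eval 'rs.initiate({
        _id: "rs0",
        members: [
          { _id: 0, host: "mongodb-0.mongodb-svc:27017" },
          { _id: 1, host: "mongodb-1.mongodb-svc:27017" },
          { _id: 2, host: "mongodb-2.mongodb-svc:27017" }
        ]
      })'
    fi
```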
4. Pod Anti-Affinity
To ensure resilience during AZ failures:
{{requiredDuringSchedulingIgnoredDuringExecution}} on {{topology.kubernetes.io/zone}}
This forces Kubernetes to place each MongoDB member in a different availability zone.
If the cluster cannot guarantee 3 zones, fall back to:
preferredDuringSchedulingIgnoredDuringExecution
with a secondary topology on kubernetes.io/hostname.
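Both variants as pod-spec fragments for the StatefulSet template (the {{app: mongodb}} label is an assumption):

```yaml
# Strict variant: one member per zone; pods stay Pending if <3 zones.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: mongodb                  # assumed pod label
        topologyKey: topology.kubernetes.io/zone

# Fallback variant: prefer zone spreading, then host spreading.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: mongodb
          topologyKey: topology.kubernetes.io/zone
      - weight: 50
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: mongodb
          topologyKey: kubernetes.io/hostname
```

Only one of the two {{affinity:}} stanzas would appear in the final manifest.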
5. Application Connection Update
Update the MongoDB connection string across all components:
- API
- collector
- agent
Current (standalone):
mongodb://user:pass@mongodb:27017
New ReplicaSet URI:
mongodb://user:pass@mongodb-0.mongodb-svc:27017,mongodb-1.mongodb-svc:27017,mongodb-2.mongodb-svc:27017/?replicaSet=rs0
ReplicaSet URIs allow drivers to automatically:
- detect the primary
- reconnect after failover
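As a Deployment fragment, the change might look as follows for each of the three components; credentials are shown inline only to mirror the URI above, and a Secret reference would be preferable in the real manifests:

```yaml
# Fragment of a container spec; applies to the API, collector, and agent Deployments.
env:
  - name: MONGODB_URI
    value: "mongodb://user:pass@mongodb-0.mongodb-svc:27017,mongodb-1.mongodb-svc:27017,mongodb-2.mongodb-svc:27017/?replicaSet=rs0"
```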
6. Data Migration
Existing data from the single MongoDB instance must be migrated.
Proposed method:
1. Deploy the new ReplicaSet alongside the existing standalone
2. Point collectors at the new ReplicaSet, let them run a few cycles to populate it
3. Verify the new ReplicaSet contains the expected data (e.g. record counts and recent timestamps match what the collectors should have produced)
4. Switch API and agent to the new connection string
5. Decommission the old standalone pod
Required Manifests
The following Kubernetes manifests are required:
- Headless Service
  - mongodb-svc
- ConfigMap
  - ReplicaSet initialization script
- StatefulSet
  - 3 replicas
  - PVC templates
  - anti-affinity rules
  - readiness/liveness probes
- Application Updates
  - API deployment
  - collector deployment
  - agent deployment
  - update MONGODB_URI
Risks
AZ capacity
If the cluster cannot schedule pods across 3 zones, the strict anti-affinity rule may prevent scheduling.
Mitigation:
- switch to preferredDuringSchedulingIgnoredDuringExecution
Resource quotas
Namespace quotas must allow:
- 3 MongoDB pods
- 3 persistent volumes
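A hedged sketch of what the namespace quota must at minimum permit (the quota name and the 30Gi figure, derived from an assumed 10Gi per member, are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: mongodb-quota              # hypothetical name
spec:
  hard:
    pods: "10"                     # must leave headroom for 3 MongoDB pods
    persistentvolumeclaims: "3"    # one PVC per ReplicaSet member
    requests.storage: 30Gi         # 3 × assumed 10Gi per member
```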
Storage class compatibility
The storage class must support:
- dynamic PVC provisioning
- StatefulSet volumeClaimTemplates
Acceptance Criteria
The migration will be considered successful when:
- MongoDB runs as a 3-member ReplicaSet
- Each member runs in a separate availability zone
- Applications connect via ReplicaSet URI
- Automatic failover works: deleting the primary elects a new one within ~10–12 seconds
- Data persists across pod restarts
- Existing data successfully migrated
- Deployment completed before the ITUPIAD2 test begins on March 9, 2026