-
Feature
-
Resolution: Done
-
Critical
-
None
-
Strategic Product Work
-
False
-
-
False
-
OCPSTRAT-1408 Comprehensive Automation of HCP Backup and Restore Processes for Enhanced Scalability and Reliability
-
17% To Do, 0% In Progress, 83% Done
-
M
-
7
-
0
-
Program Call
Feature Overview (Goal Summary)
This feature introduces automatic etcd snapshots for self-managed hosted control planes, giving users more control and flexibility. Unlike managed hosted control planes, self-managed environments allow customized configurations. The feature lets users store etcd snapshots in any S3-compatible storage, supporting high availability and resilience for their OpenShift clusters.
Goals (Expected User Outcomes)
- Primary User Persona: Cluster Service Providers
- User Benefit: Enhanced data protection and quicker disaster recovery for Hosted Clusters through automated etcd snapshots.
Requirements (Acceptance Criteria)
- Automatic Snapshot Creation: etcd snapshots must be taken automatically at regular intervals.
- S3 Storage: Support for any S3-compatible storage for snapshot storage.
- Snapshot Rotation and Retention Policy: Snapshots are rotated/removed after a specified period to manage storage efficiently.
- Restoration SOP: Standard Operating Procedures for etcd restoration should be established, targeting a recovery time objective (RTO) of at most approximately 1 hour. Restoration should preferably be automated as well.
- Metrics: Track Mean Time to Recovery (MTTR) for improved reliability. Open question: do we have metrics for this today?
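To make the rotation/retention and MTTR requirements above more concrete, here is a minimal sketch of how they might be expressed. It is illustrative only and not the HyperShift implementation: the `Snapshot` type, the function names, the `keep_min` safeguard, and the 7-day retention used in the usage note are all assumptions, since this ticket specifies neither a retention period nor a metric definition.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Snapshot:
    key: str            # hypothetical object key in the S3-compatible bucket
    taken_at: datetime  # when the etcd snapshot was taken (UTC)


def expired_snapshots(snapshots, retention, now, keep_min=1):
    """Return snapshots older than `retention`.

    The newest `keep_min` snapshots are never returned, so a stalled
    backup job cannot cause the last usable snapshot to be rotated out
    (an assumed safeguard, not a stated requirement).
    """
    newest_first = sorted(snapshots, key=lambda s: s.taken_at, reverse=True)
    cutoff = now - retention
    return [s for s in newest_first[keep_min:] if s.taken_at < cutoff]


def mean_time_to_recovery(incidents):
    """Average recovery duration over (failed_at, restored_at) pairs.

    Returns None when there are no incidents to average.
    """
    if not incidents:
        return None
    total = sum((restored - failed for failed, restored in incidents), timedelta())
    return total / len(incidents)
```

For example, with an assumed 7-day retention, `expired_snapshots(snaps, timedelta(days=7), datetime.now(timezone.utc))` would list the objects a rotation job could delete, and comparing `mean_time_to_recovery(...)` against `timedelta(hours=1)` would show whether observed restores meet the RTO target.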
- depends on
-
OCPBUGS-35385 After recovering from etcd backup, ovnkube-node pod which located in the lost control plane host in CrashLoopBackOff state
- Closed
- is blocked by
-
HOSTEDCP-1868 CAPI Cluster object should be paused when HostedCluster.Spec.PausedUntil is set
- Closed
- is cloned by
-
OCPSTRAT-1720 Cross Management Clusters Backup/restore for Hosted Clusters for Self-Managed HCP
- In Progress
-
OCPSTRAT-1409 Auto backup/restore for Hosted Clusters for Self-Managed HCP Part II
- In Progress
- relates to
-
OCPSTRAT-1393 Allow recreation of an AgentCluster without reprovisioning worker nodes for backup and restore
- Closed