  1. OpenShift Container Platform (OCP) Strategy
  2. OCPSTRAT-1408

Comprehensive Automation of HCP Backup and Restore Processes for Enhanced Scalability and Reliability


    • 40% To Do, 0% In Progress, 60% Done

      Outcome Overview

      Ensure comprehensive and seamless backup and restoration capabilities for Hosted Clusters in Self-Managed HCP, facilitating scalable deployment and minimizing the impact on workloads during recovery operations.

       

      Once all designated Features and/or Initiatives under this Outcome are successfully implemented, we will have added value for our customers by giving them a solid end-to-end story for business continuity and disaster recovery. This initiative is expected to boost customer confidence in employing HCP at scale by simplifying the backup and restoration process, thereby directly supporting Red Hat's strategic goal of enhanced customer satisfaction and product reliability.

      Success Criteria

      For this Outcome to be deemed successful, the following must be true:

      1. Hosted Clusters must be capable of automated backup and restoration without manual intervention.
      2. Recreating an AgentCluster should be feasible without the need to reprovision worker nodes.
      3. Seamless automation of backup and restore functionality must be established across different clusters and regions through ACM (which would also make a great upsell).

      Expected Results (what, how, when)

      • Unblocking critical sales of OCP: Fewer customer concerns about workload impact during backup and restore processes, potentially removing sales barriers for large-scale multi-cluster deployments (which benefit from HCP's efficiency). Customer and field surveys can be used to gather feedback 60 days post-implementation.

      • Product Metrics: Improved automation in backup and restore processes should lead to increased adoption rates and reduced downtime during critical operations. Adoption rate metrics and downtime incident reports can be reviewed 90 days after completion to assess the success of this outcome.

              Unassigned Unassigned
              azaalouk Adel Zaalouk
              Antoni Segura Puimedon, Crystal Chun, David Vossel, Fabian Deutsch, Joshua Packer, Juan Manuel Parrilla Madrid, Nick Carboni, Roke Jung