OpenShift Etcd / ETCD-490

No-config automated backups of etcd

    • Type: Epic
    • Resolution: Unresolved
    • Priority: Major
    • Status: To Do

      Epic Goal*

      Automated backups of etcd should have a default or no-config option that saves etcd backups durably without requiring the admin to provision or configure PersistentVolume storage. 

       
      Why is this important? (mandatory)

      With https://issues.redhat.com/browse/ETCD-81, the tech-preview API to configure automated backups of etcd was introduced. This API currently requires the user to provision local or remote persistent volume storage before the etcd-operator can be configured to store periodic backups there. The user-provided storage is specified as a PersistentVolumeClaim (PVC) in the config.openshift.io/v1alpha1 Backups API object.

      To support a no-opinion/no-config option where the user does not provide their own PVC, the operator would have to save the backups locally on the cluster and distribute or replicate the backups across all control-plane nodes to reduce the risk of losing backups in a disaster recovery scenario when one or more nodes are inaccessible.
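
      To illustrate why replication matters, the following is a minimal sketch, not the operator's implementation. It assumes a hypothetical inventory that maps each control-plane node to the backup names stored on it, and that backup names sort chronologically. The node names and backup names are placeholders.

```python
from typing import Dict, Iterable, Optional, Set


def latest_recoverable_backup(
    inventory: Dict[str, Set[str]],
    reachable_nodes: Iterable[str],
) -> Optional[str]:
    """Return the newest backup held by at least one reachable node.

    Assumption: backup names embed a sortable timestamp, so the
    lexicographic maximum is also the most recent backup.
    """
    available: Set[str] = set()
    for node in reachable_nodes:
        # Nodes missing from the inventory simply contribute no backups.
        available |= inventory.get(node, set())
    return max(available) if available else None


# Hypothetical backup names and node names, for illustration only.
BACKUPS = {"backup-2024-01-01T00", "backup-2024-01-02T00"}

# Every control-plane node holds a copy of each backup.
replicated = {
    "master-0": set(BACKUPS),
    "master-1": set(BACKUPS),
    "master-2": set(BACKUPS),
}

# Only one node holds the backups.
single_copy = {
    "master-0": set(BACKUPS),
    "master-1": set(),
    "master-2": set(),
}
```

      With the replicated inventory, a backup is still recoverable when only master-2 is reachable. With the single-copy inventory, losing master-0 loses every backup.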

      Having a no-config option reduces the friction in enabling automated backups and gets us closer to having automated backups on by default, which improves disaster recovery outcomes for OpenShift clusters.

      Depending on how the no-config option is implemented (local or hostPath type PVs), we would have to consider how the operator provisions storage and retains the backups across all control-plane nodes, so that we have a sane retention policy that doesn't exhaust disk space.
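
      One possible shape for such a retention policy is sketched below. It is an assumption for illustration, not a decided design: it keeps the newest backups on a node as long as they fit both a maximum count and a maximum total size, and marks everything older for deletion. The BackupFile type and the limit parameters are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BackupFile:
    name: str
    created_at: int  # Unix timestamp in seconds
    size_bytes: int


def backups_to_prune(
    backups: List[BackupFile],
    max_count: int,
    max_total_bytes: int,
) -> List[BackupFile]:
    """Return the backups to delete so the newest ones fit both limits.

    Backups are kept newest-first. As soon as one backup does not fit,
    it and every older backup are pruned, so the kept set is always a
    contiguous run of the most recent backups.
    Note: if the newest backup alone exceeds max_total_bytes, this
    sketch prunes everything; a real policy may want to keep one.
    """
    newest_first = sorted(backups, key=lambda b: b.created_at, reverse=True)
    kept_count = 0
    kept_bytes = 0
    prune: List[BackupFile] = []
    for backup in newest_first:
        fits = (
            kept_count < max_count
            and kept_bytes + backup.size_bytes <= max_total_bytes
        )
        if fits and not prune:
            kept_count += 1
            kept_bytes += backup.size_bytes
        else:
            prune.append(backup)
    return prune
```

      Bounding total size as well as count guards against a few unusually large snapshots filling the node's disk even when the count limit is respected.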

      See background: https://github.com/openshift/api/pull/1482#discussion_r1261840823

       
      Scenarios (mandatory) 

      Provide details for user scenarios including actions to be performed, platform specifications, and user personas.  

      1. The admin can create a Backup CR without specifying storage or a schedule and expect to have backups saved locally on the control-plane nodes' filesystem.
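
      The scenario above could be handled by a defaulting step along the following lines. This is a sketch under stated assumptions: the spec is modeled as a plain dictionary, the `schedule` and `pvcName` keys are assumed field names, and the default schedule, the local directory, and the `storage` and `localDir` keys are hypothetical placeholders rather than values chosen by the etcd team.

```python
from typing import Any, Dict

# Hypothetical defaults, for illustration only.
DEFAULT_SCHEDULE = "0 */6 * * *"      # cron expression: every six hours
DEFAULT_LOCAL_DIR = "/var/backup/etcd"  # placeholder path on each node


def resolve_backup_spec(etcd_spec: Dict[str, Any]) -> Dict[str, Any]:
    """Fill in storage and schedule when the admin leaves them unset."""
    resolved = dict(etcd_spec)  # copy so the caller's spec is not mutated
    if not resolved.get("schedule"):
        resolved["schedule"] = DEFAULT_SCHEDULE
    if resolved.get("pvcName"):
        # The admin supplied a PVC: behave as the existing API does.
        resolved["storage"] = "pvc"
    else:
        # No-config path: fall back to node-local storage.
        resolved["storage"] = "local"
        resolved["localDir"] = DEFAULT_LOCAL_DIR
    return resolved
```

      An empty spec resolves to local storage with the default schedule, while a spec that names a PVC keeps its existing behavior.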

       
      Dependencies (internal and external) (mandatory)

      The etcd team is responsible for updating the APIs and cluster-etcd-operator to support this feature.

      Contributing Teams (and contacts) (mandatory)

      Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.

      • Development - etcd-team
      • Documentation - etcd docs team (Laura Hinson)
      • QE - etcd qe (Sandeep Kundu)
      • PX - 
      • Others -

      Acceptance Criteria (optional)

      Create a config.openshift.io/v1alpha1 Backups CR and confirm that etcd backups are saved periodically to the local storage of control-plane nodes. 
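
      A test for this criterion needs some definition of "periodically". One hedged way to express it is sketched below, assuming the test can collect backup creation times from a node as Unix timestamps. The function name, the interval, and the tolerance are all hypothetical.

```python
from typing import List


def backups_are_periodic(
    timestamps: List[int],
    interval_s: int,
    tolerance_s: int,
) -> bool:
    """Check that consecutive backups are never further apart than allowed.

    Returns False when fewer than two backups exist, because a single
    backup cannot demonstrate that the schedule is repeating.
    """
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return False
    return all(
        later - earlier <= interval_s + tolerance_s
        for earlier, later in zip(ordered, ordered[1:])
    )
```

      The tolerance absorbs scheduling jitter and the time taken to write the snapshot, so the check does not fail on small delays.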

      Drawbacks or Risk (optional)

      Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.

      Done - Checklist (mandatory)

      The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.

      • CI Testing - Basic e2e automation tests are merged and completing successfully
      • Documentation - Content development is complete.
      • QE - Test scenarios are written and executed successfully.
      • Technical Enablement - Slides are complete (if requested by PLM)
      • Engineering Stories Merged
      • All associated work items with the Epic are closed
      • Epic status should be “Release Pending” 

              Assignee: Unassigned
              Reporter: Haseeb Tariq (rhn-coreos-htariq)