  1. OpenShift Container Platform (OCP) Strategy
  2. OCPSTRAT-529

Improve disaster recovery test coverage for etcd

Details

    • False
    • OCPSTRAT-16 OpenShift - Kubernetes and Core Platform
    • 100
    • 100%
    • 0
    • 0
    • Program Call

    Description

      Goal

      Note: This is an internal improvement. There are no user-facing deliverables.

      There are a few areas to cover for Disaster Recovery (DR):

      • Finish rewriting the existing DR Bash scripts in Go.
      • Add guardrails to the code so that the customer cannot cause additional damage to the cluster during disaster recovery.
      • Clean up technical debt in the MCO repo and the installer.

      Why is this important?

      When a cluster has an event that results in, for example, quorum loss, it is a very stressful situation for the admin. If we can provide a clean solution to this event with well-thought-out tools, recovery becomes less error-prone and the admin will be pleased.

      It also helps us avoid customer situations like this one:
      https://docs.google.com/document/d/1ULGQARWdxjujWpSyncY0pKrUG9OcT0PlhEmYVwrPEAE/edit?ts=5eb18ea3

      Scenarios

      1. A customer has a cluster event that causes loss of quorum.

            People

              wcabanba@redhat.com William Caban
              blomquisg Greg Blomquist
              Dean West
              ge liu
              Matthew Werner
              David Eads
              Eric Rich
              Votes: 12
              Watchers: 45