OpenShift Etcd / ETCD-423

Restore Test - Lost Quorum


    • Type: Story
    • Resolution: Done
    • ETCD Sprint 235, ETCD Sprint 236

      To test the automated backups feature, we need an e2e test that validates the backups by ensuring the restore procedure works in a quorum-loss disaster recovery scenario.

      See the following doc for more background:
      https://docs.google.com/document/d/1NkdOwo53mkNBCktV5tkUnbM4vi7bG4fO5rwMR0wGSw8/edit?usp=sharing

      This story targets milestones 2, 3, and 4 of the restore test, ensuring that the test can take a backup and then restore from that backup in a disaster recovery scenario.

      While the automated backups API is still in progress, the test will rely on the existing backup script to trigger a backup. Later, once a functional backup API is available behind a feature gate, the test can switch over to using that API to trigger backups.
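
      One way to keep that later switch cheap is to hide the backup mechanism behind a small interface. The following is only a sketch of the idea, not the test's actual design: `BackupTrigger`, `ScriptBackupTrigger`, `ApiBackupTrigger`, `take_backup`, and the `Runner` callable are hypothetical names, and the default script path is an assumption, since this ticket does not name the script.

```python
from dataclasses import dataclass
from typing import Callable, Protocol, Sequence

# Hypothetical runner: executes a command on a control-plane node
# and returns its exit code. How it reaches the node is out of scope here.
Runner = Callable[[Sequence[str]], int]


class BackupTrigger(Protocol):
    """Anything that can start a backup into the given directory."""

    def trigger(self, backup_dir: str) -> None: ...


@dataclass
class ScriptBackupTrigger:
    """Triggers a backup by invoking the existing backup script."""

    run: Runner
    # Assumed location of the backup script; not stated in the ticket.
    script_path: str = "/usr/local/bin/cluster-backup.sh"

    def trigger(self, backup_dir: str) -> None:
        exit_code = self.run([self.script_path, backup_dir])
        if exit_code != 0:
            raise RuntimeError(f"backup script failed with exit code {exit_code}")


@dataclass
class ApiBackupTrigger:
    """Placeholder for a future API-based trigger behind a feature gate."""

    # Hypothetical callable that would create the backup request via the API.
    create_backup_request: Callable[[str], None]

    def trigger(self, backup_dir: str) -> None:
        self.create_backup_request(backup_dir)


def take_backup(trigger: BackupTrigger, backup_dir: str) -> str:
    """Runs whichever trigger the test was configured with."""
    trigger.trigger(backup_dir)
    return backup_dir
```

      With this shape, the restore test would only call `take_backup`, and moving from the script to the API would be a change in how the trigger is constructed.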

      In ETCD-417 we introduced a simple backup and restore process on a single crashlooping member (1/3). This ticket requires 2/3 members to be non-functional, with the third member acting as a single-node recovery anchor. From there, the test should reconstruct the whole control plane into a working state.
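
      The difference between the two scenarios comes down to majority arithmetic: etcd needs a majority of members to be healthy, so a three-member cluster tolerates one failure but not two. A minimal sketch of that calculation follows; the function names are illustrative and not taken from the test code.

```python
def quorum_size(members: int) -> int:
    """Smallest number of members that forms a majority."""
    return members // 2 + 1


def has_quorum(members: int, healthy: int) -> bool:
    """True if the healthy members still form a majority."""
    return healthy >= quorum_size(members)


# ETCD-417 scenario: one of three members is crashlooping.
print(has_quorum(members=3, healthy=2))  # True, the cluster keeps serving

# This ticket: two of three members are non-functional.
print(has_quorum(members=3, healthy=1))  # False, quorum is lost

# A cluster consisting of only the recovery anchor has quorum by itself.
print(has_quorum(members=1, healthy=1))  # True
```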

            tjungblu@redhat.com Thomas Jungblut
            rhn-coreos-htariq Haseeb Tariq