-
Story
-
Resolution: Done
-
Undefined
-
None
-
None
-
None
-
BU Product Work
-
5
-
False
-
None
-
False
-
OCPSTRAT-403 - Automated backups of etcd (local destination)
-
ETCD Sprint 235, ETCD Sprint 236
To test the automated backups feature, we need an e2e test that validates the backups by ensuring the restore procedure works in a quorum-loss disaster recovery scenario.
See the following doc for more background:
https://docs.google.com/document/d/1NkdOwo53mkNBCktV5tkUnbM4vi7bG4fO5rwMR0wGSw8/edit?usp=sharing
This story targets milestones 2, 3, and 4 of the restore test, to ensure that the test can perform a backup and then restore from that backup in a disaster recovery scenario.
While the automated backups API is still in progress, the test will rely on the existing backup script to trigger a backup. Later, once a functional backup API is available behind a feature gate, the test can switch over to using that API to trigger backups.
In ETCD-417 we introduced a simple backup and restore process on a crashlooping member (1 of 3). This ticket requires 2 of 3 members to be non-functional, with the third member acting as a single-node recovery anchor. From there, the test should reconstruct the whole control plane into a working state.
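The difference between the two scenarios comes down to majority quorum: a three-member etcd cluster needs two healthy members to accept writes, so losing one member keeps quorum while losing two does not. The following is a minimal sketch of that arithmetic only. The function names `quorum_size` and `has_quorum` are hypothetical helpers for illustration and are not part of the e2e test or any etcd API.

```python
def quorum_size(member_count: int) -> int:
    """Return the smallest majority for a cluster with member_count voting members."""
    if member_count < 1:
        raise ValueError("member_count must be at least 1")
    return member_count // 2 + 1


def has_quorum(member_count: int, healthy_members: int) -> bool:
    """Return True if the healthy members still form a majority."""
    return healthy_members >= quorum_size(member_count)


# ETCD-417 scenario: one of three members is crashlooping, two remain healthy.
one_member_down = has_quorum(member_count=3, healthy_members=2)

# Scenario in this ticket: two of three members are non-functional, one remains.
two_members_down = has_quorum(member_count=3, healthy_members=1)
```

Here `one_member_down` evaluates to `True` and `two_members_down` to `False`, which is why the second case cannot heal on its own and needs a restore from backup on the surviving member.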