Epic
Resolution: Won't Do
Facilitate downstream disaster recovery
To Do
0% To Do, 0% In Progress, 100% Done
Epic Goal
Downstream consumers of hive, such as ACM, are able to implement disaster recovery (backup/restore) of the hub cluster (where hive-operator resides) such that the hive state on the restored cluster is/becomes identical to its state on the source.
It is acceptable for restored components to be initially absent or different vs. their original state as long as hive (or other OpenShift components we can reasonably expect to exist) is able to reconcile them to their pre-backup state.
Acceptance Criteria
- CI - MUST be running successfully with tests automated. (efried: Really? Are we going to do a DR in CI?)
- Release Technical Enablement - Provide necessary release enablement details and documents. (efried: What does this mean? I think it might be n/a since hive is not part of OCP releases. "Enabling" ACM's release may be a thing, but I don't see us needing to do anything we don't already do for them.)
- Documentation, like a "disaster-recovery.md" in the hive repo.
Dependencies (internal and external)
- Depending on the answer to the CI question above, we may need to rely on ACM for full integration testing.
- We will assume the nature of the DR itself is along the lines of what velero does, i.e. effectively oc get > manifests.yaml to back up, oc create -f manifests.yaml to restore. In particular, a quirk associated with this process is that status sub-objects are not restored. This is an example of something the controllers would have to repopulate.
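The status quirk described above can be illustrated with a small simulation. This is a minimal sketch, not how velero or hive actually work internally: `simulate_restore` and `lost_on_restore` are hypothetical helpers, the ClusterDeployment manifest is made-up example data, and the assumption is that a restore behaves like creating the object fresh from its saved manifest, so `status` and server-assigned metadata do not carry over.

```python
import copy

# Metadata fields the API server assigns itself. Assumption: a restored
# object receives fresh values for these rather than the backed-up ones.
SERVER_ASSIGNED_METADATA = (
    "uid",
    "resourceVersion",
    "creationTimestamp",
    "generation",
    "managedFields",
)


def simulate_restore(backed_up: dict) -> dict:
    """Hypothetical helper: return what survives a create-from-manifest restore."""
    restored = copy.deepcopy(backed_up)  # leave the backup itself untouched
    restored.pop("status", None)  # status is not restored
    metadata = restored.get("metadata", {})
    for field in SERVER_ASSIGNED_METADATA:
        metadata.pop(field, None)
    return restored


def lost_on_restore(backed_up: dict, restored: dict) -> list:
    """Hypothetical helper: top-level sections the controllers must repopulate."""
    return sorted(key for key in backed_up if key not in restored)


# Made-up example manifest, for illustration only.
backup = {
    "apiVersion": "hive.openshift.io/v1",
    "kind": "ClusterDeployment",
    "metadata": {
        "name": "example",
        "namespace": "example-ns",
        "uid": "00000000-0000-0000-0000-000000000000",
        "resourceVersion": "12345",
    },
    "spec": {"clusterName": "example", "installed": True},
    "status": {"conditions": [{"type": "ExampleCondition", "status": "True"}]},
}

restored = simulate_restore(backup)
print(lost_on_restore(backup, restored))
```

In this sketch the `spec` comes back unchanged while `status` is absent, which is the gap the hive controllers would need to close by reconciling the restored object.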
Previous Work (Optional):
- Hive used to support native backup via velero. We don't anymore. I can't remember why.
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Downstream documentation merged: <link to meaningful PR>