Epic
Resolution: Unresolved
Blocker
MTX GA: Architectural Design
To Do
29% To Do, 14% In Progress, 57% Done
The goal of this Epic is to answer the open questions around our MTC2 architecture and to ensure that the approach we choose to implement is capable of servicing our needs, specifically with an eye towards:
- Feature parity with MTC 1
- Can we demonstrate the current lab scenarios using Crane and the pipeline approach?
- Is an abstraction needed between the front end and the pipeline layer that models a "Migration"?
- Do we need a CRD + controller? Can we simply ship statically defined pipelines, installed via an operator, for which the UI creates PipelineRuns from the user inputs?
- How do we minimize downtime in the same manner that MTC does today, using the repeatable stage + incremental transfer / final cutover?
- Is a simple stateless "discovery service" acceptable for supporting the UI's enhanced experience? Examples of needs:
  - Listing the namespaces on the source cluster that a user has access to and that can be chosen for migration. Assume we are given the cluster coordinates and credentials as part of the wizard.
  - MigAnalytic-like reporting that informs the user what is actually in their source namespace that will be migrated. If a customer has 90k RoleBindings and they aren't aware of this, we need to be able to tell them.
  - Providing actual PV usage statistics, as customers have frequently needed to resize volumes (up OR down) as part of their migration.
  - Enhanced validation of options: a place where we can build up knowledge of potential problems and proactively inform the user of them.
- How can we enable mass-migration in the future? Can what we are building support this?
- How can we support 'state only' migration in the future? Is our approach sufficient?
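To make the "no CRD, just PipelineRuns" option above concrete, here is a minimal sketch of what the UI (or a thin backend) might construct from the wizard inputs. It only builds the manifest as a plain dict and does not talk to a cluster. The pipeline name, namespace, and parameter names are hypothetical placeholders, not names from any shipped pipeline; the manifest shape assumes the Tekton `tekton.dev/v1beta1` PipelineRun schema.

```python
import json


def build_pipeline_run(pipeline_name, namespace, params):
    """Build a Tekton PipelineRun manifest (as a dict) for a statically
    installed pipeline, filling in params from user-supplied wizard inputs.

    Assumption: the pipeline named `pipeline_name` was already installed
    by the operator; this function only references it.
    """
    return {
        "apiVersion": "tekton.dev/v1beta1",
        "kind": "PipelineRun",
        "metadata": {
            # generateName lets the API server pick a unique suffix per run
            "generateName": f"{pipeline_name}-",
            "namespace": namespace,
        },
        "spec": {
            "pipelineRef": {"name": pipeline_name},
            # Sorted for deterministic output; order is not significant
            "params": [
                {"name": name, "value": value}
                for name, value in sorted(params.items())
            ],
        },
    }


if __name__ == "__main__":
    # All names below are hypothetical examples of wizard inputs.
    wizard_inputs = {
        "source-context": "src-cluster",
        "destination-context": "dest-cluster",
        "source-namespace": "guestbook",
    }
    run = build_pipeline_run("stateless-migration", "mtc-pipelines", wizard_inputs)
    print(json.dumps(run, indent=2))
```

If this is all the UI needs to do, that is an argument for skipping the "Migration" abstraction; anything this sketch cannot express (for example, repeatable stage runs tied to one logical migration) is a candidate reason to keep it.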
Additionally, there are probably gaps or issues that we aren't aware of. To try to surface these, let's pursue a set of non-trivial scenarios that we know we're going to have to be able to handle:
- What is it like to migrate operated workloads from source to target?
- Can we handle complicated stateful workloads that have specific sequencing requirements, such as the migration of a Galera (high-availability MySQL) cluster?
We want these questions captured, with answers for how we plan to approach them, so that we can decide how to proceed with implementing everything on top of Crane that provides the UX being discussed with UXD.
Also of note:
- We are interested in kustomize + argo. How do they fit into the flow?
- Kustomize is expected to be the tool that can layer in cluster specifics in the form of overlays, such as destination namespaces
- Let’s become familiar with kam and the kind of GitOps workflow it lays down
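As a rough illustration of the overlay idea above, the sketch below generates a kustomization that points at a base of exported manifests and overrides the destination namespace. The base path and namespace are hypothetical; `resources` and `namespace` are standard kustomization fields. The output is JSON, which should also be acceptable as the content of a `kustomization.yaml`, since YAML parsers generally accept JSON.

```python
import json


def build_namespace_overlay(base_path, destination_namespace):
    """Return a kustomization (as a dict) that layers a destination
    namespace on top of a base directory of manifests.

    Assumption: `base_path` is a directory containing its own
    kustomization, e.g. the manifests exported from the source cluster.
    """
    return {
        "apiVersion": "kustomize.config.k8s.io/v1beta1",
        "kind": "Kustomization",
        "resources": [base_path],
        # Applied to every namespaced resource pulled in from the base
        "namespace": destination_namespace,
    }


if __name__ == "__main__":
    # Hypothetical layout: overlays/<cluster>/ next to a shared base/
    overlay = build_namespace_overlay("../../base", "guestbook-prod")
    print(json.dumps(overlay, indent=2))
```

One overlay per destination cluster would keep the exported base untouched, which is the property we would want if argo ends up syncing these directories.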