Feature
Resolution: Done
Critical
None
BU Product Work
False
False
0% To Do, 0% In Progress, 100% Done
M
0
Program Call
This is a big change from the long-standing support policy/stance for clusters; a TE should be provided for awareness.
Feature Overview (aka. Goal Summary)
Customers with hard requirements for active-active deployments across two locations, who need to support stateful traditional applications (e.g., OCP Virt VMs that can only run a single instance), depend on the underlying infrastructure to provide that availability. These use cases are common when deploying the VMs on traditional virtualization stacks. To support those scenarios, an OpenShift cluster is deployed as a stretched or spanned cluster with a control-plane distribution of 2+1 or 1+1+1 (when using an arbiter site).
During failure scenarios in the data center hosting the majority of the control plane nodes, the surviving control plane node becomes the only node with the latest configuration and state of all the objects/resources in the cluster. The recovery procedure for this configuration in a disaster scenario requires the single surviving node to become read-write while holding the only copy of etcd. Should that node also fail, the failure is catastrophic. This is even more critical when OCP Virt is also hosting the stateful VMs.
To increase resiliency and reduce risk during this type of failure, we need to extend the number of control plane nodes to support 2+2 and 3+2 deployments. In these layouts, a failure of the site hosting the majority of the nodes still leaves two read-only copies of etcd in the surviving location, providing higher assurance for the recoverability of the cluster.
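To make the quorum arithmetic behind these layouts explicit, the minimal Python sketch below (the layout names and site splits are illustrative only) computes the etcd quorum size, which is a simple majority of members, and what survives when the larger site is lost:

# Illustrative quorum arithmetic for stretched control-plane layouts.
# etcd stays writable only while a majority (floor(n/2) + 1) of members is up.
def quorum(members: int) -> int:
    return members // 2 + 1

# (total members, members in the larger site) for the layouts discussed above
layouts = {"2+1": (3, 2), "1+1+1": (3, 1), "2+2": (4, 2), "3+2": (5, 3)}

for name, (total, larger_site) in layouts.items():
    surviving = total - larger_site
    writable = surviving >= quorum(total)
    print(f"{name}: {surviving}/{total} members survive a larger-site failure, "
          f"quorum is {quorum(total)}, cluster stays writable: {writable}")

In the 2+2 and 3+2 cases the surviving site cannot form quorum, so etcd becomes read-only, but two copies of the data remain, which is the resiliency gain described above.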
Today, the cluster-etcd-operator can handle up to 5 etcd instances when it detects up to 5 control plane nodes. This capability is used as part of the automation for vertical scaling of the control plane in environments with a control-plane MachineSet. For deployments where MachineSets are not available (e.g., bare metal, agent-based installer), the cluster-etcd-operator is not automatically triggered to vertically scale the control plane, but it will scale the etcd peers if control-plane nodes are added to the environment manually. This is the procedure we want to validate and support for bare-metal clusters with stretched control planes.
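As a rough illustration of how that manual scale-out could be verified, the following sketch (assuming the oc CLI is installed and logged in; the openshift-etcd namespace, the app=etcd pod label, and the node-role.kubernetes.io/master node label are the commonly used selectors and may vary by version) compares the number of control-plane nodes to the etcd membership reported by one of the etcd pods:

# Minimal sketch: after manually adding control-plane nodes, check that the
# etcd membership managed by cluster-etcd-operator has scaled to match.
# Assumes the `oc` CLI is on PATH and authenticated against the cluster.
import subprocess

def run(cmd: list[str]) -> str:
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout

# Control-plane nodes currently registered with the cluster
nodes = run(["oc", "get", "nodes", "-l", "node-role.kubernetes.io/master",
             "-o", "name"]).split()
print(f"control-plane nodes: {len(nodes)}")

# etcd pods managed by cluster-etcd-operator
pods = run(["oc", "-n", "openshift-etcd", "get", "pods", "-l", "app=etcd",
            "-o", "name"]).split()
print(f"etcd pods: {len(pods)}")

# Ask one etcd member for the full member list
if pods:
    pod = pods[0].removeprefix("pod/")
    print(run(["oc", "-n", "openshift-etcd", "rsh", pod,
               "etcdctl", "member", "list", "-w", "table"]))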
This feature is only for bare metal and mainly for OCP Virtualization use cases.
Goals (aka. expected user outcomes)
- Validate and support the use of 4-node and 5-node control-plane architectures for bare-metal clusters in stretched control-plane configurations, with the following restrictions:
- bare-metal control plane
- bare-metal deployment using assisted-service or agent-based installer
- Same Layer 3 network across locations
- Max latency across nodes < 10 ms (a rough spot-check sketch follows this list)
- Min bandwidth 10 Gbps across nodes
- etcd must be on an SSD or NVMe disk
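The latency restriction can be spot-checked before install; the minimal Python sketch below (the peer address is a placeholder for a control-plane host at the other site, and port 6443 is used only because the API server is expected to listen there) times TCP connections to a peer as a rough proxy for round-trip latency. Disk suitability for etcd is usually validated separately with a synthetic fsync benchmark.

# Minimal sketch: rough spot-check of the <10 ms inter-node latency requirement
# by timing TCP connection setup from one control-plane host to a peer.
import socket
import statistics
import time

PEER = "192.0.2.10"   # placeholder: a control-plane node at the other site
PORT = 6443           # API server port, assumed reachable across sites
SAMPLES = 20

rtts = []
for _ in range(SAMPLES):
    start = time.perf_counter()
    with socket.create_connection((PEER, PORT), timeout=2):
        pass
    rtts.append((time.perf_counter() - start) * 1000)  # milliseconds
    time.sleep(0.2)

print(f"median connect time: {statistics.median(rtts):.2f} ms "
      f"(max {max(rtts):.2f} ms over {SAMPLES} samples)")

A TCP connect includes handshake overhead, so the values slightly overestimate the raw network round trip; treat the result as an upper bound rather than a precise measurement.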
Requirements (aka. Acceptance Criteria):
- Performance and scaling should show minimal (<10%) degradation compared to performance tests on existing HA clusters
- Validate and update the documentation on manual control-plane recovery procedures in case of quorum loss
Anyone reviewing this Feature needs to know which deployment configurations the Feature will apply to (or not) once it's been completed. Describe specific needs (or indicate N/A) for each of the following deployment scenarios. For specific configurations that are out of scope for a given release, ensure you provide the OCPSTRAT (for the configuration to be supported in the future) as well.
Deployment considerations | List applicable specific needs (N/A = not applicable) |
Self-managed, managed, or both | self-managed |
Classic (standalone cluster) | Classic |
Hosted control planes | N/A |
Multi node, Compact (three node), or Single node (SNO), or all | multi-node |
Connected / Restricted Network | N/A |
Architectures, e.g. x86_64, ARM (aarch64), IBM Power (ppc64le), and IBM Z (s390x) | x86_64 |
Operator compatibility | N/A |
Backport needed (list applicable versions) | no |
UI need (e.g. OpenShift Console, dynamic plugin, OCM) | unknown |
Other (please specify) | Observability (Update to Prometheus rules for control-plane) |
Use Cases
The only use case under consideration is a standard multi-node bare-metal deployment with a stretched control plane, installed with the assisted-service or agent-based installer.
Out of Scope
Any other use case or installation mode.
Documentation Considerations
The documentation must include a clear step-by-step validated recovery procedure for quorum loss.
- relates to
  - RFE-5311 Allow configuration of Hosted Control Plane component replicas (Backlog)
  - RFE-5310 Allow providing Agent labels from BareMetalHost object (Accepted)
  - OCPSTRAT-539 Enhance recovery procedure for full control plane failure (In Progress)
  - OCPSTRAT-1219 Allow 5-node control planes in day 1 with Agent-Based Installer (In Progress)
  - OCPSTRAT-1395 Automated control-plane recovery from expired certificates (hibernation) (In Progress)