Type: Feature Request
Resolution: Done
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:

Blocked: False
Blocked Reason: None
Ready: False
Color Status: Not Selected

1. Proposed title of this feature request

Geo-Redundancy for SNO

2. What is the nature and description of the request?

Run SNO (Single Node OpenShift) clusters across multiple sites / availability zones to provide Geo-Redundancy. In case of a site failure, another site takes over.

The following requirements should be met:

Geo-Redundancy (GR) is needed to ensure there is service continuity (e.g. when a particular site goes down the partner site takes over)
The following GR models should be evaluated:
- Active-Active
- Active-Standby
RPO (Recovery Point Objective) must ensure there is no data loss.
RTO (Recovery Time Objective) must be 5 minutes or less.
The redundancy concept should be transparent to applications and should offload the pods.
Active Site and Partner Site will be commissioned individually. It is expected that all applications on the Active Site will be mirrored to the Partner Site, including any future updates of the applications.
At any given point in time, application data needs to be in sync between both Sites. Data synchronization should take place between both systems while both serve traffic.
Support for N:1 GR model, where N (max:2) is the number of Active Sites and 1 represents the Partner Site. At any given point in time, the Partner Site should be able to act as an Active Site:
- 1+1 - 2 Sites A/A or A/S
- 2+1 - 2 Sites A/A or A/S, plus 1 site available to replace either of the 2 sites that handle commercial traffic
Load Balancing is required for seamless access to both Sites.
GR solution to be supported on Bare Metal OCP clusters.
Switchover and Failover use cases should be supported. Switchover is when both Sites are running and the active role is moved from one to the other. Failover is when the Primary Site is fully down and irrecoverable and all traffic is directed to the Partner Site.
Auto-failover should be possible when the Active Site goes down.
When a system recovers from failure, it should be synchronized with the system that acts as the Active Site before it is allowed to serve traffic again.
The GR solution should still work as expected even if the two Sites are on different OCP versions (minor version difference only):
- Different OCP Z-stream releases, e.g. Site 1: v.4.9.11, Site 2: v.4.9.20
- Different OCP Y-stream releases, e.g. Site 1: v.4.9.11, Site 2: v.4.10.4
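The version-skew requirement above could be checked with something like the following. This is only a sketch: the function names are hypothetical, and it assumes that any Z-stream or Y-stream difference within the same X (major) version is acceptable, since the request does not state how many Y-stream releases apart the two Sites may be.

```python
import re
from typing import Tuple

# Accepts "4.9.11", "v4.9.11" and the "v.4.9.11" spelling used in this request.
_VERSION_RE = re.compile(r"^v?\.?(\d+)\.(\d+)\.(\d+)$")


def parse_ocp_version(text: str) -> Tuple[int, int, int]:
    """Split an OCP version string into (X, Y, Z) integers."""
    match = _VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    x, y, z = (int(group) for group in match.groups())
    return x, y, z


def skew_kind(site1: str, site2: str) -> str:
    """Classify the difference between two Sites' OCP versions."""
    a, b = parse_ocp_version(site1), parse_ocp_version(site2)
    if a == b:
        return "same"
    if a[0] != b[0]:
        return "x-stream"
    if a[1] != b[1]:
        return "y-stream"
    return "z-stream"


def gr_skew_allowed(site1: str, site2: str) -> bool:
    """Assumption: only an X-stream (major) difference rules out GR."""
    return skew_kind(site1, site2) != "x-stream"


if __name__ == "__main__":
    # The two examples from the request.
    print(skew_kind("v.4.9.11", "v.4.9.20"))
    print(skew_kind("v.4.9.11", "v.4.10.4"))
```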
Backup and Recovery for both Sites must be possible.
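To make the switchover, failover and resynchronization requirements above more concrete, the role transitions for a 1+1 Active-Standby pair might be modelled roughly as follows. This is an illustrative sketch, not a proposed implementation: the Role and Site types and all function names are hypothetical, and health detection, data replication and load balancing are left out entirely.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    ACTIVE = "active"
    STANDBY = "standby"
    DOWN = "down"
    RESYNCING = "resyncing"


@dataclass
class Site:
    name: str
    role: Role
    healthy: bool = True
    in_sync: bool = True

    def serves_traffic(self) -> bool:
        # In this A/S sketch only a healthy Active Site serves traffic.
        return self.healthy and self.role is Role.ACTIVE


def switchover(active: Site, partner: Site) -> None:
    """Planned move of the active role while both Sites are running."""
    if not (active.healthy and partner.healthy):
        raise RuntimeError("switchover requires both Sites to be running")
    if not partner.in_sync:
        raise RuntimeError("Partner Site is not in sync")
    active.role, partner.role = Role.STANDBY, Role.ACTIVE


def failover(active: Site, partner: Site) -> None:
    """Unplanned takeover: the Active Site is down, the Partner takes all traffic."""
    if active.healthy:
        raise RuntimeError("Active Site is still running; use switchover")
    if not partner.healthy:
        raise RuntimeError("no healthy Partner Site available")
    active.role, active.in_sync = Role.DOWN, False
    partner.role = Role.ACTIVE


def begin_recovery(site: Site) -> None:
    """A recovered Site comes back up but must not serve traffic yet."""
    site.healthy = True
    site.role = Role.RESYNCING


def complete_resync(site: Site) -> None:
    """Only after resynchronization may the Site rejoin as Standby."""
    if site.role is not Role.RESYNCING:
        raise RuntimeError("Site is not resyncing")
    site.in_sync = True
    site.role = Role.STANDBY
```

In this sketch a recovered Site passes through RESYNCING before it is eligible for any role that serves traffic, which mirrors the requirement that it be synchronized with the Active Site first. An Active-Active or 2+1 variant would need a different traffic rule.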

3. Why does the customer need this? (List the business requirements here)

Geo-Redundancy is required in cases where applications are running on SNO clusters, so that outages have less impact. The redundancy concept must be agnostic and transparent to the applications running on those nodes.

4. List any affected packages or components.

Red Hat OpenShift Container Platform

is related to: RFE-2803 Geo-Redundant deployment of multi-node OCP clusters (Rejected)

Assignee: Daniel Fröhlich

Reporter: Demetris Vassiliades

Votes: 0

Watchers: 6

Created: 2022/04/22 3:37 PM

Updated: 2023/10/19 12:19 PM

Resolved: 2023/10/19 12:17 PM
