Type: Feature Request
Resolution: Won't Do
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:

Target Version:
None
Activity Type:
Product / Portfolio Work
Status Summary:
None
Blocked:
False
Blocked Reason:

None
Products:
None
Hierarchy Progress Bar:
None
Portfolio Solutions:

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
None
PX Impact Score:
None
PX Impact Range:
None
PX Priority Data:
None
PX Technical Impact:
None
PX Technical Impact Notes:
None
PX Scheduling Request:
None

1. Proposed title of this feature request

Geo-Redundant deployment of multi-node OCP clusters

2. What is the nature and description of the request?

Run OCP clusters across multiple sites / availability zones to provide Geo-Redundancy, so that if one site fails, another site takes over.

The following requirements should be met:

Geo-Redundancy (GR) is needed to ensure service continuity (e.g. when a particular site goes down, the partner site takes over).
The following GR models should be evaluated:
- Active-Active
- Active-Standby
- Stretched Cluster (cluster deployed across different DCs)
RPO (Recovery Point Objective) must ensure there is no data loss.
RTO (Recovery Time Objective) must be 5 minutes or less.
The redundancy concept should be transparent to applications and should offload redundancy handling from the pod.
Active Site and Partner Site will be commissioned individually. It is expected that all applications on the Active Site will be mirrored to the Partner Site, including any future updates of the applications.
At any given point in time, application data needs to be in sync between both Sites. Data synchronization should take place between both systems while both serve traffic.
Support for an N:1 GR model, where N (maximum 2) is the number of Active Sites and 1 represents the Partner Site. At any given point in time, the Partner Site should be able to act as an Active Site:
- 1+1: two Sites, Active-Active (A/A) or Active-Standby (A/S)
- 2+1: two Sites, A/A or A/S, plus one Site available to replace either of the two Sites that handle commercial traffic
Load Balancing is required for seamless access to both Sites.
GR solution to be supported on Bare Metal OCP clusters.
Switchover and Failover use cases should be supported. Switchover is when both Sites are running and the active role is moved from one to the other. Failover is when the Primary Site is fully down and irrecoverable and all traffic is directed to the Partner Site.
Auto-failover should be possible when the Active Site goes down.
When a system recovers from failure, it should be synchronized with the system that acts as the Active Site before it is allowed to serve traffic again.
The GR solution should still work as expected even if the two Sites are on different OCP versions (minor version difference only):
- Different OCP Z-stream releases, e.g. Site 1: v.4.9.11, Site 2: v.4.9.20
- Different OCP Y-stream releases, e.g. Site 1: v.4.9.11, Site 2: v.4.10.4
Backup and Recovery for both Sites must be possible.
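
The version-skew requirement above can be illustrated with a short sketch. It is not part of the request and is not an OCP API: the names `OcpVersion`, `parse_version`, `classify_skew` and `skew_in_scope` are hypothetical. It assumes "minor version difference only" means the two Sites share the same major version, and it does not bound how many Y-stream releases apart they may be, since the request does not say.

```python
from typing import NamedTuple


class OcpVersion(NamedTuple):
    major: int  # X-stream
    minor: int  # Y-stream
    patch: int  # Z-stream


def parse_version(text: str) -> OcpVersion:
    """Parse strings such as 'v.4.9.11', 'v4.9.11' or '4.9.11'."""
    cleaned = text.strip().lstrip("vV").lstrip(".")
    parts = cleaned.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected X.Y.Z, got {text!r}")
    major, minor, patch = (int(p) for p in parts)
    return OcpVersion(major, minor, patch)


def classify_skew(site_a: str, site_b: str) -> str:
    """Return 'identical', 'z-stream', 'y-stream' or 'major'."""
    a, b = parse_version(site_a), parse_version(site_b)
    if a.major != b.major:
        return "major"
    if a.minor != b.minor:
        return "y-stream"
    if a.patch != b.patch:
        return "z-stream"
    return "identical"


def skew_in_scope(site_a: str, site_b: str) -> bool:
    """Assumption: any skew within the same major version is in scope."""
    return classify_skew(site_a, site_b) != "major"


if __name__ == "__main__":
    # The two examples given in the request.
    print(classify_skew("v.4.9.11", "v.4.9.20"))
    print(classify_skew("v.4.9.11", "v.4.10.4"))
```

A GR controller could run a check of this kind before allowing a recovered Site to resynchronize with the Active Site, but where such a check would live is left open by the request.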

3. Why does the customer need this? (List the business requirements here)

Geo-Redundancy is required in cases where applications are running on multi-node OCP clusters, so that outages have less impact. The redundancy concept must be agnostic and transparent to the applications running on those nodes.

4. List any affected packages or components.

Red Hat OpenShift Container Platform

relates to

RFE-2800 Geo-Redundancy for SNO

Closed

Assignee: Tushar Katarki

Reporter: Demetris Vassiliades (Inactive)

Need Info From: None

Votes: 0

Watchers: 3

Created: 2022/04/26 8:29 AM

Updated: 2025/07/07 1:16 PM

Resolved: 2022/07/11 12:05 PM

Target start: None

Target end: None
