OpenShift Container Platform (OCP) Strategy / OCPSTRAT-1551

Two Node OpenShift with Fencing (TNF) - GA


    • Product / Portfolio Work
    • Parent: OCPSTRAT-1542 Two Node OpenShift topologies for edge customers

      Feature Overview (aka. Goal Summary)  

      Customers with large numbers of geographically dispersed locations want a container management solution with a two-node footprint. They require high availability, but even "cheap" third nodes represent a significant cost at this scale.

      Goals (aka. expected user outcomes)

      Two-node clustering is a solved problem in the traditional HA space. The goal of this feature is to introduce existing RHEL HA technologies into OpenShift to support a true two-node topology. This requires fencing to ensure that a failed node can be recovered safely. Hence the name: Two Node OpenShift with Fencing (TNF).

      Requirements (aka. Acceptance Criteria):

      1. Provide a true two-node OCP deployment.
      2. Support workloads in active/passive mode, i.e. a single-instance pod where the pods from the failed node are restarted on the second node in a timely manner, or a second pod is already running but passive, ready to take over if the first pod fails (e.g. a PostgreSQL database in an active/passive setup). In this mode CPU utilisation is at most ~50%.
      3. Support workloads in active/active mode. Both nodes share the load and are sized by design to run at roughly 60-75% of capacity; during a failure some service degradation is expected, but not a complete outage. If one node fails, the other node operates at close to 100%.
      4. Both nodes have a fencing device. At GA, only BMC via Redfish/IPMI is supported. Other fencing devices, e.g. power switches controllable via serial port, might be added later or on demand. See also requirement #11 and the fencing sketch after this list.
      5. Failover time <60s: if the leading node goes down, the remaining node takes over and reaches an operational (writable) state in less than 60 seconds. The exact parameters (heartbeat interval, number of missed heartbeats, etc.) need to be configurable by users, e.g. to operate on a less aggressive timeline if required and avoid unnecessary failovers due to transient blips (see the timing sketch after this list).
      6. No shared storage between the nodes is required.
      7. Be able to scale out to a true three-node compact cluster as a day-2 operation (stretch goal, not required for MVP, but a constraint to keep in mind during design and implementation). The resulting cluster should have a three-node etcd quorum and the same architecture/support statement as a freshly installed three-node compact cluster.
      8. Be able to add worker nodes to a two-node cluster with fencing as a day-2 operation, as is supported with SNO + worker nodes (stretch goal, not required for MVP).
      9. The solution fulfills the k8s-etcd contract (https://docs.google.com/document/d/1NUZDiJeiIH5vo_FMaTWf0JtrQKCx0kpEaIIuPoj9P6A/edit#heading=h.tlkin1a8b8bl), so that layered mechanisms like Leases work correctly.
      10. Support full recovery of the workload when the node comes back online after restoration; total time <15 minutes.
      11. Not all edge devices can have a fencing device. For those situations, a configuration with a dedicated direct crossover cable between the nodes can be set up; this drastically reduces the risk of split-brain situations. In this configuration, if a node does not respond to a ping via the direct crossover cable, it is considered to be fenced successfully. --> Removed from scope, see comments below for reasons.
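
      To make requirement #4 more concrete, below is a minimal sketch of what a Redfish-based fencing action looks like at the API level. It is illustrative only: in TNF the fencing is expected to be driven by the RHEL HA fence agents rather than hand-written code, and the BMC address, system ID, and credentials used here are hypothetical.

          import requests

          BMC_URL = "https://bmc-node2.example.com"       # hypothetical BMC address of the peer node
          SYSTEM_URL = BMC_URL + "/redfish/v1/Systems/1"   # standard Redfish ComputerSystem resource
          AUTH = ("admin", "secret")                       # hypothetical BMC credentials

          def fence_peer():
              """Power the peer off via the standard Redfish ComputerSystem.Reset action."""
              resp = requests.post(
                  SYSTEM_URL + "/Actions/ComputerSystem.Reset",
                  json={"ResetType": "ForceOff"},   # immediate power-off, i.e. the fence
                  auth=AUTH,
                  verify=False,                     # many BMCs ship with self-signed certificates
              )
              resp.raise_for_status()

          def peer_is_off():
              """Confirm the fence succeeded before the survivor recovers the workload."""
              resp = requests.get(SYSTEM_URL, auth=AUTH, verify=False)
              resp.raise_for_status()
              return resp.json().get("PowerState") == "Off"

      Requirement #5 is effectively a timing budget. The sketch below only illustrates the arithmetic with hypothetical values; the real detection and recovery parameters (heartbeat interval, missed-heartbeat count, fencing and etcd recovery times) are implementation details that must remain user-configurable.

          # Hypothetical numbers only; the real values must stay user-configurable (requirement #5).
          heartbeat_interval_s = 1.0
          missed_heartbeats = 10           # failure declared after 10 missed heartbeats
          fencing_time_s = 20.0            # assumed worst-case BMC power-off round trip
          recovery_time_s = 15.0           # assumed time for the survivor to become writable

          detection_s = heartbeat_interval_s * missed_heartbeats
          total_failover_s = detection_s + fencing_time_s + recovery_time_s

          # 10 + 20 + 15 = 45 s, leaving headroom under the 60 s requirement; less aggressive
          # settings trade a longer failover for fewer false positives on network blips.
          assert total_failover_s < 60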

       

       

      Deployment considerations (list applicable specific needs; N/A = not applicable)
      Self-managed, managed, or both: self-managed
      Classic (standalone cluster): yes
      Hosted control planes: n/a
      Multi node, Compact (three node), Single node (SNO), or all: NEW: Two Node with Fencing
      Connected / Restricted Network: both
      Architectures, e.g. x86_64, ARM (aarch64), IBM Power (ppc64le), IBM Z (s390x): x86_64, ARM
      Operator compatibility: full
      Backport needed (list applicable versions): no
      UI need (e.g. OpenShift Console, dynamic plugin, OCM): none
      Other (please specify):

       

      Questions to Answer (Optional):

      1. ...

       

      Out of Scope

      1. Storage driver providing RWX shared storage
      2. ...

       

      Background

      • Two-node support is in high demand by telco, industrial, and retail customers.
      • StarlingX supports a true two-node topology (docs).

       

      Customer Considerations

      Telco Customer requirements:

      2-node HA control-plane requirements for Telco

       

      Documentation Considerations

      The topology needs to be documented, especially the requirements of the arbiter node.

       

      Interoperability Considerations

      1. OpenShift Virtualization (OCP Virt) needs to be explicitly tested in this scenario to support VM HA (live migration, restart on the other node).

       
