Feature
Resolution: Unresolved
Major
BU Product Work
100% To Do, 0% In Progress, 0% Done
Program Call
Feature Overview
Create a main documentation section for the control plane that consolidates the information that’s now spread across multiple sections and articles, so that users can find all the required information on one landing page, similar to the Nodes section.
Problem to solve
The Control Plane team regularly gets the same queries about etcd latency, stretched-cluster recommendations, and performance, because the answers aren't clear in the existing documentation.
With the increased popularity of and demand for multi-site topologies to support OpenShift Virtualization, and with 4/5-node control plane support, more questions related to this are arising from the field, and we must provide clear answers to support the field in architectural and specification decisions.
Existing documentation
- 4/5-node control plane (current section: Scalability and Performance)
- Recommended etcd practices (current section: Scalability and Performance)
- Optimizing storage (current section: Scalability and Performance)
- etcd tasks (current section: Postinstallation Configuration)
- Backing up etcd (current section: Backup and Restore)
Articles
These documents contain some of the most common questions:
- What's the latency tolerated by etcd nodes?
- Can I use stretched clusters?
- How do I use multiple sites?
- What's the impact on the API server?
We must include this information clearly in the downstream documentation:
- Understanding etcd and the tunables/conditions affecting performance
- Does OpenShift 4.x have the same stretch cluster latency requirement as OpenShift 3.11?
These articles cover crucial information, like the following, that's not available in the downstream documentation:
The combined disk and network latency and jitter must maintain an etcd peer round trip time of less than 100ms. This is NOT the same as the network round trip time. See the ETCD timers in OpenShift section below.
Layered products (e.g., storage providers) may have lower latency requirements. In those cases, the latency limits are dictated by the requirements of the architecture supported by the layered product. For example, OpenShift cluster deployments that ‘span’ multiple data centers with Red Hat OpenShift Data Foundation must have a latency requirement of less than 10ms RTT. For those cases, follow the specific product guidance.
A low latency network (less than 2 ms of latency with v3, and less than 10 ms of RTT latency with v4) between instances (systems) across all sites. This requirement is driven by etcd and is needed to ensure stability and quorum (no loss of leaders).
A high-bandwidth network (with at least 5-10 Gbps capabilities) is also needed.
The value of the heartbeat interval should be around the maximum of the average round-trip time (RTT) between members, normally around 1.5x the RTT. With the OpenShift Container Platform default heartbeat interval of 100ms, the recommended RTT between control plane nodes is less than ~33ms, with a maximum of less than 66ms (66ms x 1.5 = 99ms).
For example, a network with a maximum latency of 80ms and jitter of 30ms will experience latencies of 110ms, which means etcd misses heartbeats, causing request timeouts and temporary leader loss.
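The heartbeat arithmetic above can be sketched as a quick check. This is a minimal illustration, assuming the OpenShift Container Platform default heartbeat interval of 100 ms; the function names and structure are illustrative, not part of any product API:

```python
# Sketch: check whether a network's worst-case latency fits within the
# etcd heartbeat interval, per the ~1.5x rule of thumb described above.

DEFAULT_HEARTBEAT_MS = 100  # OpenShift Container Platform default

def worst_case_rtt(max_latency_ms: float, jitter_ms: float) -> float:
    """Worst-case round-trip time: base latency plus jitter."""
    return max_latency_ms + jitter_ms

def fits_heartbeat(max_latency_ms: float, jitter_ms: float,
                   heartbeat_ms: float = DEFAULT_HEARTBEAT_MS) -> bool:
    """etcd starts missing heartbeats when the worst-case RTT meets or
    exceeds the heartbeat interval."""
    return worst_case_rtt(max_latency_ms, jitter_ms) < heartbeat_ms

# Since the heartbeat should be ~1.5x the RTT, the RTT should stay under
# heartbeat / 1.5 (about 66 ms for the 100 ms default).
max_recommended_rtt = DEFAULT_HEARTBEAT_MS / 1.5

# The example from the text: 80 ms latency + 30 ms jitter = 110 ms,
# which exceeds the 100 ms heartbeat interval.
print(worst_case_rtt(80, 30))   # 110
print(fits_heartbeat(80, 30))   # False
```

A downstream section could pair this kind of worked example with the heartbeat-interval prose so readers can plug in their own site-to-site measurements.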
If the 99th percentile of fsync duration is greater than the recommended value of 20 ms, faster disks are recommended to host etcd for better performance.
Use Prometheus to track the metric histogram_quantile(0.99, rate(etcd_network_peer_round_trip_time_seconds_bucket[2m])), which reports the round trip time for etcd to finish replicating client requests between members; it should be less than 50 ms.
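As a sketch of how a check like this could be automated, the query above can be evaluated via the standard Prometheus HTTP API instant-query endpoint (/api/v1/query). The server URL and helper functions here are illustrative assumptions, not part of any documented tooling:

```python
import urllib.parse

# The peer round-trip-time query from the text; the result should stay
# under the recommended 50 ms.
PEER_RTT_QUERY = (
    "histogram_quantile(0.99, "
    "rate(etcd_network_peer_round_trip_time_seconds_bucket[2m]))"
)
PEER_RTT_LIMIT_SECONDS = 0.050

def prometheus_query_url(base_url: str, promql: str) -> str:
    """Build a Prometheus HTTP API instant-query URL.
    /api/v1/query is the standard Prometheus endpoint; base_url is a
    placeholder for your monitoring stack's address."""
    query_string = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string

def peer_rtt_ok(observed_seconds: float) -> bool:
    """True when the observed p99 peer RTT is within the recommendation."""
    return observed_seconds < PEER_RTT_LIMIT_SECONDS

# Example: a 30 ms observed p99 RTT is within the 50 ms recommendation.
print(peer_rtt_ok(0.030))  # True
print(prometheus_query_url("http://prometheus.example:9090", PEER_RTT_QUERY))
```

The same pattern applies to the fsync recommendation above, with the threshold swapped for 20 ms and the appropriate disk-duration histogram.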
These data points need to be clear and concise in the downstream documentation, and easily found in a section dedicated to the control plane, as they are critical for running a healthy and stable cluster.
The following blog post describes selectable etcd profiles, which tolerate even higher latency:
https://www.redhat.com/en/blog/introducing-selectable-profiles-for-etcd
We need to reconcile and consolidate all of this information, because the field keeps catching up with it piecemeal, which can be confusing.