Feature Request
Resolution: Done
1. Proposed title of this feature request
2. What is the nature and description of the request?
OCP4 does not have a way to tune etcd parameters such as timeouts and heartbeat intervals. Adjusting these parameters indiscriminately may compromise the stability of the control plane. However, in scenarios where disk IOPS are not ideal (e.g. disk degradation, storage providers in cloud environments), these parameters could be adjusted to improve the stability of the control plane while raising the corresponding warning notifications.
There have been past one-off workarounds for cloud providers (https://github.com/openshift/machine-config-operator/pull/1507) (https://github.com/openshift/cluster-etcd-operator/pull/218) to tune these parameters. There have also been requests from the community for tuning them:
(https://github.com/openshift/cluster-etcd-operator/pull/515) (https://github.com/openshift/cluster-etcd-operator/issues/499)
The current default values on a 4.10 deployment:
```
- name: ETCD_ELECTION_TIMEOUT
  value: "1000"
- name: ETCD_ENABLE_PPROF
  value: "true"
- name: ETCD_EXPERIMENTAL_MAX_LEARNERS
  value: "3"
- name: ETCD_EXPERIMENTAL_WARNING_APPLY_DURATION
  value: 200ms
- name: ETCD_EXPERIMENTAL_WATCH_PROGRESS_NOTIFY_INTERVAL
  value: 5s
- name: ETCD_HEARTBEAT_INTERVAL
  value: "100"
```
These defaults, and the guidance for latency among control plane nodes (https://access.redhat.com/articles/3220991), do not translate well to live on-premise scenarios.
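As a rough illustration of why the fixed defaults can be tight, the sketch below derives the latency budget implied by the two timer values listed above. It assumes the upstream etcd rule of thumb that the election timeout should be at least 10 times the peer round-trip time; the function name `timer_budget` is hypothetical and not part of any OpenShift or etcd API.

```python
# Timer values copied from the 4.10 defaults listed above (milliseconds).
DEFAULTS_MS = {
    "ETCD_HEARTBEAT_INTERVAL": 100,
    "ETCD_ELECTION_TIMEOUT": 1000,
}


def timer_budget(heartbeat_ms: int, election_ms: int) -> dict:
    """Summarize what a heartbeat/election pair tolerates.

    Assumption: election timeout >= 10 x peer RTT (upstream etcd tuning
    rule of thumb), so the largest tolerable RTT is election_ms / 10.
    """
    return {
        # How many heartbeat intervals fit into one election timeout.
        "heartbeat_intervals_per_election_timeout": election_ms // heartbeat_ms,
        # Largest peer RTT that still satisfies the 10x rule.
        "max_rtt_ms_for_10x_rule": election_ms / 10,
    }


budget = timer_budget(
    DEFAULTS_MS["ETCD_HEARTBEAT_INTERVAL"],
    DEFAULTS_MS["ETCD_ELECTION_TIMEOUT"],
)
print(budget)
```

Under that assumption, the defaults leave room for a peer round trip of about 100 ms, which degraded disks or congested links between control plane nodes can plausibly exceed.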
To address this, this RFE asks for etcd to auto-tune these parameters based on the actual conditions observed through etcd metrics, taking into account disk latency, network latency and packet loss, and other factors considered best practice for properly tuning etcd.
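One possible shape for such auto-tuning is sketched below. It is only an illustration, not a proposed implementation: the names `Observed`, `Timers`, and `suggest_timers` are hypothetical, the 50 ms rounding step is arbitrary, and treating WAL fsync latency on par with network round-trip time is an assumption. The inputs could plausibly be derived from etcd metrics such as `etcd_network_peer_round_trip_time_seconds` and `etcd_disk_wal_fsync_duration_seconds`. The floors are the 4.10 defaults quoted in this request, and the 50000 ms ceiling is the upstream etcd upper limit for the election timeout.

```python
import math
from dataclasses import dataclass

HEARTBEAT_FLOOR_MS = 100      # 4.10 default quoted in this request
ELECTION_FLOOR_MS = 1000      # 4.10 default quoted in this request
ELECTION_CEILING_MS = 50000   # upstream etcd upper limit for election timeout
STEP_MS = 50                  # arbitrary rounding step (assumption)


@dataclass(frozen=True)
class Observed:
    peer_rtt_p99_ms: float    # e.g. from etcd_network_peer_round_trip_time_seconds
    wal_fsync_p99_ms: float   # e.g. from etcd_disk_wal_fsync_duration_seconds


@dataclass(frozen=True)
class Timers:
    heartbeat_ms: int
    election_ms: int
    warn: bool                # True when the suggestion deviates from the defaults


def suggest_timers(obs: Observed) -> Timers:
    """Suggest heartbeat and election timers from observed latencies."""
    # Assumption: the heartbeat should cover the slower of network and disk.
    slowest = max(obs.peer_rtt_p99_ms, obs.wal_fsync_p99_ms)
    # Round up to the next step, never going below the default.
    heartbeat = max(HEARTBEAT_FLOOR_MS, math.ceil(slowest / STEP_MS) * STEP_MS)
    # Keep election timeout at 10x the heartbeat, clamped to [floor, ceiling].
    election = min(ELECTION_CEILING_MS, max(ELECTION_FLOOR_MS, 10 * heartbeat))
    warn = heartbeat > HEARTBEAT_FLOOR_MS or election > ELECTION_FLOOR_MS
    return Timers(heartbeat, election, warn)


healthy = suggest_timers(Observed(peer_rtt_p99_ms=2, wal_fsync_p99_ms=8))
degraded = suggest_timers(Observed(peer_rtt_p99_ms=180, wal_fsync_p99_ms=40))
print(healthy)
print(degraded)
```

The `warn` flag mirrors the request's point that any deviation from the defaults should raise a corresponding warning notification rather than happen silently.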
3. Why does the customer need this? (List the business requirements here)
See above.
4. List any affected packages or components.
etcd
- relates to: OCPSTRAT-342 [etcd-operator] etcd timers selectable profiles (TechPreview) (Closed)