- Feature Request
- Resolution: Done
1. Proposed title of this feature request
Disk isolation for etcd
2. What is the nature and description of the request?
Currently, etcd uses the hostpath at /var/lib/etcd to store and read its data.
etcd is known to be sensitive to disk latency: it must persist proposals to the write-ahead log (WAL) very quickly (an unaligned write followed by an fsync) so that writes in the cluster are not blocked.
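To make the latency concern concrete, here is a minimal Python sketch that times small appended writes followed by an fsync in a given directory. It is only a rough probe, not etcd's own code path: the function names, the sample count, and the 2300-byte write size are assumptions chosen for illustration. Upstream etcd guidance is commonly cited as wanting the 99th percentile of WAL fsync duration below roughly 10 ms.

```python
import math
import os
import tempfile
import time


def measure_fsync_latencies(directory, samples=200, write_size=2300):
    """Append small writes to a scratch file and time each write+fsync pair.

    write_size is an illustrative, deliberately unaligned size (assumption).
    """
    payload = b"\0" * write_size
    latencies = []
    # The scratch file lives in the directory under test and is removed afterwards.
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync-probe-")
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage before continuing
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]
```

For example, `percentile(measure_fsync_latencies("/var/lib/etcd"), 99) * 1000` would give an approximate p99 in milliseconds for the disk backing that path, assuming the caller has write access there. A purpose-built benchmark tool will give more trustworthy numbers than this sketch.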
It would be nice if customers could use a separate, dedicated block device for this etcd hostpath mount.
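As a rough way to see whether etcd's data directory is already isolated on a node, the following Python sketch compares the filesystem device IDs of two paths. The helper names are hypothetical. Note the limitation: `st_dev` identifies a filesystem, not a physical disk, so two partitions on the same disk would still be reported as separate.

```python
import os


def nearest_existing(path):
    """Walk up from path until an existing filesystem entry is found."""
    current = os.path.abspath(path)
    while not os.path.exists(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return current


def shares_filesystem(path_a, path_b):
    """Return True when both paths live on the same filesystem (same st_dev)."""
    dev_a = os.stat(nearest_existing(path_a)).st_dev
    dev_b = os.stat(nearest_existing(path_b)).st_dev
    return dev_a == dev_b
```

Calling `shares_filesystem("/var/lib/etcd", "/var/log")` on a node would return `True` when etcd data and logs sit on the same filesystem, which is the situation this request aims to make avoidable.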
Customers often seem unhappy with the performance of etcd, and in most cases we advise them to use a better, faster disk, because the cluster sometimes outgrows the initially anticipated disk capacity in IOPS or bandwidth. Disk isolation cannot fix this kind of performance problem by itself, but it helps with Day 2 operations: a new, better device can be added for etcd alone without much hassle.
Advising faster hardware means the customer pays for more expensive and potentially larger root disks, whereas etcd needs only a tiny fraction of that space (8 GB plus WAL and headroom).
Chatty control plane components that write logs heavily also degrade disk performance.
Recently, with the introduction of OVN, which uses a similar distributed database system, etcd competes for the same resources (disk bandwidth, CPU, etc.). We have already seen much of this in SDN-2880 while enabling OVN in CI.
Another recent source of CI failures is SNO, which runs the image registry on the same disk as etcd and caused a high number of failures.
3. Why does the customer need this? (List the business requirements here)
Reliability and, more importantly, predictable performance. For RH, fewer support requests.
4. List any affected packages or components.
- Installer
- Cloud Infra
- MCO
- etcd
cc dwest@redhat.com rhn-coreos-htariq rhn-support-pducai joelspeed
cc rhn-support-dhardie and heheffne@redhat.com this might also be something nice for CFE
- is duplicated by: OCPSTRAT-1406 [internal] Investigate a default dedicated partition for etcd (Closed)
- is related to: OCPSTRAT-1592 Support for Configuring Additional Disks During OpenShift Installation - Phase I (New)