  1. OpenShift Etcd
  2. ETCD-456

Etcd Tuning Parameters


    • Type: Epic
    • Resolution: Done
    • Priority: Major
    • openshift-4.14
    • None
    • None
    • Etcd Tuning Parameters
    • False
    • None
    • False
    • Not Selected
    • To Do
    • 0% To Do, 0% In Progress, 100% Done

      Epic Goal*

      Provide a way to tune the etcd latency parameters ETCD_HEARTBEAT_INTERVAL and ETCD_ELECTION_TIMEOUT.

       
      Why is this important? (mandatory)

      OCP4 does not have a way to tune etcd parameters such as timeouts and heartbeat intervals. Adjusting these parameters indiscriminately may compromise the stability of the control plane. In scenarios where disk IOPS are not ideal (e.g. disk degradation, storage providers in cloud environments), these parameters could be adjusted to improve the stability of the control plane while raising the corresponding warning notifications.

      In the past:

      The default values on a 4.10 deployment are:
      ```
      - name: ETCD_ELECTION_TIMEOUT
        value: "1000"
      - name: ETCD_ENABLE_PPROF
        value: "true"
      - name: ETCD_EXPERIMENTAL_MAX_LEARNERS
        value: "3"
      - name: ETCD_EXPERIMENTAL_WARNING_APPLY_DURATION
        value: 200ms
      - name: ETCD_EXPERIMENTAL_WATCH_PROGRESS_NOTIFY_INTERVAL
        value: 5s
      - name: ETCD_HEARTBEAT_INTERVAL
        value: "100"
      ```
      These defaults are overridden for specific cloud providers (https://github.com/openshift/cluster-etcd-operator/blob/master/pkg/etcdenvvar/etcd_env.go#L232-L254).
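
A rough sketch of how a per-platform exception can sit on top of the defaults. This is not the code from `etcd_env.go`. The type `latencyEnv`, the function `latencyFor`, the platform name `example-slow-cloud` and its override values are all hypothetical and chosen only for illustration. Only the default values `100` and `1000` come from the 4.10 listing.

```go
package main

import "fmt"

// latencyEnv holds the two latency-related env var values as strings,
// the way they appear in the etcd pod spec. Hypothetical type.
type latencyEnv struct {
	HeartbeatInterval string // ETCD_HEARTBEAT_INTERVAL, milliseconds
	ElectionTimeout   string // ETCD_ELECTION_TIMEOUT, milliseconds
}

// defaultLatency mirrors the 4.10 defaults quoted in this epic.
var defaultLatency = latencyEnv{HeartbeatInterval: "100", ElectionTimeout: "1000"}

// platformOverrides is an illustrative table of exceptions.
// The platform name and values are assumptions, not the operator's real data.
var platformOverrides = map[string]latencyEnv{
	"example-slow-cloud": {HeartbeatInterval: "500", ElectionTimeout: "2500"},
}

// latencyFor returns the override for a platform if one exists,
// otherwise the defaults.
func latencyFor(platform string) latencyEnv {
	if o, ok := platformOverrides[platform]; ok {
		return o
	}
	return defaultLatency
}

func main() {
	for _, p := range []string{"example-slow-cloud", "bare-metal"} {
		l := latencyFor(p)
		fmt.Printf("%s: ETCD_HEARTBEAT_INTERVAL=%s ETCD_ELECTION_TIMEOUT=%s\n",
			p, l.HeartbeatInterval, l.ElectionTimeout)
	}
}
```

With a fixed table like this, an on-premise cluster with slow disks always falls through to the defaults, which is the gap this epic is about.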

      The guidance for latency among control plane nodes does not translate well to live on-premise scenarios (https://access.redhat.com/articles/3220991).

       
      Scenarios (mandatory) 

      Define an etcd-operator API that gives the cluster-admin the ability to set `ETCD_ELECTION_TIMEOUT` and `ETCD_HEARTBEAT_INTERVAL` within a certain range.
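
One possible shape for the range check, as a minimal sketch. The epic does not define the API or the range, so the type `Tuning`, its methods, and the bounds `minHeartbeatMs` and `maxElectionMs` are assumptions. The ratio check reflects the upstream etcd requirement, as far as we are aware, that the election timeout be at least five times the heartbeat interval.

```go
package main

import (
	"fmt"
	"strconv"
)

// Assumed bounds for illustration only; the epic leaves the real range open.
const (
	minHeartbeatMs              = 100   // assumption: do not go below the 4.10 default
	maxElectionMs               = 50000 // assumption: upper cap on the election timeout
	minElectionToHeartbeatRatio = 5     // election timeout >= 5x heartbeat interval
)

// Tuning is a hypothetical admin-facing pair of latency settings, in milliseconds.
type Tuning struct {
	HeartbeatIntervalMs int
	ElectionTimeoutMs   int
}

// Validate rejects values outside the assumed range or with too small a ratio.
func (t Tuning) Validate() error {
	if t.HeartbeatIntervalMs < minHeartbeatMs {
		return fmt.Errorf("heartbeat interval %dms is below the minimum of %dms",
			t.HeartbeatIntervalMs, minHeartbeatMs)
	}
	if t.ElectionTimeoutMs > maxElectionMs {
		return fmt.Errorf("election timeout %dms is above the maximum of %dms",
			t.ElectionTimeoutMs, maxElectionMs)
	}
	if t.ElectionTimeoutMs < minElectionToHeartbeatRatio*t.HeartbeatIntervalMs {
		return fmt.Errorf("election timeout %dms must be at least %dx the heartbeat interval %dms",
			t.ElectionTimeoutMs, minElectionToHeartbeatRatio, t.HeartbeatIntervalMs)
	}
	return nil
}

// EnvVars renders the settings as the env vars named in this epic.
func (t Tuning) EnvVars() map[string]string {
	return map[string]string{
		"ETCD_HEARTBEAT_INTERVAL": strconv.Itoa(t.HeartbeatIntervalMs),
		"ETCD_ELECTION_TIMEOUT":   strconv.Itoa(t.ElectionTimeoutMs),
	}
}

func main() {
	candidates := []Tuning{
		{HeartbeatIntervalMs: 100, ElectionTimeoutMs: 1000}, // the 4.10 defaults
		{HeartbeatIntervalMs: 500, ElectionTimeoutMs: 2500},
		{HeartbeatIntervalMs: 500, ElectionTimeoutMs: 1000}, // ratio too small
	}
	for _, t := range candidates {
		if err := t.Validate(); err != nil {
			fmt.Printf("rejected %+v: %v\n", t, err)
			continue
		}
		fmt.Printf("accepted %+v -> %v\n", t, t.EnvVars())
	}
}
```

Validating the two values as a pair, rather than each one separately, keeps an admin from raising the heartbeat interval without also raising the election timeout.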

       
      Dependencies (internal and external) (mandatory)

      No external teams

      Contributing Teams(and contacts) (mandatory) 

      Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.

      • Development - etcd team
      • Documentation -
      • QE - 
      • PX - 
      • Others -

      Acceptance Criteria (optional)

      Provide some (testable) examples of how we will know if we have achieved the epic goal.  

      Drawbacks or Risk (optional)

      Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.

      Done - Checklist (mandatory)

      The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.

      • CI Testing - Basic e2e automation tests are merged and completing successfully
      • Documentation - Content development is complete.
      • QE - Test scenarios are written and executed successfully.
      • Technical Enablement - Slides are complete (if requested by PLM)
      • Engineering Stories Merged
      • All associated work items with the Epic are closed
      • Epic status should be “Release Pending” 

            alray@redhat.com Allen Ray
            rhn-coreos-htariq Haseeb Tariq
            ge liu ge liu
