- Epic
- Resolution: Obsolete
- Stability - Minimize disruption from etcd leader elections
- To Do
- Quality / Stability / Reliability
Epic Goal
- Make leader election impacts less disruptive to clusters
Why is this important?
Leader elections cause disruption: while an election is in progress there is no leader, so etcd cannot serve linearizable requests.
Scenarios
- ...
Acceptance Criteria
- CI - MUST be running successfully with tests automated
- Release Technical Enablement - Provide necessary release enablement details and documents.
- ...
Dependencies (internal and external)
- N/A
Previous Work (Optional):
https://bugzilla.redhat.com/show_bug.cgi?id=1870274
Open questions:
1.) Does the leader election itself result in additional I/O latency? For example, do we replay the WAL file?
2.) Since we have just told all clients to retry, do all of those retries disrupt etcd's ability to respond in a timely manner?
3.) Do all clients actually fail and then retry? If a client chooses to simply time out, we should explore why and whether that can be changed.
4.) How long does a leader election take, on average, before we can serve requests again?
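To make questions 2 and 3 concrete: if every client retries at the same instant, the retries themselves can pile onto a freshly elected leader. One common client-side mitigation is exponential backoff with jitter. The sketch below is illustrative only and does not describe what any etcd client actually does; `TransientError`, `retry_with_jitter`, and all parameter values are hypothetical.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a 'no leader available' style failure."""


def retry_with_jitter(op, attempts=5, base=0.05, cap=2.0,
                      sleep=time.sleep, rng=random):
    """Call op(); on TransientError, wait a randomized delay and retry.

    The delay ceiling doubles on each attempt (capped at `cap` seconds),
    and the actual wait is drawn uniformly from [0, ceiling] so that many
    clients failing at once do not all retry at the same moment.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last failure
            ceiling = min(cap, base * (2 ** attempt))
            sleep(rng.uniform(0.0, ceiling))
```

Whether spreading retries this way is needed depends on the answer to question 3: clients that simply time out never reach the retry path at all.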
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Downstream documentation merged: <link to meaningful PR>
- is caused by
- OCPPLAN-7577 minimize disruption from etcd leader elections
- New