-
Epic
-
Resolution: Done
-
Critical
-
Implement OpenShift recommended leader election settings (Lease only)
-
ACM-1136 - HoH multicluster globalhub
-
0% To Do, 0% In Progress, 100% Done
Epic Goal
- All controllers will move to Lease-only leader election
- Evaluate how to provide best practices for leader election
Why is this important?
- Some ACM components update their leader election resource locks so frequently that they put heavy load on etcd.
Based on the discussion in the architecture forum meeting on Apr 20th, 2022, the OpenShift recommended leader election settings (LeaseDuration=137s, RenewDeadline=107s, RetryPeriod=26s) must be evaluated before ACM components adopt them.
- Bugzilla 2069741 Heavy load on etcd by ACM service account #21247
- https://github.com/openshift/enhancements/blob/master/CONVENTIONS.md#high-availability
- https://docs.google.com/presentation/d/1VZuHIa3Zu6djV6zc38Scar1WkwfOJfEg3-2tTiTQhYM/edit?usp=sharing
- https://github.com/stolostron/backlog/issues/21834
Scenarios
- Each Squad Lead must open a Story under the Epic to track the work for their squad, using the title "<squad name> leader election changes".
- Under each squad’s story, the Squad Lead will open a Task for each controller that needs to adopt the leader election settings.
Acceptance Criteria
- CI - MUST be running successfully with tests automated
- Release Technical Enablement - Provide necessary release enablement details and documents.
- ...
Dependencies (internal and external)
- We need a story or task for each squad...
Previous Work (Optional):
- Policy is done.
Open questions:
- …
Done Checklist
- CI - CI is running, tests are automated and merged.
- Release Enablement <link to Feature Enablement Presentation>
- DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
- DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
- DEV - Downstream build attached to advisory: <link to errata>
- QE - Test plans in Polarion: <link or reference to Polarion>
- QE - Automated tests merged: <link or reference to automated tests>
- DOC - Downstream documentation merged: <link to meaningful PR>