Type: Epic
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: None
Labels:
None

Epic Name:
Enable upstream flag to improve availability during defrag
Work Type:
BU Product Work
Blocked:
False
Blocked Reason:
None
Ready:
False
Color Status:
Not Selected
Epic Status:
To Do
Feature Link:
OCPSTRAT-319 - Hitless automatic defrag of etcd
Parent Link:
OCPSTRAT-319 - Hitless automatic defrag of etcd
Hierarchy Progress Bar:

80% To Do, 20% In Progress, 0% Done

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Intelligence Requested:
Market:

Epic Goal*

Enable the `--experimental-stop-grpc-service-on-defrag` flag on etcd so that client requests are not sent to an etcd member that is undergoing defragmentation.

After enabling the flag, we will want to have the perfscale team validate the etcd performance for a cluster with a large number of API requests during defragmentation.
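As a rough illustration of what "enabling the flag" could look like in code, the sketch below appends the flag to an etcd argument list only if it is not already present. The helper name `ensureBoolFlag`, the sample arguments, and the idea that the arguments are assembled as a string slice are all assumptions made for illustration; they are not taken from any OpenShift repository.

```go
package main

import (
	"fmt"
	"strings"
)

// stopGRPCOnDefragFlag is the etcd flag named in this epic.
const stopGRPCOnDefragFlag = "--experimental-stop-grpc-service-on-defrag"

// ensureBoolFlag returns args with flag set to true. It appends the flag
// only when no form of it (bare or "flag=value") is already present, so an
// explicit existing setting is left untouched.
// Hypothetical helper for illustration only.
func ensureBoolFlag(args []string, flag string) []string {
	for _, a := range args {
		if a == flag || strings.HasPrefix(a, flag+"=") {
			return args
		}
	}
	// Copy before appending so the caller's slice is never mutated.
	out := make([]string, 0, len(args)+1)
	out = append(out, args...)
	return append(out, flag+"=true")
}

func main() {
	// Sample arguments are placeholders, not a real member configuration.
	args := []string{"etcd", "--name=member-0"}
	fmt.Println(ensureBoolFlag(args, stopGRPCOnDefragFlag))
}
```

Making the helper idempotent means it can run on every configuration render without duplicating the flag.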

Why is this important? (mandatory)

See https://issues.redhat.com/browse/OCPSTRAT-319 for background, but generally speaking this can help improve API availability on large clusters where a leader undergoing defragmentation won't serve client requests.

Also see: https://github.com/kubernetes/kubernetes/issues/93280

Scenarios (mandatory)

The `--experimental-stop-grpc-service-on-defrag` flag should be enabled on all etcd members running in the cluster.
There should not be a degradation in API availability during defragmentation of etcd members.
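One possible way to make the second scenario testable is to compare the API request success rate outside and during defragmentation windows. The sketch below is only an assumption about how such a check might be expressed: the `probe` type, the `degraded` function, and the tolerance value are hypothetical, and the actual measurement method and thresholds would come from the perfscale team.

```go
package main

import (
	"errors"
	"fmt"
)

// probe is the outcome of a single API request (hypothetical record type).
type probe struct {
	duringDefrag bool // true if the request was issued while a member was defragmenting
	ok           bool // true if the request succeeded
}

// availability returns the fraction of successful probes in the selected
// window (during defrag or baseline). It errors if the window has no probes.
func availability(probes []probe, duringDefrag bool) (float64, error) {
	total, succeeded := 0, 0
	for _, p := range probes {
		if p.duringDefrag != duringDefrag {
			continue
		}
		total++
		if p.ok {
			succeeded++
		}
	}
	if total == 0 {
		return 0, errors.New("no probes recorded for the requested window")
	}
	return float64(succeeded) / float64(total), nil
}

// degraded reports whether availability during defrag is lower than the
// baseline by more than tolerance (an assumed, caller-chosen threshold).
func degraded(probes []probe, tolerance float64) (bool, error) {
	baseline, err := availability(probes, false)
	if err != nil {
		return false, err
	}
	during, err := availability(probes, true)
	if err != nil {
		return false, err
	}
	return baseline-during > tolerance, nil
}

func main() {
	// Made-up sample data, not measurements from a real cluster.
	probes := []probe{
		{duringDefrag: false, ok: true},
		{duringDefrag: false, ok: true},
		{duringDefrag: true, ok: true},
		{duringDefrag: true, ok: false},
	}
	d, err := degraded(probes, 0.01)
	fmt.Println(d, err)
}
```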

Depending on the perfscale results, we may want to consider first enabling this flag by default on techpreview clusters.

Dependencies (internal and external) (mandatory)

We will need to coordinate a story with the perfscale team to validate etcd performance with this flag on.

Contributing Teams (and contacts) (mandatory)

Our expectation is that teams would modify the list below to fit the epic. Some epics may not need all the default groups but what is included here should accurately reflect who will be involved in delivering the epic.

Development - etcd team
Documentation - etcd docs
QE - etcd QE
PX -
Others -

Acceptance Criteria (optional)

Provide some (testable) examples of how we will know if we have achieved the epic goal.

Drawbacks or Risk (optional)

Reasons we should consider NOT doing this such as: limited audience for the feature, feature will be superseded by other work that is planned, resulting feature will introduce substantial administrative complexity or user confusion, etc.

Done - Checklist (mandatory)

The following points apply to all epics and are what the OpenShift team believes are the minimum set of criteria that epics should meet for us to consider them potentially shippable. We request that epic owners modify this list to reflect the work to be completed in order to produce something that is potentially shippable.

CI Testing - Basic e2e automation tests are merged and completing successfully
Documentation - Content development is complete.
QE - Test scenarios are written and executed successfully.
Technical Enablement - Slides are complete (if requested by PLM)
Engineering Stories Merged
All associated work items with the Epic are closed
Epic status should be “Release Pending”

is blocked by

ETCD-616 Rebase openshift/etcd to 3.5.14

Closed

links to

Upstream Client Side Changes

Assignee: Mustafa Elbehery

Reporter: Haseeb Tariq

Votes: 0

Watchers: 4

Created: 2024/06/05 5:55 AM

Updated: 2025/03/03 4:26 PM
