Type: Bug
Resolution: Done-Errata
Priority: Undefined
Fix Version/s: 4.19.0
Affects Version/s: 4.18.0, 4.19.0
Component/s: kube-apiserver
Labels:
- rits-work

Activity Type:
Quality / Stability / Reliability
Blocked:
False
Blocked Reason:
None
Story Points:
None
Severity:
None
Regression:
Yes

Target Backport Versions:

4.17.z, 4.16.z, 4.18.0, 4.18.z
Target Version:

4.19.0
Release Blocker:
Approved
Sprint:
None

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Release Note Status:
In Progress
Release Note Type:
Release Note Not Required
Release Note Text:
N/A

Escape Reason:
None
Escape Impact:
None
Corrective Measures:
None
SDLC stage when should've been found:
None

Description of problem:

Our carry patch, which is intended to retry retriable requests that fail due to a leader change, retries any etcd error with code "Unavailable": https://github.com/openshift/kubernetes/blob/4b2db1ec33faa3ffc305e5ffa7376908cc955370/staging/src/k8s.io/apiserver/pkg/storage/etcd3/etcd3retry/retry_etcdclient.go#L135-L145. That code covers reasons like "timeout", and the patch does not distinguish between writes and reads. As a result, a "timeout" error on a write request might be retried, even though a "timeout" observed by a client does not indicate that the write was not persisted.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

    1.
    2.
    3.

Actual results:

Expected results:

Additional info:

is cloned by

OCPBUGS-49841 [release-4.18] Etcd client can unsafely retry timeouts on mutating requests

Closed

OCPBUGS-49844 [openshift-apiserver] Etcd client can unsafely retry timeouts on mutating requests

Closed

OCPBUGS-49845 [oauth-apiserver] Etcd client can unsafely retry timeouts on mutating requests

Closed

is depended on by

OCPBUGS-49841 [release-4.18] Etcd client can unsafely retry timeouts on mutating requests

Closed

is related to

SRVKP-10877 FBC on push build keep running one and a half hours and can't finish

Closed

links to

openshift/kubernetes#2191: OCPBUGS-48694: Don't retry etcd write errors.

RHEA-2024:11038 OpenShift Container Platform 4.19.z bug fix update


Assignee: Ben Luddy

Reporter: Ben Luddy

QA Contact: Ke Wang

Need Info From: None

Votes: 0

Watchers: 11

Created: 2025/01/21 7:00 PM

Updated: 2026/02/17 11:51 AM

Resolved: 2025/06/17 4:53 PM
