[OCPBUGS-51971] etcd compaction can become blocking when it shouldn't in 3.5.16+

Type: Bug
Resolution: Done
Priority: Undefined
Fix Version/s: None
Affects Version/s: 4.17.z, 4.18.z, 4.19.0
Component/s: Etcd
Labels:
None

Regression:
None
Blocked:
False
Blocked Reason:

None
Release Note Text:
A change that went into etcd 3.5.16 can lead to etcd compaction becoming blocking whenever it takes more than 10ms to process a batch. We are correcting that regression in this update.
Release Note Type:
Bug Fix
Release Note Status:
Proposed
Target Version:

4.18.z
Target Backport Versions:

4.17.z

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

Whenever an etcd compaction batch takes more than 10ms to process, compaction starts blocking requests. The upstream PR provides more detail:

https://github.com/etcd-io/etcd/pull/19405

While investigating the kube-apiserver request latency spike in 4.19, we determined that a RHEL 9.6 based host OS appears to yield a 10-20x increase in compaction duration. With this PR, the impact on apiserver request latency is reduced by approximately 50%, yet compaction duration remains high. We are pursuing that separately with the kernel team.

Given that 4.17 and later are on etcd 3.5.16+, it would be good to backport this fix as far back as 4.17.

blocks

OCPBUGS-53447 etcd compaction can become blocking when it shouldn't in 3.5.16+

Closed

clones

OCPBUGS-51838 etcd compaction can become blocking when it shouldn't in 3.5.16+

Closed

depends on

OCPBUGS-51838 etcd compaction can become blocking when it shouldn't in 3.5.16+

Closed

is cloned by

OCPBUGS-53447 etcd compaction can become blocking when it shouldn't in 3.5.16+

Closed

links to

openshift/etcd#312: DOWNSTREAM: <carry>: OCPBUGS-51971: fix a compaction induce latency issue

RHBA-2025:2705 OpenShift Container Platform 4.18.z bug fix update

Errata Tool added a comment - 2025/03/18 2:17 AM

Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

For information on the advisory (Important: OpenShift Container Platform 4.18.5 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2025:2705

OpenShift Jira Bot added a comment - 2025/02/27 4:56 PM

Moved to Proposed

Assignee: Dean West

Reporter: Scott Dodson

QA Contact: Ge Liu

Votes: 0

Watchers: 5

Created: 2025/02/27 4:56 PM

Updated: 2025/03/21 1:14 PM

Resolved: 2025/03/14 8:04 AM
