Whenever an etcd compaction batch takes more than 10ms to complete, compaction starts blocking requests. The upstream PR provides more detail:
https://github.com/etcd-io/etcd/pull/19405
While investigating the kube-apiserver request latency spike in 4.19, we determined that a RHEL 9.6-based host OS seems to yield a 10-20x increase in compaction duration. With this PR, the impact on apiserver request latency is reduced by approximately 50%, yet compaction duration remains high. We are pursuing that separately with the kernel team.
Given that 4.17 and later are on etcd 3.5.16+, it would be good to backport this fix as far back as 4.17.
- blocks: OCPBUGS-53447 etcd compaction can become blocking when it shouldn't in 3.5.16+ (Closed)
- clones: OCPBUGS-51838 etcd compaction can become blocking when it shouldn't in 3.5.16+ (Closed)
- depends on: OCPBUGS-51838 etcd compaction can become blocking when it shouldn't in 3.5.16+ (Closed)
- is cloned by: OCPBUGS-53447 etcd compaction can become blocking when it shouldn't in 3.5.16+ (Closed)
- links to: RHBA-2025:2705 OpenShift Container Platform 4.18.z bug fix update