Whenever an etcd compaction batch takes more than 10ms to complete, compaction becomes request-blocking. This upstream PR provides more detail:
https://github.com/etcd-io/etcd/pull/19405
While investigating the kube-apiserver request latency spike in 4.19, we determined that a RHEL 9.6 based host OS seems to yield a 10-20x increase in compaction duration. With this PR, the impact on apiserver request latency is reduced by approximately 50%, yet compaction duration remains high. We are pursuing the remaining slowdown separately with the kernel team.
Given that 4.17 and later are on etcd 3.5.16+, it would be good to backport this fix as far back as 4.17.
- is cloned by: OCPBUGS-51971 etcd compaction can become blocking when it shouldn't in 3.5.16+ (Closed)
- is depended on by: OCPBUGS-51971 etcd compaction can become blocking when it shouldn't in 3.5.16+ (Closed)