Type: Bug
Priority: Major
Affects Version: 4.19
Impact: Quality / Stability / Reliability
Status: Rejected
Resolution: Not a Bug
Description of problem:
OCP 4.19 cluster-density-v2 podReadyLatencies regressed significantly. 99th-percentile latencies were stable at 11s and are now consistently above 15s, sometimes reaching ___. Max latencies fluctuate more but were also stable; they now sometimes reach 40s. The stats:
Before the change: p99 = 11.064s +/- 0.235s; max = 12.676s +/- 2.056s.
After the change: p99 = 15.650s +/- 0.489s; max = 20.591s +/- 6.609s.
Version-Release number of selected component (if applicable):
There are multiple change points between several versions. The Hunter change-point detection algorithm identifies two:
* the max latency change was introduced between 4.19.0-0.nightly-2025-01-27-130640 and 4.19.0-0.nightly-2025-01-28-090833
* the 99th-percentile latency change was introduced between 4.19.0-0.nightly-2025-01-28-090833 and 4.19.0-0.nightly-2025-01-30-091858
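For context, here is a minimal sketch of what a change-point scan over the per-nightly p99 series does. This is illustrative only: the `change_points` helper and its windowed comparison are assumptions for this sketch, not Hunter's actual implementation, which uses its own statistics (E-divisive means).

```python
# Minimal change-point sketch over a per-nightly latency series.
# NOTE: illustrative only -- Hunter uses E-divisive means; the
# sliding-window comparison here is an assumption for this sketch.
from statistics import mean, stdev

def change_points(series, window=5, threshold=4.0):
    """Flag builds where the windowed mean shifts sharply.

    series: list of (build_name, p99_seconds) tuples, oldest first.
    Returns build names where the before/after windows differ by
    more than `threshold` pooled standard deviations.
    """
    flagged = []
    vals = [v for _, v in series]
    for i in range(window, len(vals) - window):
        before = vals[i - window:i]
        after = vals[i:i + window]
        pooled = (stdev(before) + stdev(after)) / 2 or 1e-9
        if abs(mean(after) - mean(before)) / pooled > threshold:
            flagged.append(series[i][0])
    return flagged
```

Run against the p99 values scraped from the job history (oldest build first), a scan like this would be expected to flag the nightly builds listed above as shift points.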
How reproducible:
100%. These values have been consistently high and unstable through today (Feb 10).
Steps to Reproduce:
1. Run the payload control-plane test in Prow: `/pj-rehearse periodic-ci-openshift-qe-ocp-qe-perfscale-ci-main-aws-4.19-nightly-x86-payload-control-plane-6nodes` (or observe the current job history triggered on each nightly build)
Actual results:
Observe that cluster-density-v2 p99 is >= 15s and max is between 17s and 40s.
Expected results:
cluster-density-v2 p99 is 11s and max is 12s
Additional info:
Our expectation of an 11s 99th percentile is not a sensitive threshold. The increase from 11s to 15s indicates a significant reduction in throughput across the platform, akin to running at a higher workload density or a higher client QPS scaling rate. We should treat this as a perceptible degradation of user experience and cluster stability.
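As a rough back-of-the-envelope check using only the numbers above (the arithmetic is ours, and it assumes the +/- figures are standard deviations across runs), the p99 shift is about 19.5 baseline standard deviations, far outside run-to-run noise:

```python
# Effect size of the p99 regression, from the stats in the description.
# Assumption: the +/- values reported above are per-build std deviations.
before_mean, before_std = 11.064, 0.235
after_mean, after_std = 15.650, 0.489

shift_sigmas = (after_mean - before_mean) / before_std
print(f"p99 shift: {after_mean - before_mean:.3f}s "
      f"({shift_sigmas:.1f}x baseline std dev)")
# -> p99 shift: 4.586s (19.5x baseline std dev)
```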
Is related to:
CNV-63354 [Scale] Fusion Access Operator Storage Regression for CNV (Closed)