Bug
Resolution: Unresolved
4.20
Quality / Stability / Reliability
Description of problem:
We noticed an increase in the 95th-percentile (P95) disruption for the pod-to-host and host-to-host backends on the vSphere platform. A closer look shows that the disruption comes from a few jobs that test the host groups feature. This dashboard shows the trend: https://grafana-loki.ci.openshift.org/d/ISnBj4LVk/disruption?var-platform=vsphere&var-percentile=P95&var-backend=host-to-host-new-connections&var-releases=4.20&var-upgrade_type=none&var-networks=ovn&var-topologies=ha&var-architectures=amd64&var-lookback=3&var-master_nodes_updated=N&var-min_disruption_regression=-10&var-min_disruption_job_list=0&var-min_relevance=0&var-featureset=techpreview&orgId=1

A few jobs show disruption. One example job: periodic-ci-openshift-release-master-nightly-4.20-e2e-vsphere-host-groups-ovn-techpreview #1965131620608380928

We started a Slack thread here: https://redhat-internal.slack.com/archives/C015H2WDJRY/p1757438092386989

In the thread, we learned that host groups is a layered deployment in which low performance is expected. It was suggested that the disruption test be disabled in this setup. This card was created to keep track of the issue.

From whom should we get permission to disable the disruption test in this case?

It is worth noting that most of the jobs do not show the same disruption.
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. 2. 3.
Actual results:
Expected results:
Additional info: