- Bug
- Resolution: Won't Do
- Major
- None
- 4.12
- Important
- No
- SDN Sprint 250
- 1
- False
- 18-March: the remaining customer support case is waiting for verification that, after the Nutanix upgrade, the slow response-time issues do not recur. With a customer ack or no response, this bug can be closed.
- involves Nutanix CSI Operator; can close
Description of problem:
Customer is using OpenShiftSDN on 3 clusters installed on Nutanix. They are seeing low pod-to-pod bandwidth on the cluster network when the nodes hosting the pods are running on 2 different virtualization hosts. They are using the regular MTU (1500 on the NIC, 1450 on the cluster network; the 50-byte difference accounts for OpenShiftSDN's VXLAN encapsulation overhead). We measured the bandwidth with iperf3, using DaemonSets running an iperf3 image placed on both the cluster network and the host network (a sketch of such a DaemonSet follows below). Even though the bandwidth measured on the host network is ~24 Gbit/s, the bandwidth is much lower on the cluster network:
- We were able to reach ~24 Gbit/s when the pods are running on the host network on 2 different virtualization hosts.
- When the pods are on the cluster network, running on 2 different nodes on 2 different virtualization hosts, the bandwidth is 12 Mbit/s and takes a couple of seconds to ramp up.
- When the 2 pods are on the cluster network, on 2 different nodes but the same virtualization host, the bandwidth is ~6.5 Gbit/s.
- One of the clusters is a test one; we migrated it to OVN-Kubernetes and reached ~7 Gbit/s pod to pod across 2 different virtualization hosts.
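For reference, a minimal sketch of the kind of DaemonSet used for these measurements; the image reference and labels are placeholders, not the exact manifests used in the customer environment. Deploying one copy as-is and a second copy with hostNetwork: true makes it possible to compare cluster-network and host-network bandwidth between the same pair of nodes.

```yaml
# Hypothetical test DaemonSet: one iperf3 server pod per node.
# Uncomment hostNetwork to measure the host network instead of
# the cluster (SDN) network. The image name is a placeholder.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: iperf3-test
spec:
  selector:
    matchLabels:
      app: iperf3-test
  template:
    metadata:
      labels:
        app: iperf3-test
    spec:
      # hostNetwork: true   # enable for the host-network run
      containers:
      - name: iperf3
        image: quay.io/example/iperf3:latest   # placeholder image
        command: ["iperf3", "-s"]              # run as iperf3 server
        ports:
        - containerPort: 5201
```

The client side is then run from a pod on a node on the other virtualization host, e.g. `iperf3 -c <server-pod-IP> -t 30`, once against the cluster-network pod IP and once against the host-network pod IP.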
Version-Release number of selected component (if applicable):
OCP 4.12.35
How reproducible:
Customer has 3 clusters affected; the issue is exactly the same on all of them.
Steps to Reproduce:
Only in customer environment so far.
Actual results:
The cluster network shows slow pod-to-pod performance.
Expected results:
We expect the cluster-network bandwidth to be reasonably close to the bandwidth available at the host network level.
Additional info: