Type: Bug
Resolution: Won't Do
Priority: Major
Fix Version/s: None
Affects Version/s: 4.12
Component/s: Networking / openshift-sdn
Labels:

Activity Type: Quality / Stability / Reliability
Blocked: False
Blocked Reason: None
Story Points: None
Severity: Important
Regression: No
Latest Status Summary:
18 March: the remaining customer support case is waiting for verification that the slow response times do not recur after the Nutanix upgrade. With a customer acknowledgement, or if there is no response, this bug can be closed.

Target Backport Versions: None
Target Version: None
Release Blocker: None
Sprint: SDN Sprint 250
sprint_count: 1

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

PX Review Complete:
PX Priority Data:
PX Impact Score:
PX Technical Impact:
PX Impact Range:

Release Note Status: None
Release Note Type: None
Release Note Text: None

Escape Reason: None
Escape Impact: None
Corrective Measures: None
SDLC stage when should've been found: None

Description of problem:

The customer runs OpenShift SDN on 3 clusters installed on Nutanix. Pod-to-pod bandwidth on the cluster network is low when the nodes hosting the pods run on 2 different virtualization hosts. They use the regular MTU (1500 on the NIC, 1450 on the cluster network). We measured the bandwidth with iperf3.
Bandwidth measured on the host network is ~24 Gbit/s, but it is lower on the cluster network. This was tested with daemonsets running an iperf3 image, with pods on both the cluster network and the host network.

- With the pods on the host network, on 2 different virtualization hosts, we reached ~24 Gbit/s.
- With the pods on the cluster network, on 2 different nodes on 2 different virtualization hosts, the bandwidth is 12 Mbit/s and takes a couple of seconds to ramp up.
- With the 2 pods on the cluster network, on 2 different nodes on the same virtualization host, the bandwidth is ~6.5 Gbit/s.
- One of the clusters is a test cluster; after migrating it to OVN-Kubernetes we reached ~7 Gbit/s pod to pod across 2 different virtualization hosts.
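
To put the figures above side by side, the following is a minimal sketch that expresses each cluster-network measurement as a percentage of the ~24 Gbit/s host-network baseline. The numbers are the approximate values reported in this issue; the scenario labels and the function names (`percent_of_baseline`, `summarize`) are illustrative and not part of any tool used in the investigation.

```python
# Approximate figures reported in this issue, in bits per second.
BASELINE_HOST_NETWORK = 24e9  # host network, 2 different virtualization hosts

# Scenario labels are illustrative; values come from the list above.
MEASUREMENTS = {
    "openshift-sdn, different virtualization hosts": 12e6,
    "openshift-sdn, same virtualization host": 6.5e9,
    "ovn-kubernetes, different virtualization hosts": 7e9,
}


def percent_of_baseline(bps, baseline=BASELINE_HOST_NETWORK):
    """Return a throughput as a percentage of the baseline throughput."""
    return 100.0 * bps / baseline


def summarize():
    """Map each scenario to its share of the host-network baseline (2 decimals)."""
    return {
        name: round(percent_of_baseline(bps), 2)
        for name, bps in MEASUREMENTS.items()
    }


if __name__ == "__main__":
    for name, pct in summarize().items():
        print(f"{name}: {pct}% of host-network bandwidth")
```

On these figures, the cross-host OpenShift SDN case is the outlier: the other two cluster-network scenarios reach a similar share of the baseline, while the cross-host SDN case is several hundred times slower than either.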

Version-Release number of selected component (if applicable):

OCP 4.12.35

How reproducible:

The customer has 3 affected clusters; the issue is exactly the same on all of them.

Steps to Reproduce:

Only in customer environment so far.

Actual results:

The cluster network performs slowly.

Expected results:

We expect the cluster network not to be this slow compared to the bandwidth available at the host network level.

Additional info:

Assignee: Ben Pickard

Reporter: Francesco Cristini

Need Info From: None

Contributors: Chris Fields

QA Contact: Zhanqi Zhao

Doc Contact: None

Votes: 0

Watchers: 4

Created: 2024/02/06 10:13 AM

Updated: 2025/09/13 6:09 PM

Resolved: 2024/03/27 2:53 PM
