OpenShift Bugs / OCPBUGS-21834

higher latency in pod to pod dataplane testing with OVN-IC


      Description of problem:

      We are running dataplane tests with the netperf tool to measure pod-to-pod latency. We observed a latency of ~100 usec with OVN on 4.13 vs ~190 usec with OVN-IC on 4.14. The client and server pods are hosted on different worker nodes in the same AWS zone; a manual spot-check of the measurement is sketched below.
      More results are available here: https://docs.google.com/spreadsheets/d/1TEx29Vn2L20bYp7sXgW1tVA7NpK16rKxLOjpnBLr0qo/edit?usp=sharing
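      The numbers above come from the benchmark harness; as a hedged manual spot-check (not the harness's exact invocation), netperf's TCP_RR test reports round-trip latency in usec between two pods. The pod names and namespace below are hypothetical placeholders:

          # Assumes a client pod with a netperf binary and a server pod running
          # netserver, scheduled on different worker nodes (names are hypothetical).
          SERVER_IP=$(oc get pod netperf-server -n netperf -o jsonpath='{.status.podIP}')
          oc exec -n netperf netperf-client -- \
            netperf -H "$SERVER_IP" -t TCP_RR -l 30 -- \
              -o MIN_LATENCY,MEAN_LATENCY,P99_LATENCY   # latencies are reported in usec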
      

      Version-Release number of selected component (if applicable):

      OCP 4.14 4.14.0-0.nightly-2023-10-06-234925
      OCP 4.13 4.13.0-0.nightly-2023-10-08-090900
      

      How reproducible:

      Always

      Steps to Reproduce:

      1. Deploy a self-managed AWS cluster (2 worker nodes are sufficient)
      2. Run netperf using the perf team's e2e script (a consolidated copy of these commands is sketched after this list)
      a) git clone https://github.com/cloud-bulldozer/e2e-benchmarking
      b) cd e2e-benchmarking/workloads/network-perf-v2/
      c) workload=full-run.yaml ./run.sh
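      As a sketch, the same steps as one copy-pasteable block; it assumes oc is already logged in to the cluster under test, and it must be repeated once against the 4.13 (OVN) cluster and once against the 4.14 (OVN-IC) cluster to reproduce the delta:

          # Run once per cluster (4.13 OVN and 4.14 OVN-IC) while logged in with oc.
          git clone https://github.com/cloud-bulldozer/e2e-benchmarking
          cd e2e-benchmarking/workloads/network-perf-v2/
          workload=full-run.yaml ./run.sh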
       

      Actual results:

      Roughly double the latency is seen with OVN-IC on 4.14 (~190 usec vs ~100 usec with OVN on 4.13).

      Expected results:

      A deviation of up to 10% from the 4.13 OVN latency baseline is acceptable.

      Additional info:

      The same configuration is used in both environments, i.e.
      1. worker count - 24 (2 workers are sufficient for this test; the test makes sure the client and server pods are scheduled on different worker nodes before it runs the uperf tool - a placement check is sketched below)
      2. master and worker instance types - m5.2xlarge
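      When reproducing manually, the placement can be double-checked before trusting the numbers; the namespace below is a hypothetical placeholder for wherever the benchmark pods land:

          # The NODE column should show two different worker nodes for the
          # client and server pods (namespace name is an assumption).
          oc get pods -n netperf -o wide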
