  1. Fast Datapath Product
  2. FDP-1252

[Investigation spike] Localnet attachment type produces significantly higher TCP retransmits


      Given the issue described in FDP-1232,

      When an engineer investigates and analyzes the logs,

      Then a comment should be added to this Jira ticket summarizing the root cause of these errors and potential solutions for the problem.

      Definition of Done

      • Spend up to 2 hours investigating and analyzing the provided logs
      • Write a comment in this ticket summarizing the root cause of the issue and suggesting a fix or follow-up actions
    • rhel-net-ovs-dpdk

      This ticket is created automatically to track an investigation spike related to the issue: FDP-1232.

      The goal is to identify the root cause or clarify unknowns and evaluate possible solutions or workarounds for the problem described below.

      Description of problem:

      When measuring VM-to-VM TCP stream throughput, the localnet attachment type achieves the expected throughput but shows significantly higher TCP retransmit values than other net types.

      Note that this behavior is not specific to localnet on br-ex; it is also observed when the attachment is on another OVS bridge.

      Version-Release number of selected component (if applicable):

       4.18.1 

      How reproducible:

      Reproduced over multiple runs on a compact cluster; still to be confirmed on a full worker cluster.

      Steps to Reproduce:

      Full details in this perf report:
      https://docs.google.com/document/d/1pDOjROxCqrfNhE2nAkeJaRen3RtfixeJdt82ae_aW20/edit?usp=sharing

      Actual results:

       TCP retransmit values for localnet are ~100K-400K for the test case.

      Expected results:

       TCP retransmit values should be roughly similar to those of other net types (up to ~40K for the test case).
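
To compare retransmit counts between attachment types on equal terms, one option is to snapshot the kernel TCP counters on the sender VM before and after each run and take the difference. The sketch below is an assumption about how this could be done, not the method used in the perf report: it reads the `RetransSegs` and `OutSegs` counters from `/proc/net/snmp`, and the function names are hypothetical.

```python
def parse_tcp_counters(snmp_text):
    """Parse the 'Tcp:' header/value line pair of /proc/net/snmp-style text into a dict."""
    tcp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Tcp:")]
    if len(tcp_lines) != 2:
        raise ValueError("expected one Tcp header line and one Tcp value line")
    header, values = tcp_lines
    # Skip the leading "Tcp:" token on both lines and pair names with values.
    return {name: int(value) for name, value in zip(header[1:], values[1:])}


def retransmit_delta(before_text, after_text):
    """Return (retransmitted segments, sent segments, retransmit ratio) between two snapshots."""
    before = parse_tcp_counters(before_text)
    after = parse_tcp_counters(after_text)
    retrans = after["RetransSegs"] - before["RetransSegs"]
    sent = after["OutSegs"] - before["OutSegs"]
    ratio = retrans / sent if sent else 0.0
    return retrans, sent, ratio


def read_snmp(path="/proc/net/snmp"):
    """Read the current counter snapshot (Linux only)."""
    with open(path) as f:
        return f.read()
```

Usage would be to call `read_snmp()` on the sender VM immediately before and after the TCP stream test and pass both snapshots to `retransmit_delta`. Reporting the ratio alongside the raw count makes runs with different throughput easier to compare.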

      Additional info:

      Affected Platforms: baremetal
      
      Contact jhopper@redhat.com for direct cluster access. Otherwise, I will continue to debug and post updates here.
      
       * Traffic path: VM to VM

              ovsdpdk-triage ovsdpdk triage
              rh-ee-sfaye Stanislas Faye