FDP-2677
QE verification: Maximum queue size (rx/tx) limitation should be removed

      ( ) The bug has been reproduced and verified by QE members
      ( ) Test coverage has been added to downstream CI
      ( ) For new features, failed test plans have bugs added as children to the epic
      ( ) The bug is cloned to any relevant release that we support and/or is needed


      This ticket is tracking the QE verification effort for the solution to the problem described below.

       Problem Description: Clearly explain the issue.

      The customer reported that the maximum queue size ( `rxq_desc` ) on physical NICs used by OVS-DPDK is limited to 4096 descriptors in OVS 3.3. Upstream Open vSwitch, however, has removed this limit on the maximum queue size.

      Details can be found in the following upstream commit:
      https://github.com/openvswitch/ovs/commit/60906037033fd87913046b74b9f1962a74c04a7f
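
      The queue sizes in question are the per-interface descriptor counts requested through the OVS database. A minimal sketch of the configuration involved, assuming a DPDK physical port named dpdk0 (hypothetical name):

          # Request larger RX/TX descriptor rings on a DPDK port
          # (the port name dpdk0 is hypothetical).
          # With the 4096 cap in place, larger values are clamped; with the
          # upstream change, the effective limit should instead come from the
          # NIC/PMD capabilities.
          ovs-vsctl set Interface dpdk0 options:n_rxq_desc=8192
          ovs-vsctl set Interface dpdk0 options:n_txq_desc=8192

          # Confirm what was requested in the database.
          ovs-vsctl get Interface dpdk0 options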

       Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).

      Degraded performance and packet loss occur because the RX queue size cannot be increased beyond 4096.

      This issue directly impacts tenant workloads running on the customer's OpenStack environment. 

       Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).

      • OpenStack 17.1.x
      • OVS 3.3 (see the version check sketched after this list)
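
      If helpful, the exact daemon and package versions can be confirmed on the compute node; a sketch (package naming is assumed to follow the FDP example given above):

          # Version reported by OVS itself.
          ovs-vsctl --version

          # Installed openvswitch packages (name pattern assumed).
          rpm -qa | grep -i openvswitch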

          Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).

      New issue / feature limitation (not a regression)

       Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.

      The issue is consistently reproducible.

       Reproduction Steps: Provide detailed steps or scripts to replicate the issue.

      NA

       Expected Behavior: Describe what should happen under normal circumstances.

      When the RX queue size is increased or set to unlimited, packet drops (`rx_missed_errors`) should be reduced or eliminated under the same workload.

      The actual maximum queue size depends on the PMD driver.
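
      One way to observe these drop counters before and after the change, reusing the hypothetical dpdk0 port (which counters appear in the statistics map depends on the PMD/driver):

          # Per-interface statistics map maintained by OVS; error/missed
          # counters are included when the PMD exposes them.
          ovs-vsctl get Interface dpdk0 statistics

          # Take two snapshots under the same load and compare: the missed/error
          # counters should stop growing once larger rings are accepted.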

       Observed Behavior: Explain what actually happens.

      • `rx_missed_errors` continuously increases, matching the value of `rx_errors`.
      • Packet loss is visible to tenant traffic even when PMD threads are not fully utilized (see the checks sketched after this list).
      • Increasing the RX ring buffer to 8192 does not change the behavior because OVS limits the queue size to 4096.
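
      The PMD-utilization observation can be cross-checked with the standard datapath commands; a sketch (output format varies between OVS versions):

          # Per-PMD cycle statistics (idle vs. processing).
          ovs-appctl dpif-netdev/pmd-stats-show

          # Rx queue to PMD core assignment and per-queue usage.
          ovs-appctl dpif-netdev/pmd-rxq-show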

         Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.

      • Increased the RX ring buffer to 8192, but the effective queue size was capped at 4096 (see the verification sketch after this list).
      • Cross-checked with the upstream commit "netdev-dpdk: Remove limit on maximum descriptors count", which indicates this limitation has already been lifted upstream.
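
      For the QE verification itself, one possible check, again using the hypothetical dpdk0 port (exact log wording differs between OVS versions):

          # Requested descriptor count in the database.
          ovs-vsctl get Interface dpdk0 options:n_rxq_desc

          # With the fix, values above 4096 should no longer be clamped; the
          # vswitchd log records how the port was (re)configured.
          grep -i dpdk0 /var/log/openvswitch/ovs-vswitchd.log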

         Logs: If you collected logs, please provide them (e.g., sos report, /var/log/openvswitch/*, testpmd console).

      NA
