-
Sub-task
-
Resolution: Done-Errata
openvswitch3.1-3.1.0-147.el9fdp
-
rhel-9
-
rhel-sst-network-fastdatapath-ovsdpdk
-
-
-
ssg_networking
-
OVS/DPDK - FDP-25.B
-
1
Problem Description: Clearly explain the issue.
OVS auto load balance evaluates variance improvement for the distribution of datapath processing cycles across the available pmd cores.
If a threshold is met, a rebalance will be triggered.
To help with debugging, debug logs display the current and estimated variance for each NUMA node, e.g.
dpif_netdev|DBG|Numa node 1. Current variance 1000 Estimated variance 0. Variance improvement 100%.
However, the improvement variable is not reset between NUMA nodes. If a subsequently evaluated NUMA node shows no improvement, the variable is not updated and keeps the value from the previously evaluated NUMA node that had a non-zero improvement, e.g.
2025-02-11T16:57:58Z|00399|dpif_netdev|DBG|Numa node 1. Current variance 1000 Estimated variance 0. Variance improvement 100%.
2025-02-11T16:58:39Z|00400|dpif_netdev|DBG|Numa node 0. Current variance 0 Estimated variance 0. Variance improvement 100%. <--- incorrect log: variance improvement should be 0%
This is a debug log issue only, not a functional issue: the evaluation against the rebalance threshold is still done correctly, because the carried-over value was already evaluated for the initial NUMA node that produced it.
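The stale-variable behaviour described above can be modelled in a few lines. This is a simplified Python sketch, not the OVS source: the function name `dry_run_logs`, the `reset_per_node` switch, and the integer percentage formula are assumptions chosen to match the log values shown in this report.

```python
def dry_run_logs(numa_variances, reset_per_node):
    """Build debug log lines for (numa_id, current_var, estimated_var) tuples.

    reset_per_node=False models the reported behaviour (value carried over);
    reset_per_node=True models the tested fix (value cleared for each node).
    """
    logs = []
    improvement = 0  # initialised once, before the loop
    for numa_id, current_var, estimate_var in numa_variances:
        if reset_per_node:
            improvement = 0  # the fix: start every NUMA node from zero
        if estimate_var < current_var:
            # Assumed formula: integer percentage reduction in variance.
            improvement = (current_var - estimate_var) * 100 // current_var
        logs.append(
            f"Numa node {numa_id}. Current variance {current_var} "
            f"Estimated variance {estimate_var}. "
            f"Variance improvement {improvement}%."
        )
    return logs


# Same evaluation order and values as the logs in this report.
nodes = [(1, 1000, 0), (0, 0, 0)]
buggy = dry_run_logs(nodes, reset_per_node=False)
fixed = dry_run_logs(nodes, reset_per_node=True)
```

With these inputs, the second entry of `buggy` reports the 100% carried over from Numa node 1, while the second entry of `fixed` reports 0%.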
Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).
This only impacts debug logs when debugging OVS auto load balance issues.
Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).
All current supported versions contain this code.
Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).
New issue, reported while the OSP team was debugging some throughput and ALB tests.
Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.
Consistently reproducible with the right traffic profile and core layout, such that the first evaluated NUMA node has a non-zero variance improvement while the second evaluated NUMA node has a zero variance improvement.
Reproduction Steps: Provide detailed steps or scripts to replicate the issue.
Set traffic to have variance on one NUMA node and none on the other, for example:
pmd thread numa_id 0 core_id 8:
  isolated : false
  port: dpdk2 queue-id: 0 (enabled) pmd usage: 0 %
  port: dpdk3 queue-id: 1 (enabled) pmd usage: 0 %
  overhead: 0 %
pmd thread numa_id 1 core_id 9:
  isolated : false
  port: myport queue-id: 0 (enabled) pmd usage: 20 %
  port: urport queue-id: 1 (enabled) pmd usage: 20 %
  overhead: 5 %
pmd thread numa_id 0 core_id 10:
  isolated : false
  port: dpdk2 queue-id: 1 (enabled) pmd usage: 0 %
  port: dpdk3 queue-id: 0 (enabled) pmd usage: 0 %
  overhead: 0 %
pmd thread numa_id 1 core_id 11:
  isolated : false
  port: myport queue-id: 1 (enabled) pmd usage: 49 %
  port: urport queue-id: 0 (enabled) pmd usage: 46 %
There is variance on Numa 1 and no variance on Numa 0.
Enable debug and auto load balance
ovs-appctl vlog/set dpif_netdev:dbg
ovs-vsctl set Open_vSwitch . other_config:pmd-auto-lb="true"
Wait 1 minute for the dry-run debug logs. Observe that the log for NUMA 0 also reports the NUMA 1 variance improvement.
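To check that a traffic profile has the required shape (variance on one NUMA node, none on the other), the per-core loads from the listing in the steps above can be compared with a quick calculation. This is an illustrative sketch only: it sums the rxq "pmd usage" percentages per core, leaves out the overhead values, and uses a plain population variance, which is an assumption and may differ from how OVS computes its variance figures.

```python
def variance(loads):
    """Population variance of a list of per-core loads."""
    mean = sum(loads) / len(loads)
    return sum((x - mean) ** 2 for x in loads) / len(loads)


# Per-core rxq usage (%) summed from the listing above; overhead is left out.
numa_loads = {
    0: [0 + 0, 0 + 0],      # cores 8 and 10
    1: [20 + 20, 49 + 46],  # cores 9 and 11
}

variances = {numa: variance(loads) for numa, loads in numa_loads.items()}

for numa, var in sorted(variances.items()):
    print(f"NUMA {numa}: per-core load variance {var}")
```

A zero result for NUMA 0 together with a non-zero result for NUMA 1 matches the layout needed to trigger the incorrect log.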
Expected Behavior: Describe what should happen under normal circumstances.
Logs should correctly report the variance improvement for each NUMA node.
Observed Behavior: Explain what actually happens.
In the circumstances described above, logs do not correctly report the variance improvement for each NUMA node.
Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.
I have confirmed the issue is a loop variable not being reset. I tested a fix to reset the loop variable and it is working correctly.
Logs: If you collected logs please provide them (e.g. sos report, /var/log/openvswitch/* , testpmd console)
Instrumented with some fake variance to show the issue:
2025-02-11T16:57:58Z|00399|dpif_netdev|DBG|Numa node 1. Current variance 1000 Estimated variance 0. Variance improvement 100%.
2025-02-11T16:58:39Z|00400|dpif_netdev|DBG|Numa node 0. Current variance 0 Estimated variance 0. Variance improvement 100%.
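When scanning a larger log file for this symptom, a small filter can flag the affected lines. The sketch below is a hypothetical helper, not part of OVS: the regular expression is derived from the two log lines above, and a line is treated as suspicious when it reports a non-zero improvement although the estimated variance is not lower than the current one.

```python
import re

# Pattern derived from the dpif_netdev debug lines quoted in this report.
LOG_RE = re.compile(
    r"Numa node (?P<numa>\d+)\. Current variance (?P<cur>\d+) "
    r"Estimated variance (?P<est>\d+)\. Variance improvement (?P<imp>\d+)%\."
)


def suspicious_lines(lines):
    """Return lines whose improvement is non-zero without a lower estimate."""
    flagged = []
    for line in lines:
        match = LOG_RE.search(line)
        if match is None:
            continue  # not a variance log line
        cur = int(match["cur"])
        est = int(match["est"])
        imp = int(match["imp"])
        if est >= cur and imp != 0:
            flagged.append(line)
    return flagged


sample = [
    "2025-02-11T16:57:58Z|00399|dpif_netdev|DBG|Numa node 1. Current variance 1000 "
    "Estimated variance 0. Variance improvement 100%.",
    "2025-02-11T16:58:39Z|00400|dpif_netdev|DBG|Numa node 0. Current variance 0 "
    "Estimated variance 0. Variance improvement 100%.",
]

flagged = suspicious_lines(sample)
```

For the two sample lines, only the Numa node 0 entry is flagged.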
- is duplicated by
  FDP-1157 [OVS-3.1] OVS auto load balance incorrect debug log - Closed
  FDP-1158 [OVS-3.1] OVS auto load balance incorrect debug log - Closed
  FDP-1159 [OVS-3.1] OVS auto load balance incorrect debug log - Closed
  FDP-1160 [OVS-3.1] OVS auto load balance incorrect debug log - Closed
- links to
  RHSA-2025:146379 openvswitch3.1 security update