Type: Bug
Resolution: Done
Priority: Critical
Fix Version/s: None
Affects Version/s: rhel-9
Component/s: openvswitch3.3
Labels: None
Story Points: 8
Blocked: False
Blocked Reason: None
Ready: False
OS: rhel-9
Planning: None
AssignedTeam: rhel-net-ovs-dpdk
Intelligence Requested:
Market:
Sub-System Group: ssg_networking
Sprint: OVS/DPDK - FDP-25.E - 1, FDP-OVS/DPDK Sprint 7
sprint_count: 2
Customer Impact: Customer Escalated, Customer Facing, Customer Reported
SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Previously, in ~~FDP-1474~~ and ~~FDP-1273~~, a customer identified that a critical learn action was not updating OpenFlow rules as it should during a bandwidth test that included a VXLAN tunnel failover. The failover caused a large amount of GARP/ND traffic from the new VXLAN tunnel, but OVS continued to send traffic to the old tunnel.

Upon investigation, we found that in this situation OVS treated GARP/ND traffic the same as all other traffic. If any packets from the old tunnel happened to be processed after the failover GARP packets were received, the learn action would simply overwrite the learned flow with the out-of-date return path.
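The race described above can be illustrated with a toy model. This is only a sketch in plain shell and awk, not OVS code: the `learn` and `lookup` functions, the MAC address, and the port names `vxlan-old` and `vxlan-new` are all hypothetical. It assumes the usual OpenFlow behaviour that a flow with the same match and the same priority replaces the existing one, and that lookup picks the highest-priority match. Timeouts are not modelled.

```shell
#!/bin/sh
# Toy model of the learned-flow table: one "priority mac port" entry per line.
# This is an illustration only, not how OVS stores flows.
FLOWS=""

# learn PRIORITY MAC PORT
# Same priority + same MAC is treated as the same flow and is replaced.
learn() {
    FLOWS=$(printf '%s\n' "$FLOWS" | awk -v p="$1" -v m="$2" \
        'NF == 3 && !($1 == p && $2 == m)')
    FLOWS=$(printf '%s\n%s %s %s\n' "$FLOWS" "$1" "$2" "$3")
}

# lookup MAC: print the port of the highest-priority entry for MAC.
lookup() {
    printf '%s\n' "$FLOWS" | awk -v m="$1" \
        'BEGIN { best = -1 }
         NF == 3 && $2 == m && ($1 + 0) > best { best = $1 + 0; port = $3 }
         END { print port }'
}

MAC=fa:16:3e:00:00:01   # hypothetical MAC of the failed-over endpoint

# Scenario 1: GARP is learned like any other packet (all priority 1).
learn 1 "$MAC" vxlan-old   # traffic before the failover
learn 1 "$MAC" vxlan-new   # GARP arrives from the new tunnel
learn 1 "$MAC" vxlan-old   # late packet from the old tunnel is processed
lookup "$MAC"              # prints vxlan-old: the stale path wins

# Scenario 2: GARP-learned flows get a higher priority.
FLOWS=""
learn 1 "$MAC" vxlan-old
learn 2 "$MAC" vxlan-new   # GARP learned at priority 2
learn 1 "$MAC" vxlan-old   # late packet only rewrites the priority-1 flow
lookup "$MAC"              # prints vxlan-new
```

In the second scenario the late packet still rewrites the priority-1 entry, but the priority-2 entry created by the GARP is a separate flow and takes precedence on lookup.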

This setup is complex, so I created a reproducer environment to help elucidate the configuration. It is attached to this ticket.

Both in my reproduction environment and on the client systems, the following workaround, which causes GARP/ND traffic to create a higher-priority flow when learned, resolved the issue:

ovs-ofctl add-flow br-tun "table=10,arp,arp_tha=ff:ff:ff:ff:ff:ff,priority=2 actions=learn(table=20,hard_timeout=300,priority=2,cookie=0x4291b5d8aea40b08,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:patch-int"
ovs-ofctl add-flow br-tun "table=10,icmp6,icmp_type=134,priority=2 actions=learn(table=20,hard_timeout=300,priority=2,cookie=0x4291b5d8aea40b08,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:patch-int"
ovs-ofctl add-flow br-tun "table=10,icmp6,icmp_type=136,priority=2 actions=learn(table=20,hard_timeout=300,priority=2,cookie=0x4291b5d8aea40b08,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:patch-int"
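Since the three rules differ only in their match, they could be generated from a single template. The following is a sketch under that assumption: `build_learn_flow`, `install_workaround`, and the `DRY_RUN` variable are hypothetical names introduced here, while the flow text, cookie, and timeout are copied from the commands above. By default it only prints the commands.

```shell
#!/bin/sh
# Hypothetical helper: print the workaround flow for one match expression.
build_learn_flow() {
    printf 'table=10,%s,priority=2 actions=learn(table=20,hard_timeout=300,priority=2,cookie=0x4291b5d8aea40b08,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:OXM_OF_IN_PORT[]),output:patch-int\n' "$1"
}

# Hypothetical wrapper: emit (or apply) one rule per GARP/ND match.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 runs them.
install_workaround() {
    bridge=${1:-br-tun}
    for match in 'arp,arp_tha=ff:ff:ff:ff:ff:ff' \
                 'icmp6,icmp_type=134' \
                 'icmp6,icmp_type=136'; do
        flow=$(build_learn_flow "$match")
        if [ "${DRY_RUN:-1}" = 1 ]; then
            printf 'ovs-ofctl add-flow %s "%s"\n' "$bridge" "$flow"
        else
            ovs-ofctl add-flow "$bridge" "$flow"
        fi
    done
}

install_workaround br-tun
```

After applying the rules with `DRY_RUN=0`, `ovs-ofctl dump-flows br-tun table=10` can be used to confirm that they are present.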

Three possible solutions are:

- Modify OVS's action xlate to bump the priority during GARP/ND. This solution is effective, but it also violates the specification of what learn is supposed to do.
- Make learned flow_add / revalidation more sensitive to when packets are received, or favor keeping newer, differing flows over older ones. This will not directly resolve the issue and may just mask it.
- Modify the ML2/OVS interface to maintain GARP/ND rules along with learn actions.

Attachment: worklog11 (13 kB, 2025/07/29 2:21 PM)

is triggering: OSPRH-18938 Investigate Neutron workaround for FDP-1562 (Closed)

relates to: FDP-1273 With pmd-maxsleep set, OVS MAC tables on client OVS DPDK not updating correctly upon F5 virtual Load Balancers HA switchover (Closed)

relates to: FDP-1474 [Investigation spike] OVS MAC tables on client OVS DPDK not updating correctly upon F5 virtual Load Balancers HA switchover (Closed)

Assignee: ovsdpdk triage
Reporter: Mike Pattrick
Votes: 0
Watchers: 7
Created: 2025/07/29 2:17 PM
Updated: 2025/09/13 11:50 PM
Resolved: 2025/08/19 3:54 AM
