This issue tracks the upstream effort needed to deliver a fix for the bug described below.
Problem Description: Clearly explain the issue.
We are investigating a communication issue related to Allowed Address Pairs (AAP) in a RHOSO (Neutron + OVN) environment when using VRRP-based VIPs provided by an external network device (IPCOM).
Observed behavior:
- The external LB device (IPCOM) provides VIPs using VRRP v3 (RFC 5798), not Octavia.
- As defined in RFC 5798 8.1.2[1], ARP replies for the VIP may contain:
  - Ethernet source MAC = physical MAC
  - ARP sender MAC (arp_sha) = VRRP virtual MAC
- OVN Port Security enforces ARP validation based on permitted tuples, and ARP packets where eth.src ≠ arp.sha are dropped[2].
- As a result, ARP replies for the VIP are dropped by OVN, and connectivity to the VIP fails.
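For reference, RFC 5798 (section 7.3) derives the virtual router MAC from the VRID, so the virtual MAC 00:00:5e:00:01:0a reported in this case would correspond to IPv4 VRID 10 if the device follows that format. The sketch below illustrates the mapping; the function names are hypothetical and are not part of IPCOM, OVN, or Neutron.

```python
def vrrp_virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """Return the VRRP virtual router MAC for a VRID (RFC 5798, section 7.3).

    IPv4 uses 00-00-5E-00-01-{VRID}, IPv6 uses 00-00-5E-00-02-{VRID}.
    """
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1..255")
    family = 0x02 if ipv6 else 0x01
    return f"00:00:5e:00:{family:02x}:{vrid:02x}"


def vrid_from_mac(mac: str) -> int:
    """Extract the VRID from a VRRP virtual MAC, or raise ValueError."""
    octets = mac.lower().split(":")
    is_vrrp = (
        len(octets) == 6
        and octets[:4] == ["00", "00", "5e", "00"]
        and octets[4] in ("01", "02")
    )
    if not is_vrrp:
        raise ValueError(f"not a VRRP virtual MAC: {mac}")
    return int(octets[5], 16)
```

This can help confirm that a MAC seen in arp_sha is a VRRP virtual MAC rather than a Neutron port MAC (fa:16:3e:...).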
In the customer's environment, the tcpdump capture shows that the ARP reply for the VIP address is sent with the VM's original MAC address (fa:16:3e:04:b7:ca) as the Ethernet source, not the virtual MAC address 00:00:5e:00:01:0a, which appears only as the ARP sender MAC (is-at).
[tsaito@supportshell-1 IPCOM_VE2m_LS_primary]$ tcpdump -r tap008c7f2b-d3.pcap -nn -e | grep 192.168.10.201
reading from file tap008c7f2b-d3.pcap, link-type EN10MB (Ethernet), snapshot length 262144
04:56:26.754881 fa:16:3e:ae:fd:29 > ff:ff:ff:ff:ff:ff, ethertype ARP (0x0806), length 42: Request who-has 192.168.10.201 tell 192.168.10.116, length 28
04:56:26.755204 fa:16:3e:04:b7:ca > fa:16:3e:ae:fd:29, ethertype ARP (0x0806), length 42: Reply 192.168.10.201 is-at 00:00:5e:00:01:0a, length 28
If "arp_sha" and "dl_src" differ, OVN drops the packet.
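A capture like the one above can be screened for this mismatch without reading every line by eye. The following is a minimal sketch that assumes the `tcpdump -nn -e` text format shown in this report; the regular expression and the function name are illustrative and may need adjusting for other tcpdump versions or VLAN-tagged frames.

```python
import re

MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"

# Matches the "tcpdump -nn -e" text form of an ARP reply, as seen in this report.
ARP_REPLY = re.compile(
    rf"(?P<eth_src>{MAC}) > (?P<eth_dst>{MAC}), ethertype ARP.*?"
    rf"Reply (?P<spa>\d+\.\d+\.\d+\.\d+) is-at (?P<sha>{MAC})",
    re.IGNORECASE,
)


def find_mismatched_replies(lines):
    """Return (sender_ip, eth_src, arp_sha) for ARP replies where eth.src != arp.sha."""
    mismatches = []
    for line in lines:
        m = ARP_REPLY.search(line)
        if m and m["eth_src"].lower() != m["sha"].lower():
            mismatches.append((m["spa"], m["eth_src"].lower(), m["sha"].lower()))
    return mismatches


if __name__ == "__main__":
    import sys

    # Example: tcpdump -r tap008c7f2b-d3.pcap -nn -e | python3 this_script.py
    for spa, eth_src, sha in find_mismatched_replies(sys.stdin):
        print(f"{spa}: eth.src={eth_src} arp.sha={sha}")
```

Any reply it reports is one that the port security check described here would be expected to drop.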
Understanding:
This behavior appears to be a design mismatch between:
- RFC-compliant VRRP ARP behavior, and
- OVN Port Security assumptions (ARP sender MAC and Ethernet source MAC are identical).
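The mismatch can be illustrated with a simplified model of the check as it is described in this report. This is not OVN's actual logical flow implementation; the class, the function, and the assumption that both MACs are registered as allowed address pairs for the VIP are hypothetical and used only to show why the reply fails even when both (MAC, IP) tuples are permitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArpPacket:
    eth_src: str  # Ethernet source MAC (dl_src)
    arp_sha: str  # ARP sender hardware address
    arp_spa: str  # ARP sender protocol (IP) address


def arp_allowed(pkt: ArpPacket, permitted: set) -> bool:
    """Simplified port security model: eth.src must equal arp.sha,
    and the (MAC, IP) pair must be one of the permitted tuples."""
    if pkt.eth_src.lower() != pkt.arp_sha.lower():
        return False
    return (pkt.arp_sha.lower(), pkt.arp_spa) in permitted


# Assumed permitted tuples (port MAC and virtual MAC, both paired with the VIP).
permitted = {
    ("fa:16:3e:04:b7:ca", "192.168.10.201"),
    ("00:00:5e:00:01:0a", "192.168.10.201"),
}

# The reply seen in the capture: physical MAC on the wire, virtual MAC in arp_sha.
vrrp_reply = ArpPacket("fa:16:3e:04:b7:ca", "00:00:5e:00:01:0a", "192.168.10.201")

if __name__ == "__main__":
    print(arp_allowed(vrrp_reply, permitted))
```

In this model no allowed address pair entry can make the reply pass, because the first condition compares two fields of the same packet rather than a packet field against the configured tuples.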
Questions (workaround confirmation):
Could you please advise on the following points? The partner would like to know whether a mitigation (workaround) exists.
Q1. Is there any supported or recommended workaround in RHOSO/OVN to allow VRRP ARP replies where eth.src ≠ arp.sha?
Q2. Is adding custom OpenFlow rules manually on br-int (for example as below) ever considered acceptable or supported?
ovs-ofctl add-flow br-int 'cookie=0x2c0bb579,table=74,priority=90,arp,reg14=0x5,metadata=0x2,dl_src=fa:16:3e:04:b7:ca,arp_spa=192.168.10.201,arp_sha=00:00:5e:00:01:0a,actions=load:0->NXM_NX_REG10[12]'
Q3. From a support perspective, is VRRP (RFC 5798) interoperability with OVN Port Security considered out of scope, or is there any documented guidance?
Q4. The partner commented as follows, stating that this works in their existing environment (an OSP13 environment that received an individual ELS extension):
"this configuration is possible with the current FJcloud-O (OSP13) and appears to be a regression in OSO18"
In their OSP13 environment, they use IPCOM with Nuage (Nokia) as the network component.
IPCOM is a third-party appliance from Fujitsu and is handled as a non-Red Hat component[3]. Furthermore, IPCOM is not certified as far as I can confirm in the catalog[4].
So I think their understanding is not correct (this is not a regression). Is my understanding correct?
[1] https://datatracker.ietf.org/doc/html/rfc5798#section-8.1.2
[2] https://man7.org/linux/man-pages/man8/ovn-northd.8.html
Egress Table 12: Egress Port Security - check
[3] https://access.redhat.com/articles/third-party-software-support
[4] https://catalog.redhat.com/en/search?searchType=All
Impact Assessment: Describe the severity and impact
- The partner is a CCSP partner and is trying to build their next cloud service on RHOSO.
- They would like to use IPCOM (their LB) as an external LB.
- There is currently no workaround. If no workaround is available, this will impact their service release schedule.
Software Versions: Specify the exact versions in use (e.g.,openvswitch3.1-3.1.0-147.el8fdp).
openstack operator 18.0.9
rhoso-ovn-24.03-4.el9ost.noarch
ovn24.03-24.03.5-73.el9fdp.x86_64
Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).
new issue.
Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.
The issue always reproduces.
Reproduction Steps: Provide detailed steps or scripts to replicate the issue.
The reproduction steps are:
- Use the external LB (IPCOM) with RHOSO.
- Create a VIP.
- Set the allowed address pair based on https://access.redhat.com/solutions/6629051
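The exact commands from the linked solution are not reproduced here. As a rough sketch of the allowed address pair step, assuming the openstacksdk Python library is used and that both the port MAC and the VRRP virtual MAC are registered for the VIP, the update could look like the following. The helper names, the cloud name, and the port ID placeholder are hypothetical.

```python
def build_allowed_address_pairs(vip: str, macs: list) -> list:
    """Build the allowed_address_pairs attribute for a Neutron port:
    one entry per MAC, each paired with the VIP."""
    return [{"ip_address": vip, "mac_address": mac.lower()} for mac in macs]


def apply_pairs(conn, port_id: str, pairs: list):
    """Update a port using an openstacksdk connection (openstack.connect())."""
    return conn.network.update_port(port_id, allowed_address_pairs=pairs)


pairs = build_allowed_address_pairs(
    "192.168.10.201",
    ["fa:16:3e:04:b7:ca", "00:00:5e:00:01:0a"],
)

# Usage against a real cloud (not run here; names are placeholders):
#   import openstack
#   conn = openstack.connect(cloud="mycloud")
#   apply_pairs(conn, "<IPCOM port ID>", pairs)
```

Note that, per the behavior described in this report, registering these pairs alone does not make the VRRP ARP reply pass, since the Ethernet source MAC and arp_sha still differ.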
Expected Behavior: Describe what should happen under normal circumstances.
Ping to the VIP succeeds.
Observed Behavior: Explain what actually happens.
Ping to the VIP fails.
Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.
- No way to resolve this issue has been found. The customer needs a workaround to avoid it.
Logs: If you collected logs please provide them (e.g. sos report, /var/log/openvswitch/ , testpmd console)
- Attached the logs to https://access.redhat.com/support/cases/#/case/04331820.
- This [1] is the detailed explanation of this issue.
[1] https://access.redhat.com/support/cases/#/case/04331820/discussion?commentId=a0aHn00000a7lZYIAY
- ipcom_config.txt: IPCOM (LB) configuration
https://access.redhat.com/support/cases/#/case/04331820/discussion?attachmentId=a09Hn00004xwXOiIAM
- port_MAC_address(fa_16_3e_xx_xx_xx).zip: the partner's latest test log with packet capture. The test was done based on https://access.redhat.com/solutions/6629051.
https://access.redhat.com/support/cases/#/case/04331820/discussion?attachmentId=a09Hn00004xwXOOIA2