-
Bug
-
Resolution: Done-Errata
-
Critical
-
None
-
None
-
None
-
5
-
False
-
-
False
-
-
ovn25.03-25.03.0-45.el9fdp
-
rhel-9
-
None
-
rhel-net-ovn
-
-
-
ssg_networking
-
OVN FDP 25.D
-
1
-
Critical
-
+
Problem Description: Clearly explain the issue.
In general, if an OVN logical switch has at least one stateful ACL (action=allow-related) in any direction, or if the switch has a load balancer attached, then all traffic processed on that switch should be committed to conntrack (to allow proper handling of replies).
However, in a multi-tier ACL scenario, if the first packet of a session is processed on an ACL chain that ends with action=pass, then the session is never committed to conntrack in the zone of the ingress/egress port (depending on the ACL direction).
This breaks traffic because the replies will have ct_state=+trk+inv and the default logical switch behavior is to drop such packets.
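One way to spot this symptom on a hypervisor is to count the datapath flows that match tracked-but-invalid packets and drop them. The helper below is a minimal sketch, not part of OVN or OVS: the function name `count_inv_drops` is hypothetical, and it only assumes the `ovs-appctl dpctl/dump-flows` text format shown in the reproduction steps.

```shell
#!/bin/sh
# Hypothetical helper: read datapath flows on stdin and count those that
# match ct_state +inv and whose action is drop.
count_inv_drops() {
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    grep -E 'ct_state\([^)]*\+inv' | grep -c 'actions:drop' || true
}

# Usage on a hypervisor (needs a running ovs-vswitchd):
#   ovs-appctl dpctl/dump-flows | count_inv_drops
```

A non-zero count while replies are being lost is consistent with the missing conntrack commit described above, although other causes of invalid packets are possible.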
Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).
This breaks traffic that is forwarded on such ACL chains. In OpenShift/ovn-kubernetes this can happen if the admin configures (B)ANP (admin netpol) rules to pass specific traffic.
Some workarounds are available:
- set NB.NB_Global.use_ct_inv_match=true: this breaks hardware offload and is a global setting that can't be tuned for each switch individually
- add a very low priority ACL to the highest available tier with default action=allow-related: this is not always possible (depending on priorities) and requires CMS assistance.
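The second workaround could look roughly like the sketch below. It only prints the `ovn-nbctl` command instead of running it, so the CMS owner can review it first. The function name `catch_all_acl_cmd` is hypothetical, and the tier (3, assumed to be the highest available), the priority (0), and the always-true match `1` are assumptions that have to be checked against the ACLs already configured on the switch.

```shell
#!/bin/sh
# Sketch of the second workaround: print (do not run) an ovn-nbctl command
# that adds a catch-all allow-related ACL in the last tier.
# Assumptions: tier 3 is the highest tier, priority 0 is below every other
# ACL in that tier, and match '1' (always true) is acceptable for the CMS.
catch_all_acl_cmd() {
    ls_name=$1      # logical switch name, e.g. "ls" from the reproducer
    direction=$2    # from-lport or to-lport, same as the pass chain
    printf "ovn-nbctl --tier=3 acl-add %s %s 0 '1' allow-related\n" \
        "$ls_name" "$direction"
}

catch_all_acl_cmd ls from-lport
```

With the reproducer's switch name, this prints a single `ovn-nbctl --tier=3 acl-add ls from-lport ...` line that can be pasted or piped to a shell once it has been reviewed.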
Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).
ovn25.03-25.03.0-30.el9fdp
However, this bug is present on all currently supported OVN versions newer than 23.03.
Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).
This is a day-1 issue (present since 23.06.0, when multi-tier ACL support was added to OVN).
Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.
Yes, this can be consistently reproduced, see below.
Reproduction Steps: Provide detailed steps or scripts to replicate the issue.
# Switch with two ports.
# Allow-related ACL in the to-lport direction => all "allow" ACLs should be
# considered "allow-related" and should commit to conntrack.
ovn-nbctl ls-add ls
ovn-nbctl lsp-add ls p1
ovn-nbctl lsp-set-addresses p1 00:00:00:00:00:01
ovn-nbctl lsp-add ls p2
ovn-nbctl lsp-set-addresses p2 00:00:00:00:00:02

# High prio tier 0 rule to allow all ARP traffic.
ovn-nbctl --tier=0 acl-add ls from-lport 30 arp allow

# prio 20 acl in tier 0 to pass to tier 1 all traffic from p1 with destination 10.0.0.0/8.
ovn-nbctl --tier=0 acl-add ls from-lport 20 'inport == "p1" && ip4.dst == 10.0.0.0/8' pass

# prio 10 acl in tier 0 to drop everything else from p1.
ovn-nbctl --tier=0 acl-add ls from-lport 10 'inport == "p1"' drop

# prio 20 acl in tier 1 to drop everything from 20.0.0.0/8.
ovn-nbctl --tier=1 acl-add ls from-lport 20 'ip4.src == 20.0.0.0/8' drop

# prio 20 acl in tier 1 to pass everything from 10.0.0.0/8.
ovn-nbctl --tier=1 acl-add ls from-lport 20 'ip4.src == 10.0.0.0/8' pass

# to-lport ACL allowing all traffic from p1 to p2.
ovn-nbctl --tier=0 acl-add ls to-lport 20 'inport == "p1" && outport == "p2"' allow-related

# Drop all other traffic to p2.
ovn-nbctl --tier=0 acl-add ls to-lport 10 'outport == "p2"' drop

# Trace an ICMP echo from p1 to p2 (notice how there's no CT commit!):
ovn-trace --ct new 'inport=="p1" && eth.src==00:00:00:00:00:01 && eth.dst==00:00:00:00:00:02 && ip4.src==10.0.0.10 && ip4.dst==10.0.0.20 && ip.ttl==64 && icmp'

# Bind p1 and p2 and try to send traffic.
ovs-vsctl add-port br-int p1 -- set interface p1 type=internal
ip netns add p1
ip link set p1 netns p1
ip netns exec p1 ip link set p1 address 00:00:00:00:00:01
ip netns exec p1 ip addr add 10.0.0.10/8 dev p1
ip netns exec p1 ip link set p1 up
ovs-vsctl set Interface p1 external_ids:iface-id=p1

ovs-vsctl add-port br-int p2 -- set interface p2 type=internal
ip netns add p2
ip link set p2 netns p2
ip netns exec p2 ip link set p2 address 00:00:00:00:00:02
ip netns exec p2 ip addr add 10.0.0.20/8 dev p2
ip netns exec p2 ip link set p2 up
ovs-vsctl set Interface p2 external_ids:iface-id=p2

# Try to ping from p1 to p2:
$ ip netns exec p1 ping 10.0.0.20
<fails>

# Reply packets are dropped because there's no conntrack entry in p1's zone:
$ ovs-appctl dpctl/dump-flows | grep '+inv'
recirc_id(0x4),in_port(3),ct_state(-new-est-rel-rpl+inv+trk),ct_mark(0/0x1),eth(dst=00:00:00:00:00:00/ff:ff:00:00:00:00),eth_type(0x0800),ipv4(frag=no), packets:68, bytes:6664, used:0.337s, actions:drop

# Get p1's conntrack zone:
$ ovn-appctl ct-zone-list | grep p1
p1 3

# Confirm there's no entry in p1's zone:
$ conntrack -L | grep zone=3 -c
conntrack v1.4.7 (conntrack-tools): 89 flow entries have been shown.
0
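The last two checks in the reproduction steps use plain `grep`, which can over-match on larger setups (for example `p1` also matches `p10`, and `zone=3` also matches `zone=30`). The sketch below shows one possible stricter variant. The helper names `zone_of_port` and `count_zone_entries` are hypothetical, and the input formats are assumed to be the ones shown above: one `<port> <zone>` pair per line from `ovn-appctl ct-zone-list`, and a `zone=<id>` field in each `conntrack -L` entry.

```shell
#!/bin/sh
# Hypothetical helpers; both read command output from stdin.

# Print the conntrack zone of exactly the given port name, assuming
# "<name> <zone>" lines as printed by `ovn-appctl ct-zone-list`.
zone_of_port() {
    awk -v port="$1" '$1 == port { print $2 }'
}

# Count conntrack entries in exactly the given zone.
# -w keeps zone=3 from also matching zone=30; grep -c exits non-zero
# when the count is 0, hence the "|| true".
count_zone_entries() {
    grep -cw "zone=$1" || true
}

# Usage on the hypervisor hosting p1:
#   zone=$(ovn-appctl ct-zone-list | zone_of_port p1)
#   conntrack -L 2>/dev/null | count_zone_entries "$zone"
```

A count of 0 for p1's zone while the ping is running matches the behavior reported in this bug.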
Expected Behavior: Describe what should happen under normal circumstances.
If the last ACL action executed for a given packet is "pass" and the logical switch the packet is processed on has stateful features configured (allow-related ACLs or load balancers), then the packet's session should be committed to conntrack.
Observed Behavior: Explain what actually happens.
The original packet's session is not committed to conntrack, causing all replies on that session to be dropped.
Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.
See reproducer steps.
Logs: If you collected logs please provide them (e.g. sos report, /var/log/openvswitch/* , testpmd console)
- blocks
  - OCPBUGS-52462 [BGP Isolation Break] two UDN pod can be accessed after BGP advertised to default VRF (Verified)
- is cloned by
  - FDP-1393 CLONE [ovn25.09 fast-datapath-rhel-9] - Multi-tier ACL chains that end with action "pass" don't commit to conntrack. (Verified)
  - FDP-1394 CLONE [ovn24.03 fast-datapath-rhel-9] - Multi-tier ACL chains that end with action "pass" don't commit to conntrack. (Closed)
  - FDP-1395 CLONE [ovn24.09 fast-datapath-rhel-9] - Multi-tier ACL chains that end with action "pass" don't commit to conntrack. (Closed)
- links to
  - RHBA-2025:149334 ovn25.03 bug fix and enhancement update