-
Epic
-
Resolution: Unresolved
OVN forwards IP multicast packets towards routers with IGMP relay disabled
rhel-9
-
rhel-net-ovn
-
ssg_networking
This epic tracks all the effort needed to deliver the solution related to the bug described below.
Problem Description: Clearly explain the issue.
OVN supports IGMP snooping (learning of IP multicast group memberships) on logical switches and IGMP relay (routing IP multicast packets) on logical routers.
However, logical switches should not forward IP multicast packets towards connected OVN logical routers if the router is not configured to relay IP multicast traffic (LR.options:mcast_relay=false).
From a traffic perspective there is no direct problem if such packets are sent to routers. However, this generates unnecessary resubmits: the egress pipeline of the switch (towards each router) and parts of the ingress pipeline (on each router port) are executed for packets that could be dropped earlier, in the ingress pipeline of the switch. If the switch is connected to a large number of routers, such packets can hit the max resubmit limit (4K) and unnecessarily overload ovs-vswitchd.
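A rough back-of-the-envelope sketch of how the resubmit count could grow with the number of connected routers. The per-router and base costs below are hypothetical placeholders, not measured values, and the helper names are made up for illustration; only the 4K limit (taken here as 4096) comes from the description above.

```shell
# Rough model: total resubmits = base + routers * per_router.
# per_router and base are HYPOTHETICAL placeholders; measure real values
# on your deployment before drawing conclusions.
estimate_resubmits() {
    routers=$1
    per_router=${2:-10}   # assumed resubmits caused by each connected router
    base=${3:-30}         # assumed resubmits in the switch ingress pipeline
    echo $(( base + routers * per_router ))
}

# Succeeds (exit 0) if the estimate exceeds the 4K resubmit limit.
exceeds_limit() {
    limit=4096
    [ "$(estimate_resubmits "$@")" -gt "$limit" ]
}
```

With these placeholder costs, `exceeds_limit 500` succeeds while `exceeds_limit 100` does not, which is in line with the "hundreds of routers" scale at which the problem is expected to show up.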
Some examples of configurations (testable in an OVN sandbox) in which OVN currently forwards IP multicast packets to routers that immediately drop them can be found below.
ovn-nbctl \
    -- lr-add lr \
    -- lrp-add lr lrp1 00:00:00:00:00:01 1.1.1.1/24 \
    -- ls-add ls \
    -- lsp-add ls lsp \
    -- lsp-add-router-port ls ls-lr lrp1
Or:
ovn-nbctl \
    -- lr-add lr \
    -- lrp-add lr lrp1 00:00:00:00:00:01 1.1.1.1/24 \
    -- ls-add ls \
    -- lsp-add ls lsp \
    -- lsp-add-router-port ls ls-lr lrp1 \
    -- set logical_switch ls other_config:mcast_snoop=true \
    -- set logical_switch ls other_config:mcast_flood_unregistered=true
In both cases, simulating an IP multicast packet received on port "lsp" shows that OVN forwards it to the router even though the router always drops it:
> ovn-trace 'inport == "lsp" && eth.src == 00:00:00:00:01:00 && eth.dst == 01:00:5e:00:01:2a && ip4.src == 1.1.1.42 && ip4.dst == 239.0.1.68 && ip.ttl == 64 && udp.src == 42 && udp.dst == 43'
# udp,reg14=0x1,vlan_tci=0x0000,dl_src=00:00:00:00:01:00,dl_dst=01:00:5e:00:01:2a,nw_src=1.1.1.42,nw_dst=239.0.1.68,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=42,tp_dst=43

ingress(dp="ls", inport="lsp")
------------------------------
 0. ls_in_check_port_sec (northd.c:9766): 1, priority 50, uuid 92bbfb49
    reg0[15] = check_in_port_sec();
    next;
 6. ls_in_pre_lb (northd.c:6346): eth.mcast, priority 110, uuid ff71e371
    next;
31. ls_in_l2_lkup (northd.c:10501): eth.mcast, priority 70, uuid e7cd70aa
    outport = "_MC_flood";
    output;

multicast(dp="ls", mcgroup="_MC_flood")
---------------------------------------

    egress(dp="ls", inport="lsp", outport="ls-lr")
    ----------------------------------------------
         3. ls_out_pre_lb (northd.c:6348): eth.mcast, priority 110, uuid b73c2b1d
            next;
        13. ls_out_check_port_sec (northd.c:6038): eth.mcast, priority 100, uuid f88fba96
            reg0[15] = 0;
            next;
        14. ls_out_apply_port_sec (northd.c:6048): 1, priority 0, uuid e51fa39f
            output;
            /* output to "ls-lr", type "patch" */

        ingress(dp="lr", inport="lrp1")
        -------------------------------
         0. lr_in_admission (northd.c:13829): eth.mcast && inport == "lrp1", priority 50, uuid 86a5c48d
            xreg0[0..47] = 00:00:00:00:00:01;
            next;
         1. lr_in_lookup_neighbor (northd.c:14019): 1, priority 0, uuid dd12d9b7
            reg9[2] = 1;
            next;
         2. lr_in_learn_neighbor (northd.c:14029): reg9[2] == 1, priority 100, uuid 9dc0a0cf
            mac_cache_use;
            next;
         3. lr_in_ip_input (northd.c:15822): ip4.mcast || ip6.mcast, priority 82, uuid b2960199
            drop;

    egress(dp="ls", inport="lsp", outport="lsp")
    --------------------------------------------
        /* omitting output because inport == outport && !flags.loopback */
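To observe the effect at scale, the single-router configuration above can be repeated for many routers attached to the same switch. The sketch below only prints the ovn-nbctl commands rather than running them. The function name, the router and port names (lrN, lrpN, ls-lrN), the MAC prefix and the 10.0.0.0/16 addressing are all illustrative assumptions, not part of the original report.

```shell
# Print ovn-nbctl commands that attach N logical routers to one switch.
# All names and addresses are illustrative; adjust them to your sandbox.
gen_topology() {
    n=$1
    echo "ovn-nbctl ls-add ls"
    echo "ovn-nbctl lsp-add ls lsp"
    i=1
    while [ "$i" -le "$n" ]; do
        hi=$(( i / 256 ))
        lo=$(( i % 256 ))
        # One unique MAC and IP per router port, all in the same /16.
        mac=$(printf '00:00:00:01:%02x:%02x' "$hi" "$lo")
        echo "ovn-nbctl lr-add lr$i"
        echo "ovn-nbctl lrp-add lr$i lrp$i $mac 10.0.$hi.$lo/16"
        echo "ovn-nbctl lsp-add-router-port ls ls-lr$i lrp$i"
        i=$(( i + 1 ))
    done
}
```

In an OVN sandbox the output could be piped to a shell, for example `gen_topology 300 | sh`. Issuing one ovn-nbctl invocation per command is slow but keeps the sketch simple.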
Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).
Unnecessary CPU load of ovs-vswitchd in topologies with a large number (hundreds) of routers connected to logical switches due to IP multicast packets.
Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).
All currently supported OVN streams.
Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).
Day one issue (present since IGMP relay was added to OVN).
Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.
Yes, see above.
Reproduction Steps: Provide detailed steps or scripts to replicate the issue.
See above.
Expected Behavior: Describe what should happen under normal circumstances.
If the router doesn't have IGMP relay enabled, we should drop the IP multicast packets early, in the switch pipeline.
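One possible way to check the expected behavior is to inspect ovn-trace output for the router's ingress pipeline. This is a minimal sketch: the helper name is hypothetical, and it assumes the trace prints a line of the form `ingress(dp="<router>", ...)` whenever the packet enters that router, as in the trace shown earlier.

```shell
# Succeeds (exit 0) if the ovn-trace output on stdin shows the packet
# entering the ingress pipeline of the given router datapath.
trace_reaches_router() {
    router=${1:-lr}
    grep -qF "ingress(dp=\"$router\","
}
```

Piping the trace from the reproduction into `trace_reaches_router lr` currently succeeds. Once multicast packets are dropped early in the switch pipeline, it should fail for routers that do not have IGMP relay enabled.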
Observed Behavior: Explain what actually happens.
IP multicast packets are sent to routers unnecessarily.
Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.