-
Story
-
Resolution: Unresolved
-
Normal
-
None
-
FDP-24.H
-
None
-
8
-
False
-
-
False
-
rhel-9
-
rhel-sst-network-fastdatapath
-
-
-
ssg_networking
-
Important
An investigation into long network downtime during live migration of a DPDK port revealed that, when vhostuserguest DPDK ports are used, qemu does not inject a RARP once live migration completes. Instead, it offloads the announcement to the guest virtio driver through the VIRTIO_NET_F_GUEST_ANNOUNCE interface, and the driver then sends GARPs (not RARPs).
The only activation strategy currently implemented in OVN is activation-strategy=rarp. As the name implies, this strategy handles RARP packets but not GARPs. This results in unnecessarily long network downtime when migrating DPDK ports in Neutron (which uses activation-strategy=rarp).
The request here is to handle GARPs and NAs injected by virtio the same way as RARPs that are sent by qemu.
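To make the difference between the three announcement types concrete, here is a minimal Python sketch that classifies a raw Ethernet frame as RARP, GARP, or NA. It is illustrative only and is not how OVN matches packets (OVN does that in its flow tables). The function and helper names are hypothetical. It assumes untagged frames (no VLAN header) and IPv6 packets without extension headers, and it treats any ARP packet whose sender and target IPv4 addresses are equal as a GARP.

```python
import struct

ETH_P_ARP = 0x0806
ETH_P_RARP = 0x8035
ETH_P_IPV6 = 0x86DD
IPPROTO_ICMPV6 = 58
ICMPV6_NEIGHBOR_ADVERT = 136


def classify_announcement(frame: bytes) -> str:
    """Return 'rarp', 'garp', 'na', or 'other' for an untagged Ethernet frame."""
    if len(frame) < 14:
        return "other"
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    payload = frame[14:]
    if ethertype == ETH_P_RARP:
        # What qemu injects itself when the guest does not announce.
        return "rarp"
    if ethertype == ETH_P_ARP and len(payload) >= 28:
        # IPv4-over-Ethernet ARP: sender IP at offset 14, target IP at offset 24.
        sender_ip = payload[14:18]
        target_ip = payload[24:28]
        return "garp" if sender_ip == target_ip else "other"
    if ethertype == ETH_P_IPV6 and len(payload) >= 41:
        # Fixed 40-byte IPv6 header; next-header field is at offset 6.
        next_header = payload[6]
        icmp_type = payload[40]
        if next_header == IPPROTO_ICMPV6 and icmp_type == ICMPV6_NEIGHBOR_ADVERT:
            return "na"
    return "other"


def build_arp_frame(ethertype: int, oper: int, mac: bytes, spa: bytes, tpa: bytes) -> bytes:
    """Hypothetical helper: build a broadcast ARP/RARP frame for the examples."""
    eth = b"\xff" * 6 + mac + struct.pack("!H", ethertype)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, oper) + mac + spa + b"\x00" * 6 + tpa
    return eth + arp


mac = bytes.fromhex("525400123456")
ip = bytes([192, 0, 2, 10])
rarp_frame = build_arp_frame(ETH_P_RARP, 3, mac, b"\x00" * 4, b"\x00" * 4)
garp_frame = build_arp_frame(ETH_P_ARP, 1, mac, ip, ip)
print(classify_announcement(rarp_frame), classify_announcement(garp_frame))
```

A strategy that only looks for the RARP ethertype would miss the second frame, which is the kind sent by the guest driver after a DPDK port migration.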
There are two options here: either
- the existing `rarp` strategy is extended to also handle GARP packets; or
- a new strategy is introduced that would handle both types. (I suggest `guest_announce` as the name for the strategy.)
The former option is simpler to implement, since it won't require Neutron to change the setting it uses for migrated ports. The drawback is that the name of the strategy would arguably no longer represent the actual implementation (since it would handle more than just RARP packets).
The latter option would allow for a better name. On the other hand, it would require a change in Neutron and other CMSs that may use multi-chassis bindings for DPDK live migration. In this case, Neutron would also have to deal with the fact that older OVN releases do not support the "newer" strategy (e.g. Neutron would have to carry a config knob to pick the desired strategy if needed; or it would have to use some heuristics - or an official interface to detect OVN features, not sure if it's a thing? - to determine which value to use).
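As a rough illustration of what the second option would mean for a CMS, the Python sketch below combines a config knob with a fallback to `rarp`. All names are hypothetical and this is not Neutron code. `guest_announce` is only the name proposed in this ticket, not an existing OVN value. How the set of supported strategies would be discovered is deliberately left open, since that is the unresolved question above.

```python
from typing import Iterable, Optional

LEGACY_STRATEGY = "rarp"
PROPOSED_STRATEGY = "guest_announce"  # name suggested in this ticket; hypothetical


def pick_activation_strategy(
    supported: Iterable[str], configured: Optional[str] = None
) -> str:
    """Choose the activation-strategy value a CMS would set on a migrated port.

    supported:  strategies the running OVN is believed to support (how this is
                detected is an open question and is assumed to be given here).
    configured: optional operator override (the "config knob").
    """
    supported_set = set(supported)
    if configured is not None:
        if configured not in supported_set:
            raise ValueError(f"strategy {configured!r} is not supported by this OVN")
        return configured
    if PROPOSED_STRATEGY in supported_set:
        return PROPOSED_STRATEGY
    # Older OVN releases only know the RARP-based strategy.
    return LEGACY_STRATEGY


print(pick_activation_strategy(["rarp"]))
print(pick_activation_strategy(["rarp", "guest_announce"]))
```

With the first option (extending `rarp`), none of this selection logic would be needed on the CMS side.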
Another consideration is which of the two options is more politically feasible to backport. Since the linked BZ is tied to a customer case escalation, and since we expect it to be fixed one way or another for the customer ASAP, we may not be able to wait for a new major OVN version to be bumped in RHOSP repos. (Note that the target release for the bug report is 17.x, which no longer even plans to have any point releases; only async releases as needed.)
–
Another option to tackle this discrepancy between DPDK and non-DPDK behavior is patching the kernel to issue RARP packets on a VIRTIO_NET_F_GUEST_ANNOUNCE call in addition to GARPs. This avenue will be explored elsewhere. However, we don't anticipate that a kernel patch would be implemented in reasonable time, and in any case we cannot rely on all guests running a new kernel with patched virtio.
That's why we'd like OVN to handle it.
- relates to
-
RHEL-72028 Always send RARP regardless of GUEST_ANNOUNCE support in virtio
- New