-
Bug
-
Resolution: Unresolved
-
Major
-
rhel-sst-network-fastdatapath
-
ssg_networking
-
OVN Sprint 41, OVN Sprint 42, OVN Sprint 43, OVN Sprint 44, OVN Sprint 45
The current version of the OVN pipelines relies on L4 port matching in
some of the OpenFlow rules. But L4 ports cannot be matched on later IP
fragments, because only the first fragment carries the L4 header. For
that reason OVN even has a special stage called 'lr_in_defrag' that
passes all the traffic through conntrack in the hope that conntrack will
reassemble the packets. But that is not a generally correct assumption.
OpenFlow defines several operation modes for a switch:
- OFPC_FRAG_NORMAL = 0, /* No special handling for fragments. */
- OFPC_FRAG_DROP = 1, /* Drop fragments. */
- OFPC_FRAG_REASM = 2, /* Reassemble (only if OFPC_IP_REASM set). */
Open vSwitch adds the following extension to the list:
- OFPC_FRAG_NX_MATCH = 3, /* Make first fragments available for matching. */
OFPC_FRAG_NX_MATCH is the default mode in OVS, and OVS does not support
OFPC_FRAG_REASM.
So, from the OpenFlow point of view, users cannot use L4 information
on later fragments and users cannot expect Open vSwitch to reassemble
fragmented IP packets.
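To illustrate why later fragments carry no L4 information, here is a
minimal Python sketch of an IPv4 parser. It is not OVS or OVN code:
l4_ports() and make_ipv4() are hypothetical helpers, and the addresses,
ports and IP ID are made up for the example.

```python
import struct

IPPROTO_TCP = 6
IPPROTO_UDP = 17


def l4_ports(ipv4_packet: bytes):
    """Return (src_port, dst_port), or None if the ports are not in this packet."""
    if len(ipv4_packet) < 20:
        return None
    ihl = (ipv4_packet[0] & 0x0F) * 4          # IPv4 header length in bytes
    if ihl < 20:
        return None
    (flags_frag,) = struct.unpack_from("!H", ipv4_packet, 6)
    frag_offset = flags_frag & 0x1FFF          # offset in 8-byte units
    if frag_offset != 0:
        # Later fragment: the payload starts in the middle of the datagram,
        # so there is no TCP/UDP header to read ports from.
        return None
    if ipv4_packet[9] not in (IPPROTO_TCP, IPPROTO_UDP):
        return None
    if len(ipv4_packet) < ihl + 4:
        return None
    return struct.unpack_from("!HH", ipv4_packet, ihl)


def make_ipv4(frag_offset_units, more_fragments, payload, proto=IPPROTO_UDP):
    """Build a test IPv4 packet (checksum left at 0, it is not verified here)."""
    flags_frag = (0x2000 if more_fragments else 0) | frag_offset_units
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20 + len(payload), 0x1234, flags_frag,
        64, proto, 0, bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
    )
    return header + payload


# First fragment: UDP header (ports 5000 -> 53) plus 8 bytes of data.
first = make_ipv4(0, True, struct.pack("!HHHH", 5000, 53, 24, 0) + b"\x00" * 8)
# Later fragment: starts at byte offset 16 of the datagram, data only.
later = make_ipv4(2, False, b"\x00" * 8)

print(l4_ports(first))   # ports are readable
print(l4_ports(later))   # None
```

Both fragments share the same IP header fields, but only the first one
has anything to match L4 ports against.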
The fact that kernel conntrack does reassemble IP fragments is an
unfortunate side effect of re-using the kernel connection tracking
implementation, and it causes a lot of issues for OVS pipelines.
One example is the necessity to re-fragment reassembled packets on
egress. This is a problem because re-fragmentation, in theory, has to
slice the packet into exactly the same fragments it arrived in.
That is hard to do, and the current implementation can lead to
NEEDS_FRAG replies being forwarded back for fragments that the source
never sent. Also, OVS actions like truncated output and check_pkt_len
may not work correctly, because a reassembled packet obviously has a
different length than the original fragments.
See some lengthy discussions on this patch:
https://lore.kernel.org/all/20210319204307.3128280-1-aconole@redhat.com/
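The re-fragmentation problem can be shown with plain arithmetic. The
sketch below is not the kernel algorithm, just the standard IPv4 rule
that every fragment except the last carries a multiple of 8 payload
bytes. fragment() is a hypothetical helper, and the MTU values and the
payload size are assumptions chosen for the example.

```python
IPV4_HEADER_LEN = 20  # assumes no IP options


def fragment(payload_len, mtu):
    """Return a list of (byte_offset, byte_length) slices of the IP payload."""
    max_chunk = (mtu - IPV4_HEADER_LEN) // 8 * 8
    if max_chunk <= 0:
        raise ValueError("MTU too small to carry any payload")
    frags = []
    offset = 0
    while offset < payload_len:
        size = min(max_chunk, payload_len - offset)
        frags.append((offset, size))
        offset += size
    return frags


# The source fragmented a 2500-byte payload for a path MTU of 1020.
original = fragment(2500, 1020)
# After reassembly, egress re-fragments the same payload for MTU 1500.
egress = fragment(2500, 1500)

print(original)  # [(0, 1000), (1000, 1000), (2000, 500)]
print(egress)    # [(0, 1480), (1480, 1020)]
```

The egress fragments do not correspond to anything the source sent, so
an ICMP reply that refers to one of them points at a packet the source
has never seen.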
And the final point is that userspace conntrack behaves differently.
It doesn't reassemble fragmented packets, but releases all the fragments
back to the datapath in exactly the same form they came into it. So,
after the ct() action these packets have a conntrack state, but they are
still separate IP fragments, so L4 information cannot be matched.
So, the ability to reassemble IP fragments is backed neither by the
OpenFlow specification nor by every datapath implementation.
In general, OVN cannot rely on IP fragments being reassembled and needs
to build pipelines accordingly. One potential solution could be to
use ct mark/label to store required L4 information, so it will be available
to later fragments, or ct_tp_src/dst might be utilized somehow. But I
didn't think this through.
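As a rough illustration of the ct mark idea: ct_mark is 32 bits wide, so
two 16-bit ports fit into it. The layout below (source port in the high
half, destination port in the low half) and the helper names are
assumptions made up for this sketch, not anything OVN does today. A real
pipeline would also have to account for other data already kept in the
mark or label.

```python
PORT_MAX = 0xFFFF


def pack_ports(src_port, dst_port):
    """Pack two 16-bit L4 ports into one 32-bit value (hypothetical layout)."""
    for port in (src_port, dst_port):
        if not 0 <= port <= PORT_MAX:
            raise ValueError(f"port out of range: {port}")
    return (src_port << 16) | dst_port


def unpack_ports(mark):
    """Recover (src_port, dst_port) from a value produced by pack_ports()."""
    return (mark >> 16) & PORT_MAX, mark & PORT_MAX


# The first fragment has the ports, so they can be stored on commit.
mark = pack_ports(5000, 53)
print(hex(mark))
# A later fragment of the same connection can read them back from the mark.
print(unpack_ports(mark))
```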
An alternative might be to make the userspace datapath mimic the kernel
conntrack. However, that might not be acceptable because of a likely
noticeable performance drop and a potential requirement to rework the
whole dp-packet management in order to accommodate multi-buffer packets,
which was tried unsuccessfully several times before.
- is blocked by
  FDP-124 Userspace conntrack doesn't populate ct_tp_src/dst for later IP fragments (New)