Fast Datapath Product / FDP-684

OVN incorrectly relies on conntrack to reassemble IP fragments


    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • OVN
    • 13
    • rhel-sst-network-fastdatapath
    • ssg_networking
    • OVN Sprint 41, OVN Sprint 42, OVN Sprint 43, OVN Sprint 44, OVN Sprint 45

      The current OVN pipelines rely on L4 port matching in some of the
      OpenFlow rules. But L4 ports cannot be matched on later IP fragments,
      because only the first fragment carries the L4 header. For that reason
      OVN even has a special stage called 'lr_in_defrag' that passes all the
      traffic through conntrack in the hope that conntrack will reassemble
      the packet. But that is not a generally correct assumption.
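
To illustrate why later fragments have no L4 ports to match on, here is a minimal Python sketch that parses a raw IPv4 packet. The helper names `ipv4_header` and `l4_ports` are hypothetical and are not part of OVS or OVN; the sketch only assumes the standard IPv4 header layout.

```python
import struct

IP_MF = 0x2000       # "more fragments" flag in the flags/offset field
IP_OFFMASK = 0x1FFF  # fragment offset, counted in 8-byte units
TCP, UDP = 6, 17


def ipv4_header(proto, frag_field, payload_len):
    """Build a 20-byte IPv4 header (checksum left at 0 for brevity)."""
    return struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + payload_len, 1,
                       frag_field, 64, proto, 0,
                       bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))


def l4_ports(packet):
    """Return (src_port, dst_port), or None when they cannot be read."""
    header_len = (packet[0] & 0x0F) * 4
    frag_field = struct.unpack_from("!H", packet, 6)[0]
    proto = packet[9]
    if frag_field & IP_OFFMASK:
        # Later fragment: the bytes after the IP header are payload,
        # not a TCP/UDP header, so there are no ports to extract.
        return None
    if proto not in (TCP, UDP) or len(packet) < header_len + 4:
        return None
    return struct.unpack_from("!HH", packet, header_len)


# First fragment: offset 0, MF set, UDP header present.
first = ipv4_header(UDP, IP_MF, 8) + struct.pack("!HHHH", 1234, 53, 16, 0)
# Later fragment: offset 185 (185 * 8 = 1480 bytes), payload only.
later = ipv4_header(UDP, 185, 8) + b"\xaa" * 8
```

Here `l4_ports(first)` yields the UDP ports, while `l4_ports(later)` yields `None`: any rule that matches on ports has nothing to match against for the second packet.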

      OpenFlow defines several operation modes for a switch:

      • OFPC_FRAG_NORMAL = 0, /* No special handling for fragments. */
      • OFPC_FRAG_DROP = 1, /* Drop fragments. */
      • OFPC_FRAG_REASM = 2, /* Reassemble (only if OFPC_IP_REASM set). */

      Open vSwitch has the following extension to the list:

      • OFPC_FRAG_NX_MATCH = 3, /* Make first fragments available for matching. */

      OFPC_FRAG_NX_MATCH is the default mode in OVS, and OVS does not
      support OFPC_FRAG_REASM.
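
The modes above can be modelled roughly as follows. This is a simplified sketch, not OVS code: `visible_ports` is a hypothetical helper, and the zeroing of ports in the normal mode follows my reading of the `ovs-ofctl set-frags` documentation rather than anything stated in this issue.

```python
from enum import IntEnum


class FragMode(IntEnum):
    OFPC_FRAG_NORMAL = 0    # No special handling for fragments.
    OFPC_FRAG_DROP = 1      # Drop fragments.
    OFPC_FRAG_REASM = 2     # Reassemble (only if OFPC_IP_REASM set).
    OFPC_FRAG_NX_MATCH = 3  # Make first fragments available for matching.


def visible_ports(mode, is_fragment, frag_offset, ports):
    """Return the L4 ports a flow table lookup would see.

    Returns None if the packet is dropped. `ports` is what the packet
    carries at the start of its payload (meaningful only at offset 0).
    """
    if not is_fragment:
        return ports
    if mode == FragMode.OFPC_FRAG_DROP:
        return None
    if mode == FragMode.OFPC_FRAG_REASM:
        # Assumption from the text above: OVS does not implement this.
        raise NotImplementedError("reassembly is not supported")
    if mode == FragMode.OFPC_FRAG_NORMAL:
        return (0, 0)  # ports hidden for every fragment
    # OFPC_FRAG_NX_MATCH: only the first fragment exposes its ports.
    return ports if frag_offset == 0 else (0, 0)
```

In none of the supported modes does a later fragment expose real port numbers, which is the point made in the next paragraph.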

      So, from the OpenFlow point of view, users cannot use L4 information
      on later fragments and users cannot expect Open vSwitch to reassemble
      fragmented IP packets.

      The fact that kernel conntrack does reassemble IP fragments is an
      unfortunate side effect of re-using the kernel connection tracking
      implementation, and it causes a lot of issues for OVS pipelines.
      One example is the necessity to re-fragment reassembled packets
      on egress. This is a problem, because re-fragmentation in theory
      has to slice the packet into exactly the same fragments it arrived in.
      That is hard to do, and the current implementation can lead to forwarding
      back NEEDS_FRAG replies for fragments never sent from the source.
      Also, OVS actions like truncated output and check_pkt_len may not work
      correctly, because a reassembled packet obviously has a different length.
      See some lengthy discussions on this patch:
      https://lore.kernel.org/all/20210319204307.3128280-1-aconole@redhat.com/
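
The re-fragmentation problem can be seen with simple arithmetic. The sketch below is a naive MTU-based fragmenter, not the kernel's actual algorithm; the function name `fragment` and the MTU values are assumptions chosen for illustration.

```python
IP_HEADER_LEN = 20


def fragment(payload_len, mtu):
    """Split a payload into (offset, size) pieces that fit the MTU.

    Every fragment except the last must carry a multiple of 8 bytes,
    because the IPv4 fragment offset is counted in 8-byte units.
    """
    max_data = (mtu - IP_HEADER_LEN) // 8 * 8
    pieces = []
    offset = 0
    while offset < payload_len:
        size = min(max_data, payload_len - offset)
        pieces.append((offset, size))
        offset += size
    return pieces


# The source fragmented a 3000-byte payload for a 1300-byte path MTU.
original = fragment(3000, 1300)
# After reassembly, the egress port re-fragments for its own 1500-byte MTU.
refragmented = fragment(3000, 1500)
# Length seen by actions such as check_pkt_len on the reassembled packet.
reassembled_len = IP_HEADER_LEN + 3000
```

Here `original` is `[(0, 1280), (1280, 1280), (2560, 440)]` while `refragmented` is `[(0, 1480), (1480, 1480), (2960, 40)]`, so the egress fragments differ from what the source sent and exceed the 1300-byte path MTU. The reassembled packet is 3020 bytes long, larger than any fragment that was actually on the wire.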

      And the final point is that userspace conntrack behaves differently.
      It doesn't reassemble fragmented packets, but releases all the fragments
      back to the datapath in exactly the same form they came into it. So,
      after the ct() action these packets have a conntrack state, but they
      are still separate IP fragments, so L4 information cannot be matched.

      So, the ability to reassemble IP fragments is backed up by neither
      the OpenFlow specification nor the different datapath implementations.

      In general, OVN cannot rely on IP fragments being reassembled and needs
      to build pipelines accordingly. One potential solution could be to
      use ct mark/label to store required L4 information, so it will be available
      to later fragments, or ct_tp_src/dst might be utilized somehow. But I
      didn't think this through.
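
As a toy model of the ct mark idea, the sketch below packs both 16-bit ports into one 32-bit value (the width of ct_mark) when the first fragment is seen and reads them back for later fragments. Everything here is hypothetical: `ToyFragmentTable`, its lookup key, and the packing layout are illustrative assumptions, not an OVN or conntrack API, and the question of how a later fragment would be associated with its connection in a real pipeline is left open.

```python
def pack_ports(src_port, dst_port):
    """Pack two 16-bit ports into a single 32-bit mark value."""
    return (src_port << 16) | dst_port


def unpack_ports(mark):
    """Recover (src_port, dst_port) from a 32-bit mark value."""
    return mark >> 16, mark & 0xFFFF


class ToyFragmentTable:
    """Remembers the L4 ports of a fragmented datagram by a lookup key.

    The key is assumed to be something every fragment carries, for
    example (src IP, dst IP, protocol, IP identification).
    """

    def __init__(self):
        self._marks = {}

    def process(self, key, frag_offset, ports=None):
        if frag_offset == 0:
            if ports is None:
                raise ValueError("first fragment must carry ports")
            self._marks[key] = pack_ports(*ports)
        mark = self._marks.get(key)
        # A later fragment seen before the first one has no stored mark.
        return None if mark is None else unpack_ports(mark)


table = ToyFragmentTable()
key = ("10.0.0.1", "10.0.0.2", 17, 0x1234)
first_seen = table.process(key, 0, (1234, 53))
later_seen = table.process(key, 185)
```

Both `first_seen` and `later_seen` hold the same ports. The sketch also exposes one open question with this approach: a later fragment that arrives before the first one finds nothing stored.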

      An alternative might be to make the userspace datapath mimic the kernel
      conntrack. However, that might not be acceptable due to a likely
      noticeable performance drop and a potential need to rework the whole
      dp-packet management in order to accommodate multi-buffer packets,
      which was tried unsuccessfully several times before.

              amusil@redhat.com Ales Musil
              imaximet@redhat.com Ilya Maximets