FDP-2869 (Fast Datapath Product)

[ovn-northd] Undefined behavior due to ICMP TTL exceeded logical flows for IPv4

    • Epic
    • Resolution: Unresolved
    • OVN

      Please mark each item below with ( / ) if completed or ( x ) if incomplete:

      ( ) The acceptance criteria defined below are met.


      ( ) The epic's work is available in a downstream build (nightly/Async or other)


      ( ) All cards under the epic have been moved to Done

    • rhel-9
    • rhel-net-ovn
    • 100% To Do, 0% In Progress, 0% Done
    • ssg_networking

      This epic tracks all the effort needed to deliver the solution related to the bug described below.

       Problem Description: Clearly explain the issue.

      The logical flows that northd generates to handle TTL exceeded for the IPv4 networks configured on a router port do not restrict their match to packets received from the corresponding network.  As a result, when a router port has multiple IPs from different networks, we get multiple flows with the same match but different actions.

      E.g., in a sandbox:

       > ovn-nbctl lr-add lr \
          -- lrp-add lr lrp 00:00:00:00:00:01 \
                            1.1.1.1/24 2.2.2.2/24 \
                            1::1/64 2::2/64

      Checking the TTL expired generated flows:

       > ovn-sbctl lflow-list lr | grep in_ip_input | grep 'TTL exceeded' | grep icmp4
        table=3 (lr_in_ip_input     ), priority=31   , match=(inport == "lrp" && ip4 && ip.ttl == {0, 1} && !ip.later_frag), action=(icmp4 {eth.dst <-> eth.src; icmp4.type = 11; /* Time exceeded */ icmp4.code = 0; /* TTL exceeded in transit */ ip4.dst = ip4.src; ip4.src = 1.1.1.1 ; ip.ttl = 254; outport = "lrp"; flags.loopback = 1; output; };)
        table=3 (lr_in_ip_input     ), priority=31   , match=(inport == "lrp" && ip4 && ip.ttl == {0, 1} && !ip.later_frag), action=(icmp4 {eth.dst <-> eth.src; icmp4.type = 11; /* Time exceeded */ icmp4.code = 0; /* TTL exceeded in transit */ ip4.dst = ip4.src; ip4.src = 2.2.2.2 ; ip.ttl = 254; outport = "lrp"; flags.loopback = 1; output; };)

      Both flows have exactly the same match but different actions (the source IP of the generated ICMP error differs).  This means that, depending on the order of operations in ovn-controller (or on recomputations), either of the resulting OpenFlow rules may end up being used, so the router may reply with ICMP errors from a source IP that is not in the same network as the original packet's source.
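One way to spot this kind of conflict is to group the logical flows by everything except the action and report groups with more than one distinct action. The following is a minimal sketch, not part of OVN: the function name `find_conflicting_lflows` and the regular expression are illustrative assumptions, and the parser only assumes the `ovn-sbctl lflow-list` line layout shown above.

```python
import re
from collections import defaultdict

# Assumed line layout, taken from the lflow-list output quoted in this ticket:
#   table=N (stage), priority=P, match=(...), action=(...)
LFLOW_RE = re.compile(
    r"table=(?P<table>\d+)\s*\((?P<stage>\w+)\s*\),\s*"
    r"priority=(?P<priority>\d+)\s*,\s*"
    r"match=\((?P<match>.*?)\),\s*action=\((?P<action>.*)\)\s*$"
)


def find_conflicting_lflows(lflow_text):
    """Return {(stage, table, priority, match): [actions]} for every
    group of flows that share a match but have different actions.

    Hypothetical helper for illustration only.
    """
    actions_by_key = defaultdict(set)
    for line in lflow_text.splitlines():
        m = LFLOW_RE.search(line)
        if m is None:
            continue  # datapath headers, blank lines, etc.
        key = (m["stage"], int(m["table"]), int(m["priority"]), m["match"])
        actions_by_key[key].add(m["action"])
    # Keep only keys that map to more than one distinct action.
    return {
        key: sorted(actions)
        for key, actions in actions_by_key.items()
        if len(actions) > 1
    }
```

Passing the text printed by `ovn-sbctl lflow-list lr` to this function would be expected to report the two IPv4 flows above as one conflicting group, while the IPv6 flows, whose matches differ, would not be reported.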

      For IPv6 on the other hand:

       > ovn-sbctl lflow-list lr | grep in_ip_input | grep 'TTL exceeded' | grep icmp6
        table=3 (lr_in_ip_input     ), priority=31   , match=(inport == "lrp" && ip6 && ip6.src == 1::/64 && ip.ttl == {0, 1} && !ip.later_frag), action=(icmp6 {eth.dst <-> eth.src; ip6.dst = ip6.src; ip6.src = 1::1 ; ip.ttl = 254; icmp6.type = 3; /* Time exceeded */ icmp6.code = 0; /* TTL exceeded in transit */ outport = "lrp"; flags.loopback = 1; output; };)
        table=3 (lr_in_ip_input     ), priority=31   , match=(inport == "lrp" && ip6 && ip6.src == 2::/64 && ip.ttl == {0, 1} && !ip.later_frag), action=(icmp6 {eth.dst <-> eth.src; ip6.dst = ip6.src; ip6.src = 2::2 ; ip.ttl = 254; icmp6.type = 3; /* Time exceeded */ icmp6.code = 0; /* TTL exceeded in transit */ outport = "lrp"; flags.loopback = 1; output; };)

      The logical flows also match on ip6.src == <network>, correctly restricting the match.
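To illustrate why the per-network restriction removes the ambiguity, the sketch below builds a match string per configured address, including the source network the way the IPv6 flows above do. This is only an assumption about what a consistent IPv4 match could look like; the ticket does not specify the fix, and `ttl_exceeded_match` is a hypothetical helper, not northd code.

```python
import ipaddress


def ttl_exceeded_match(port, cidr):
    """Build a TTL-exceeded match restricted to the source network of
    one router port address (hypothetical helper, for illustration).
    """
    iface = ipaddress.ip_interface(cidr)
    proto = "ip4" if iface.version == 4 else "ip6"
    network = str(iface.network)  # e.g. "1.1.1.0/24" or "1::/64"
    return (
        f'inport == "{port}" && {proto} && {proto}.src == {network} '
        f"&& ip.ttl == {{0, 1}} && !ip.later_frag"
    )


# Addresses from the sandbox example in this ticket.
for cidr in ("1.1.1.1/24", "2.2.2.2/24", "1::1/64", "2::2/64"):
    print(ttl_exceeded_match("lrp", cidr))
```

With the source network in the match, each address on the port yields a distinct match, so no two flows with different actions can share the same match.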

       Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).

      ICMPv4 TTL exceeded errors may be generated with an incorrect source IP (one that is not in the same network as the original packet's source).

       Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).

      All currently supported OVN versions:
      ovn24.03-24.03.7-17
      ovn24.09-24.09.3-85
      ovn25.03-25.03.2-21
      ovn25.09-25.09.2-10

        Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).

      New issue.

       Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.

      Yes, see description.

       Reproduction Steps: Provide detailed steps or scripts to replicate the issue.

      See description.

       Expected Behavior: Describe what should happen under normal circumstances.

      northd-generated flows should be consistent: logical flows with the same match should not have different actions.

       Observed Behavior: Explain what actually happens.

      See description.

       Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.

       

       Logs: If you collected logs please provide them (e.g. sos report, /var/log/openvswitch/* , testpmd console)

              ovnteam@redhat.com OVN Team
              dceara@redhat.com Dumitru Ceara
              OVN QE OVN QE (Inactive)