OpenShift Bugs / OCPBUGS-4036

OVN load balancer does not support pmtu


    • -
    • Important
    • Proposed
    • False

      "Description of problem:
      OVN load balancer does not support pmtu
      OpenShift 4.11.12 platform with OVNKubernetes and metallb. The physical network supports jumbo frames, and we have other platforms configured with jumbo frames. However, not all network switches support jumbo frames, so we rely on PMTU discovery for clients and servers to adjust their payload size.

      When an endpoint sends a payload that is too large, the physical switch sends a packet "ICMP 10.91.175.200 unreachable - need to frag (mtu 1500)", which allows the endpoint to adjust its payload size.
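
For reference, the "need to frag" message is an ICMPv4 Destination Unreachable (type 3) with code 4, and the next-hop MTU is carried in the last two bytes of the 8-byte ICMP header (RFC 1191). The following is a minimal sketch of how a receiver could extract that value; the function name `next_hop_mtu` is made up for illustration and is not part of OVN or the kernel.

```python
import struct

ICMP_DEST_UNREACH = 3   # ICMPv4 type: Destination Unreachable
ICMP_FRAG_NEEDED = 4    # code: fragmentation needed and DF set


def next_hop_mtu(icmp: bytes):
    """Return the next-hop MTU from an ICMPv4 'fragmentation needed'
    message, or None if the bytes are not such a message."""
    if len(icmp) < 8:
        return None
    # type (1 byte), code (1), checksum (2), unused (2), next-hop MTU (2)
    icmp_type, code, _checksum, _unused, mtu = struct.unpack("!BBHHH", icmp[:8])
    if icmp_type != ICMP_DEST_UNREACH or code != ICMP_FRAG_NEEDED:
        return None
    return mtu


# Header equivalent to the message quoted above (checksum left at 0 here).
header = struct.pack("!BBHHH", ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, 0, 0, 1500)
print(next_hop_mtu(header))
```

The sketch ignores checksum validation and the embedded original packet that follows the header.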

      These packets seem to be ignored, which cuts the traffic. The scenario is:
      10.64.254.129 is the external IP of a LoadBalancer service running on the OpenShift platform, exposed by metallb.
      10.91.175.200 is a client trying to connect to the service.

      The client runs curl https://10.64.254.129. The connection is established with MSS values matching the MTU on each side:
      09:43:27.258778 IP (tos 0x28, ttl 47, id 26252, offset 0, flags [DF], proto TCP (6), length 60)
      10.91.175.200.44036 > 10.64.254.129.https: Flags [S], cksum 0x22d7 (correct), seq 3050964400, win 26880, options [mss 8960,sackOK,TS val 2958279257 ecr 0,nop,wscale 7], length 0
      09:43:27.263299 IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF], proto TCP (6), length 60)
      10.64.254.129.https > 10.91.175.200.44036: Flags [S.], cksum 0xbc17 (correct), seq 489867020, ack 3050964401, win 26544, options [mss 8860,sackOK,TS val 4169370527 ecr 2958279257,nop,wscale 7], length 0
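
The MSS values in the handshake above can be related to interface MTUs with simple arithmetic. This is a sketch assuming IPv4 without IP options, where MSS = MTU - 20 (IP header) - 20 (TCP header); the helper names are illustrative only.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, fixed part only (TCP options are not counted in the MSS)


def mtu_from_mss(mss: int) -> int:
    """MTU implied by an advertised MSS."""
    return mss + IPV4_HEADER + TCP_HEADER


def mss_for_mtu(mtu: int) -> int:
    """Largest MSS that fits in a given MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


client_mss, service_mss = 8960, 8860          # values from the SYN and SYN-ACK
print(mtu_from_mss(client_mss))               # 9000: client-side MTU
print(mtu_from_mss(service_mss))              # 8900: service-side MTU
print(min(client_mss, service_mss))           # 8860: MSS both ends will respect
print(mss_for_mtu(1500))                      # 1460: what a 1500-byte path would need
```

Both ends therefore believe segments far larger than 1500 bytes are safe, which is why the path relies on the ICMP feedback to correct them.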

      At some point during the TLS handshake, 10.64.254.129 sends a packet that is too big for the physical network. The network switch replies with a number of ICMP errors such as:
      09:43:27.311818 IP (tos 0xc0, ttl 53, id 9113, offset 0, flags [none], proto ICMP (1), length 576)
      192.168.254.128 > 10.64.254.129: ICMP 10.91.175.200 unreachable - need to frag (mtu 1500), length 556
      IP (tos 0x0, ttl 58, id 11009, offset 0, flags [DF], proto TCP (6), length 4148)
      10.64.254.129.https > 10.91.175.200.44036: Flags [P.], seq 1:4097, ack 518, win 216, options [nop,nop,TS val 4169370575 ecr 2958279306], length 4096

      However, these packets seem to be ignored and the service does not reduce its payload size.
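
To illustrate what "reducing the payload" would mean for the packet in the capture: its IP length of 4148 bytes is 20 (IP header) + 32 (TCP header including the 12-byte timestamp option) + 4096 (payload). A sender that honoured the reported MTU of 1500 would have to split that payload roughly as sketched below. The function `segment_sizes` is hypothetical and only does the arithmetic; it is not how any TCP stack is implemented.

```python
def segment_sizes(payload_len: int, path_mtu: int,
                  ip_header: int = 20, tcp_header: int = 32) -> list:
    """Split payload_len bytes into per-segment payload sizes that fit path_mtu.

    tcp_header defaults to 32 = 20 fixed bytes + 12 bytes for the
    nop,nop,timestamp options seen in the capture.
    """
    per_segment = path_mtu - ip_header - tcp_header
    if per_segment <= 0:
        raise ValueError("path MTU too small for the given headers")
    full, rest = divmod(payload_len, per_segment)
    return [per_segment] * full + ([rest] if rest else [])


print(20 + 32 + 4096)              # 4148, the IP length of the oversized packet
print(segment_sizes(4096, 1500))   # [1448, 1448, 1200]
```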

      While gathering information about this problem, I saw that the question has been asked before at https://mail.openvswitch.org/pipermail/ovs-discuss/2020-December/050834.html

      Version-Release number of selected component
      ovn22.06-22.06.0-27.el8fdp.x86_64
      (OpenShift 4.11.12)

      How reproducible:
      all the time

      Steps to Reproduce:

      1. OpenShift 4.11.12 platform with OVNKubernetes and metallb on an environment having
      2. Run curl against it from an environment with jumbo frames
      3. Configure the physical network to not support jumbo frames and to not do any MSS clamping

      Actual results:
      When the service behind the load balancer replies with a payload that is too big, the physical network replies with ICMP "unreachable - need to frag" packets. However, these packets are not forwarded to the service.

      Expected results:
      The load balancer should forward the ICMP packet to the service so that it can adjust its payload size.
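
As a conceptual model of the expected behaviour (not a description of how OVN implements it): an ICMP error embeds the header of the offending packet, so a NAT-style load balancer can look up the connection from that embedded header and rewrite both the outer destination and the embedded source to the backend. In this sketch the backend address 10.128.2.15:8443, the `IcmpFragNeeded` type and the `translate` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IPv4 address, TCP port)


@dataclass(frozen=True)
class IcmpFragNeeded:
    outer_src: str       # device that generated the error
    outer_dst: str       # address the error is delivered to
    inner_src: Endpoint  # source of the embedded (offending) packet
    inner_dst: Endpoint  # destination of the embedded packet
    mtu: int


# (VIP endpoint, client endpoint) -> backend endpoint
NatTable = Dict[Tuple[Endpoint, Endpoint], Endpoint]


def translate(err: IcmpFragNeeded, nat: NatTable) -> Optional[IcmpFragNeeded]:
    """Rewrite an ICMP error addressed to the VIP so it reaches the backend."""
    backend = nat.get((err.inner_src, err.inner_dst))
    if backend is None:
        return None  # no matching connection: nothing to forward to
    return IcmpFragNeeded(
        outer_src=err.outer_src,
        outer_dst=backend[0],   # deliver the error to the backend
        inner_src=backend,      # embedded packet as the backend originally sent it
        inner_dst=err.inner_dst,
        mtu=err.mtu,
    )


vip, client = ("10.64.254.129", 443), ("10.91.175.200", 44036)
nat = {(vip, client): ("10.128.2.15", 8443)}  # hypothetical backend endpoint
err = IcmpFragNeeded("192.168.254.128", "10.64.254.129", vip, client, 1500)
print(translate(err, nat))
```

With such a translation the backend's TCP stack would see an error that matches one of its own connections and could lower its path MTU estimate.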

      Additional info:
      seen before on https://github.com/submariner-io/submariner/issues/1022"

            sseethar Surya Seetharaman
            rhn-support-atn Anand T N (Inactive)
            Anurag Saxena Anurag Saxena
            Anand T N (Inactive)