OpenShift Virtualization / CNV-53066

Egress TCP sessions broken on live migrations of UDN-connected VMs

    • Bug
    • Resolution: Unresolved
    • CNV Network

      Description of problem:

      VMs connected to a layer2 UDN that act as clients in TCP-based communication are disconnected during live migration.
      
      For example, if a UDN-connected VM is used as a virtual desktop and its user is downloading a big file from a remote server, the download is aborted when the VM migrates. Note that the VM user is typically unaware of migrations: they are triggered automatically to perform upgrades or maintenance on the cluster.
      
      This disconnect happens because any traffic egressing from an OVN overlay network is source-NAT'ed to the IP of the node the VM/Pod currently resides on. During live migration the node changes, so the source IP changes and the conntrack session is lost as well. Note that the connectivity downtime itself is a matter of <200ms.
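The effect of the node-local SNAT can be illustrated with a small model. This is only a sketch: the node names, IP addresses and the `snat` helper are hypothetical, and it ignores the conntrack state kept on the node itself. It only shows that the remote server identifies a TCP session by its address/port tuple, so a new source IP no longer matches the established session.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """TCP connection identity as seen by the remote server."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Hypothetical node IPs, for illustration only.
NODE_IPS = {"node-a": "192.0.2.10", "node-b": "192.0.2.20"}


def snat(vm_node: str, src_port: int, dst_ip: str, dst_port: int) -> Flow:
    """Rewrite the source address to the IP of the node hosting the VM."""
    return Flow(NODE_IPS[vm_node], src_port, dst_ip, dst_port)


# The server tracks the session under the tuple it saw when it was opened.
established = {snat("node-a", 40000, "198.51.100.5", 443)}

before = snat("node-a", 40000, "198.51.100.5", 443)  # VM still on node-a
after = snat("node-b", 40000, "198.51.100.5", 443)   # VM migrated to node-b

print(before in established)  # True
print(after in established)   # False
```

Packets sent after the migration arrive from an address the server has no session for, which is why the session breaks even though the downtime itself is short.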
      
      We considered a possible solution using an existing OCP feature, egressIP. It would allow us to pin the source IP to a specific address assigned to one specific node. While this would solve the disconnect during live migration, sessions would still break when the node owning that IP goes down.
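For reference, a minimal sketch of what such an egressIP object might look like, built as a Python dict and printed as JSON (which `oc apply -f` accepts). The object name, the IP address and the namespace label are hypothetical. The field names follow the OVN-Kubernetes `EgressIP` API (`k8s.ovn.org/v1`) as we understand it and should be checked against the cluster version in use. Whether egressIP applies to UDN-connected VMs in a given release is also an assumption here.

```python
import json


def egress_ip_manifest(name: str, egress_ip: str, namespace_labels: dict) -> dict:
    """Build an EgressIP object that pins the source IP for matching namespaces."""
    return {
        "apiVersion": "k8s.ovn.org/v1",
        "kind": "EgressIP",
        "metadata": {"name": name},
        "spec": {
            # Address used as the SNAT source instead of the hosting node's IP.
            "egressIPs": [egress_ip],
            # Namespaces whose egress traffic should use the address above.
            "namespaceSelector": {"matchLabels": namespace_labels},
        },
    }


# Hypothetical values, for illustration only.
manifest = egress_ip_manifest("vdi-egress", "192.0.2.100", {"egress": "vdi"})
print(json.dumps(manifest, indent=2))
```

The address is hosted by a single egress-capable node at a time, which is the limitation described above: if that node goes down, the address moves and the session state is lost.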
      
      Three options have been suggested to address this issue, none of which is implemented yet:
      
      1) Have OVN use a VIP for the egress IP and copy the conntrack session to the new node during live migration. This is unexplored and we suspect it is non-trivial to implement.
      2) Introduce a new "gateway pod" functionality, where a pod connected to both the UDN and the physical network would be used as the gateway of the given UDN. This gateway could implement the logic needed to preserve connectivity. This is similar to option 1), except that it keeps most of the logic outside of the OVN code.
      3) Rely on BGP. IIUC, this is something the SDN team is working on for 4.18. Our assumption is that with BGP the traffic would be routed, not NAT'ed, hence no connection tracking or SNAT would be employed. It is not clear whether BGP will introduce other node-specific tracking that would need special handling for live migration.
      
      We will probably need both 1/2) and 3) implemented in the long run. While BGP-routed ingress, option 3), is something VMware users are asking for, a solution that does not require BGP, option 1) or 2), may be more suitable for supporting public clouds, where we cannot make any assumptions about the underlying network infrastructure.

      Version-Release number of selected component (if applicable):

      4.18

      How reproducible:

      Always

      Steps to Reproduce:

      1. Create a VM connected to a primary UDN
      2. Open a TCP session from the VM to a remote server
      3. Live migrate the VM
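
Step 2 can be made observable with a small probe run inside the guest. The following is a minimal sketch, assuming the remote server echoes back whatever it receives; `probe_session` and its parameters are hypothetical names, not part of any existing tool.

```python
import socket
import time


def probe_session(host: str, port: int, interval: float = 1.0, count: int = 0) -> int:
    """Hold one TCP session open and exchange a line every `interval` seconds.

    Returns the number of successful round trips. `count=0` means run until
    the session breaks. Assumes the remote end echoes each line back.
    """
    ok = 0
    with socket.create_connection((host, port), timeout=5) as sock:
        while count == 0 or ok < count:
            payload = f"ping {ok}\n".encode()
            try:
                sock.sendall(payload)
                # Any echoed bytes count as proof that the session is alive.
                reply = sock.recv(len(payload))
            except OSError as exc:  # reset, timeout, unreachable network
                print(f"session broken after {ok} round trips: {exc}")
                return ok
            if not reply:
                print(f"session closed by peer after {ok} round trips")
                return ok
            ok += 1
            time.sleep(interval)
    return ok
```

Start the probe with `count=0` against the echo server, then trigger the migration (for example with `virtctl migrate <vm-name>`). With the behavior described in this report, the probe is expected to report a broken session shortly after the VM lands on the new node.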
      

      Actual results:

      The TCP session is broken

      Expected results:

      The TCP session stays open

      Additional info:

       

              Petr Horacek (phoracek@redhat.com)
              Yossi Segev