OpenShift SDN / SDN-3732

Support live-migration of HyperShift VMs with OVN Kubernetes pod network


    • hypershift-vm-live-migration
    • False
    • None
    • False
    • Green
    • In Progress
    • CNV-25888 - [GA] Self-managed Hosted Control Planes support for the OpenShift Virtualization Provider
    • 0% To Do, 0% In Progress, 100% Done

      OCP/Telco Definition of Done
      Epic Template descriptions and documentation.

      Epic Goal

      • Support live-migration of HyperShift VMs connected to the pod network provided through OVN Kubernetes

      Why is this important?

      • HyperShift on KubeVirt is expected to reach GA in 4.14. Its worker nodes run as KubeVirt VMs and are interconnected using OVN Kubernetes. These VMs must be able to live-migrate when the nodes hosting them need to undergo maintenance. With the current OVN Kubernetes implementation, live-migration results in a new IP being assigned to the Pod hosting the VM, which breaks connectivity.

      Scenarios

      Stable IPs for migration:

      1. A Pod running a VM is given an IP by OVN Kubernetes
      2. The VM is migrated to a new Pod running on a different Node
      3. The new Pod should obtain the same IP and use the same gateway IP
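
      The scenario above can be sketched as a toy allocator that keys leases by VM identity rather than by Pod name. This is only an illustration under stated assumptions, not the OVN Kubernetes implementation: the `StickyIPAM` class, the `default/vm-a` key format, the `10.244.0.0/24` subnet, and the choice of the first host address as gateway are all hypothetical.

```python
from dataclasses import dataclass, field
import ipaddress


@dataclass
class StickyIPAM:
    """Toy allocator that keys leases by VM identity, not by Pod name (hypothetical)."""
    subnet: ipaddress.IPv4Network
    leases: dict = field(default_factory=dict)  # vm_key -> IPv4Address

    @property
    def gateway(self):
        # Assumption: the first host address of the subnet acts as the gateway.
        return self.subnet.network_address + 1

    def allocate(self, vm_key: str):
        """Return the VM's existing IP if it has one, otherwise the next free one."""
        if vm_key in self.leases:
            return self.leases[vm_key]
        used = set(self.leases.values()) | {self.gateway}
        for ip in self.subnet.hosts():
            if ip not in used:
                self.leases[vm_key] = ip
                return ip
        raise RuntimeError("subnet exhausted")

    def release(self, vm_key: str):
        """Free the IP only when the VM itself is deleted, not on migration."""
        self.leases.pop(vm_key, None)


ipam = StickyIPAM(ipaddress.ip_network("10.244.0.0/24"))
source_pod_ip = ipam.allocate("default/vm-a")  # step 1: Pod hosting the VM gets an IP
target_pod_ip = ipam.allocate("default/vm-a")  # steps 2-3: migration target Pod asks again
other_vm_ip = ipam.allocate("default/vm-b")    # an unrelated VM gets a different IP
```

      Because the lease is looked up by VM key, the migration target Pod receives the same address and the same gateway as the source Pod, while a different VM is still given a fresh address.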

      IP negotiated using DHCP:

      1. A Pod running a VM is started
      2. An IP is allocated for the Pod
      3. The IP is not set on an interface inside the netns
      4. The IP is offered to the VM through the DHCP server of the OVN logical port (LP)
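
      As a rough illustration of the DHCP scenario, the sketch below derives the values a DHCP offer would carry from an allocated Pod IP while leaving the Pod interface without an address. The function name `build_dhcp_offer`, the dictionary keys, the `pod_interface` model, and the example addresses are assumptions made for this example and do not reflect the actual OVN configuration schema.

```python
import ipaddress


def build_dhcp_offer(pod_cidr: str, gateway: str, lease_seconds: int = 3600) -> dict:
    """Derive illustrative DHCP offer values from the IP allocated to the Pod."""
    iface = ipaddress.ip_interface(pod_cidr)
    gw = ipaddress.ip_address(gateway)
    if gw not in iface.network:
        raise ValueError(f"gateway {gw} is outside {iface.network}")
    return {
        "offered_ip": str(iface.ip),            # the IP allocated for the Pod (step 2)
        "netmask": str(iface.network.netmask),
        "router": str(gw),
        "lease_time": lease_seconds,
    }


# Step 3: the allocated IP is NOT configured on the interface inside the Pod netns.
pod_interface = {"name": "eth0", "addresses": []}

# Step 4: the same IP is instead handed to the VM in a DHCP offer.
offer = build_dhcp_offer("10.244.0.2/24", "10.244.0.1")
```

      The point of the split is that the address belongs to the VM guest, which learns it over DHCP, rather than to the Pod network namespace that merely hosts the VM.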

      Acceptance Criteria

      • CI - MUST be running successfully with tests automated
      • Release Technical Enablement - Provide necessary release enablement details and documents.
      • VM must be able to maintain an IP between migrations (from one Pod to another)
      • VM must be assigned an IP through DHCP, without the IP being set inside the Pod netns
      • The VM is a first-class citizen of the Pod network: it can use NetworkPolicies and Services, and communicate with other Pods
      • The IP seen within the VM should be reachable from other VMs and Pods

      Dependencies (internal and external)

      1. Depends on Proxy ARP being available in OVN. This work is underway, tracked via BZ#2155306 and targeted for OVN 23.06.00
      2. Depends on the Kubernetes fix for EndpointSlices with the same IP: https://github.com/kubernetes/kubernetes/pull/116084
      3. Depends on a new KubeVirt annotation needed to implement post-copy live migration correctly: https://github.com/kubevirt/kubevirt/pull/9290

      Previous Work (Optional):

      1. Supporting this scenario using secondary OVN Kubernetes networks was considered, documented, PoC'd and presented to "OCP Networking Architecture" team.
      2. An alternative approach using the primary network was suggested by the team. This was documented, successfully PoC'd and presented back to the group.
      3. An Enhancement proposal was published.

      Open questions:

      Done Checklist

      • CI - CI is running, tests are automated and merged.
      • Release Enablement <link to Feature Enablement Presentation>
      • DEV - Upstream code and tests merged: <link to meaningful PR or GitHub Issue>
      • DEV - Upstream documentation merged: <link to meaningful PR or GitHub Issue>
      • DEV - Downstream build attached to advisory: <link to errata>
      • QE - Test plans in Polarion: <link or reference to Polarion>
      • QE - Automated tests merged: <link or reference to automated tests>
      • DOC - Downstream documentation merged: <link to meaningful PR>


            ellorent Felix Enrique Llorente Pastora
            phoracek@redhat.com Petr Horacek
            Jean Chen Jean Chen
            Laura Hinson Laura Hinson