OpenShift Virtualization / CNV-17884

[1976578] IP connectivity is lost after migration (with multus)

    • Important
    • None

      Created attachment 1795082 [details]
      migration_vmb_new.yaml

      Description of problem:
      A migrated VM takes a long time (10 to 60 seconds) to regain connectivity.
      When pinging over a secondary interface (with multus) from the migrated VM to another VM (with multus) in the same cluster, packets are lost (with 'Destination Host Unreachable') during this period.

      Version-Release number of selected component (if applicable):
      CNV v.4.8.0
      OCP v.4.8.0-fc.5
      Kubernetes Version: v1.21.0-rc.0+88a3e8c

      How reproducible:
      Not always. I could not find a pattern that explains when it occurs.

      Steps to Reproduce:
      1. Create a dedicated namespace for the resources that will be created in the next steps. Name it "anat-test-migration" to match the namespace defined in the attached files.
      2. Create the bridge (use the attached 'migration_nncp_1.yaml' and 'migration_nncp_2.yaml' files; make sure to change the node selector to match your cluster nodes).
      3. Create the NAD (use the attached 'migration_nad_new.yaml' file).
      4. Create vma and vmb (use the attached 'migration_vma_new.yaml' and 'migration_vmb_new.yaml' files).
      5. Start both VMs:
      $ virtctl start vma
      $ virtctl start vmb
      6. Expose services to allow SSH connections to both VMs (use the attached 'migration_ssh_service_for_vma.yaml' and 'migration_ssh_service_for_vmb.yaml' files).
      7. Migrate vmb (use the attached 'migration_virtualmachineinstancemigration.yaml' file).
      8. Connect to vmb as soon as the migration finishes. To find the exact moment, watch for the VMI being assigned a new IP address:
      $ oc get vmi -w
      9. Ping from vmb to vma over the secondary interface (bridge):

      • SSH into vmb (the IP is that of the node on which vmb is running; '-p' is the port of vmb's service, which can be found with 'oc get service'):
        $ ssh fedora@192.168.2.83 -p 30401
      • Ping vma:
        $ ping 10.200.0.1
      • To reproduce, perform steps 8 and 9 as close to the end of the migration as possible.
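
Because the timing of steps 8 and 9 matters, it may help to script the wait instead of watching 'oc get vmi -w' by hand. The following is only a sketch, not part of the original reproducer: it polls the status.phase field of the VirtualMachineInstanceMigration object until it reads Succeeded. The function names, the timeout defaults, and the migration name passed in are all assumptions; the real name is whatever the attached 'migration_virtualmachineinstancemigration.yaml' defines.

```shell
#!/usr/bin/env bash
# Sketch: wait for a KubeVirt migration object to finish.
# Function names, defaults, and arguments are illustrative assumptions.

# Print the current phase of the migration object (empty if not set yet).
get_phase() {
  oc get virtualmachineinstancemigration "$1" -n "$2" \
    -o jsonpath='{.status.phase}'
}

# Poll get_phase every INTERVAL seconds for at most TIMEOUT seconds.
# Returns 0 when the phase is Succeeded, 1 on Failed or on timeout.
wait_for_migration() {
  local name=$1 ns=$2 timeout=${3:-300} interval=${4:-1}
  local waited=0 phase
  while [ "$waited" -lt "$timeout" ]; do
    phase=$(get_phase "$name" "$ns" 2>/dev/null || true)
    case "$phase" in
      Succeeded) return 0 ;;
      Failed)    return 1 ;;
    esac
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 1
}
```

A possible use, with <migration-name> standing in for the name from the attached file:
$ wait_for_migration <migration-name> anat-test-migration && ssh fedora@192.168.2.83 -p 30401 'ping -c 60 10.200.0.1'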

      Actual results:
      When the bug occurs:
      [fedora@vmb ~]$ ping 10.200.0.1
      PING 10.200.0.1 (10.200.0.1) 56(84) bytes of data.
      From 10.200.0.22 icmp_seq=10 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=11 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=12 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=13 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=14 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=15 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=16 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=17 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=18 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=19 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=20 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=21 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=22 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=23 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=24 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=25 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=26 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=27 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=28 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=29 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=30 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=31 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=32 Destination Host Unreachable
      From 10.200.0.22 icmp_seq=33 Destination Host Unreachable
      64 bytes from 10.200.0.1: icmp_seq=35 ttl=64 time=3.93 ms
      64 bytes from 10.200.0.1: icmp_seq=34 ttl=64 time=1028 ms
      64 bytes from 10.200.0.1: icmp_seq=36 ttl=64 time=1.36 ms
      64 bytes from 10.200.0.1: icmp_seq=37 ttl=64 time=0.962 ms
      64 bytes from 10.200.0.1: icmp_seq=38 ttl=64 time=1.30 ms
      ^C
      --- 10.200.0.1 ping statistics ---
      38 packets transmitted, 5 received, +24 errors, 86.8421% packet loss, time 37808ms
      rtt min/avg/max/mdev = 0.962/207.169/1028.296/410.564 ms, pipe 4
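
To compare runs, the ping output can be reduced to a few numbers. The sketch below is an illustration rather than part of the original report: it counts 'Destination Host Unreachable' lines and replies, and reports the lowest icmp_seq that got a reply. Since ping sends one packet per second by default, that sequence number roughly approximates how many seconds passed before connectivity returned. The function name summarize_ping is an assumption.

```shell
#!/usr/bin/env bash
# Sketch: summarize saved ping output read from stdin or from file arguments.
# The function name is an illustrative assumption.
summarize_ping() {
  awk '
    /Destination Host Unreachable/ { unreachable++ }
    /bytes from/ {
      replies++
      # Pull the number out of "icmp_seq=N" ("icmp_seq=" is 9 characters).
      if (match($0, /icmp_seq=[0-9]+/)) {
        seq = substr($0, RSTART + 9, RLENGTH - 9) + 0
        # Replies can arrive out of order, so keep the lowest sequence seen.
        if (!seen || seq < first) { first = seq; seen = 1 }
      }
    }
    END {
      printf "unreachable=%d replies=%d first_reply_seq=%s\n",
             unreachable, replies, (seen ? first : "none")
    }
  ' "$@"
}
```

For example, with the output above saved to a file (the file name is hypothetical):
$ summarize_ping ping_vmb.log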

      Expected results:
      No packet loss.

      Additional info:
      A tcpdump capture of the secondary interface of the migrated VM (vmb) is included. Steps to produce it:
      1. SSH into vmb:
      $ ssh fedora@192.168.2.83 -p 30401
      2. Run tcpdump:
      $ sudo tcpdump -i eth1 -xx >~/tcpdump_log.log
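
One thing that may be worth checking in the capture is whether ARP requests go unanswered. 'Destination Host Unreachable' reported from the sender's own address usually means address resolution failed, and this assumes 10.200.0.22 is vmb's address on the secondary interface, which the report does not state explicitly. The sketch below counts ARP requests and replies in tcpdump's text output. The function name count_arp is an assumption, and the patterns rely on tcpdump's usual 'ARP, Request who-has ...' and 'ARP, Reply ...' summary lines.

```shell
#!/usr/bin/env bash
# Sketch: count ARP requests and replies in tcpdump text output.
# Reads stdin, or the files given as arguments. The name is illustrative.
count_arp() {
  awk '
    /ARP, Request who-has/ { req++ }
    /ARP, Reply/           { rep++ }
    END { printf "arp_requests=%d arp_replies=%d\n", req, rep }
  ' "$@"
}
```

For example:
$ count_arp ~/tcpdump_log.log
Many requests with few or no replies during the outage window would point toward address resolution rather than the guest itself, though this alone does not identify the cause.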

              jira-bugzilla-migration RH Bugzilla Integration
              rh-ee-awax Anat Wax
              Meni Yakove Meni Yakove