Bug
Resolution: Done
Normal
Quality / Stability / Reliability
CLOSED
Important
Created attachment 1795082 [details]
migration_vmb_new.yaml
Description of problem:
A migrated VM takes a long time (between 10 and 60 seconds) to regain connectivity.
When pinging over a secondary (Multus) interface from the migrated VM to another VM in the same cluster, packets are lost (with 'Destination Host Unreachable') during this period.
Version-Release number of selected component (if applicable):
CNV v.4.8.0
OCP v.4.8.0-fc.5
Kubernetes Version: v1.21.0-rc.0+88a3e8c
How reproducible:
Not always; I could not find a pattern that determines when it occurs.
Steps to Reproduce:
1. Create a dedicated namespace for the resources that will be created in the next steps. Name it "anat-test-migration" to match the namespace used in the attached files.
2. Create the bridge (use the attached 'migration_nncp_1.yaml' and 'migration_nncp_2.yaml' files; make sure to change the node selector to match your cluster nodes).
3. Create the NetworkAttachmentDefinition (use the attached 'migration_nad_new.yaml' file).
4. Create vma and vmb (use the attached 'migration_vma_new.yaml' and 'migration_vmb_new.yaml' files).
5. Start both VMs:
$ virtctl start vma
$ virtctl start vmb
6. Expose services to allow SSH connections to both VMs (use the attached 'migration_ssh_service_for_vma.yaml' and 'migration_ssh_service_for_vmb.yaml' files).
7. Migrate vmb (use the attached 'migration_virtualmachineinstancemigration.yaml' file).
8. Connect to vmb as soon as the migration finishes. To find the exact moment, watch for the VMI being assigned a new IP address using the command:
$ oc get vmi -w
9. Ping from vmb to vma over the secondary (bridge) interface:
- SSH into vmb (the IP is that of the node on which vmb is running; '-p' is the port of vmb's service, which can be found with 'oc get service'):
$ ssh fedora@192.168.2.83 -p 30401
- ping vma:
$ ping 10.200.0.1
Note: to reproduce, steps 8 and 9 should be performed as close to the end of the migration as possible.
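The steps above can be sketched as a shell script. This is only a sketch: the YAML filenames are the attachments from this report, the namespace is the one named in step 1, and with DRY_RUN=1 (the default) the commands are printed rather than run against a cluster:

```shell
#!/bin/sh
# Reproduction sketch. With DRY_RUN=1 (the default) commands are only
# printed; set DRY_RUN=0 to actually run them against a cluster.
NS=anat-test-migration

run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; fi
}

repro() {
  run oc create namespace "$NS"                                # step 1
  run oc apply -f migration_nncp_1.yaml                        # step 2: bridge
  run oc apply -f migration_nncp_2.yaml
  run oc apply -n "$NS" -f migration_nad_new.yaml              # step 3: NAD
  run oc apply -n "$NS" -f migration_vma_new.yaml              # step 4: VMs
  run oc apply -n "$NS" -f migration_vmb_new.yaml
  run virtctl start vma -n "$NS"                               # step 5
  run virtctl start vmb -n "$NS"
  run oc apply -n "$NS" -f migration_ssh_service_for_vma.yaml  # step 6
  run oc apply -n "$NS" -f migration_ssh_service_for_vmb.yaml
  run oc apply -n "$NS" -f migration_virtualmachineinstancemigration.yaml  # step 7
}

repro
```

Steps 8 and 9 are interactive (SSH into the guest) and are left out of the script.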
Actual results:
When the bug occurs:
[fedora@vmb ~]$ ping 10.200.0.1
PING 10.200.0.1 (10.200.0.1) 56(84) bytes of data.
From 10.200.0.22 icmp_seq=10 Destination Host Unreachable
From 10.200.0.22 icmp_seq=11 Destination Host Unreachable
From 10.200.0.22 icmp_seq=12 Destination Host Unreachable
From 10.200.0.22 icmp_seq=13 Destination Host Unreachable
From 10.200.0.22 icmp_seq=14 Destination Host Unreachable
From 10.200.0.22 icmp_seq=15 Destination Host Unreachable
From 10.200.0.22 icmp_seq=16 Destination Host Unreachable
From 10.200.0.22 icmp_seq=17 Destination Host Unreachable
From 10.200.0.22 icmp_seq=18 Destination Host Unreachable
From 10.200.0.22 icmp_seq=19 Destination Host Unreachable
From 10.200.0.22 icmp_seq=20 Destination Host Unreachable
From 10.200.0.22 icmp_seq=21 Destination Host Unreachable
From 10.200.0.22 icmp_seq=22 Destination Host Unreachable
From 10.200.0.22 icmp_seq=23 Destination Host Unreachable
From 10.200.0.22 icmp_seq=24 Destination Host Unreachable
From 10.200.0.22 icmp_seq=25 Destination Host Unreachable
From 10.200.0.22 icmp_seq=26 Destination Host Unreachable
From 10.200.0.22 icmp_seq=27 Destination Host Unreachable
From 10.200.0.22 icmp_seq=28 Destination Host Unreachable
From 10.200.0.22 icmp_seq=29 Destination Host Unreachable
From 10.200.0.22 icmp_seq=30 Destination Host Unreachable
From 10.200.0.22 icmp_seq=31 Destination Host Unreachable
From 10.200.0.22 icmp_seq=32 Destination Host Unreachable
From 10.200.0.22 icmp_seq=33 Destination Host Unreachable
64 bytes from 10.200.0.1: icmp_seq=35 ttl=64 time=3.93 ms
64 bytes from 10.200.0.1: icmp_seq=34 ttl=64 time=1028 ms
64 bytes from 10.200.0.1: icmp_seq=36 ttl=64 time=1.36 ms
64 bytes from 10.200.0.1: icmp_seq=37 ttl=64 time=0.962 ms
64 bytes from 10.200.0.1: icmp_seq=38 ttl=64 time=1.30 ms
^C
--- 10.200.0.1 ping statistics ---
38 packets transmitted, 5 received, +24 errors, 86.8421% packet loss, time 37808ms
rtt min/avg/max/mdev = 0.962/207.169/1028.296/410.564 ms, pipe 4
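Since ping sends one request per second by default, the number of 'Destination Host Unreachable' lines approximates the outage in seconds. A small awk sketch that summarizes a saved ping log, run here against a three-line sample modeled on the output above (sample data, not the actual capture):

```shell
# Build a tiny sample log (fabricated for illustration, same format as above).
cat > /tmp/ping_sample.log <<'EOF'
From 10.200.0.22 icmp_seq=10 Destination Host Unreachable
From 10.200.0.22 icmp_seq=11 Destination Host Unreachable
64 bytes from 10.200.0.1: icmp_seq=12 ttl=64 time=3.93 ms
EOF

# Count lost vs. answered probes; at ping's default 1 packet/s the lost
# count approximates the outage duration in seconds.
awk '/Destination Host Unreachable/ {lost++}
     /bytes from/                   {ok++}
     END {printf "lost=%d ok=%d outage~%ds\n", lost, ok, lost}' /tmp/ping_sample.log
```

For the sample above this prints `lost=2 ok=1 outage~2s`.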
Expected results:
No packet loss.
Additional info:
A tcpdump capture of the secondary interface of the migrated VM (vmb) is attached; steps used to produce it:
1. ssh to vmb:
$ ssh fedora@192.168.2.83 -p 30401
2. run tcpdump:
$ sudo tcpdump -i eth1 -xx >~/tcpdump_log.log
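'Destination Host Unreachable' reported from the VM's own address usually means ARP resolution for the peer failed, so the ARP traffic is the interesting part of such a capture. One way to check the saved capture text for unanswered ARP requests, shown here on a fabricated sample in tcpdump's default text format (not the real attached log):

```shell
# Fabricated sample lines in tcpdump's default text format (not the real log).
cat > /tmp/tcpdump_sample.log <<'EOF'
12:00:01.000000 ARP, Request who-has 10.200.0.1 tell 10.200.0.22, length 28
12:00:02.000000 ARP, Request who-has 10.200.0.1 tell 10.200.0.22, length 28
12:00:02.500000 ARP, Reply 10.200.0.1 is-at aa:bb:cc:dd:ee:ff, length 28
EOF

# A stretch of requests with no matching replies can indicate the window
# in which the peer was unreachable after the migration.
grep -c 'ARP, Request' /tmp/tcpdump_sample.log   # requests
grep -c 'ARP, Reply'   /tmp/tcpdump_sample.log   # replies
```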