-
Bug
-
Resolution: Done
When a virtual machine is live migrated, connection establishment from pods on the original node's switch to the VM takes longer than expected.
The root cause is that the neighbour caches of those pods still point to the VM's MAC address, but that MAC address is no longer reachable because it is no longer part of the node switch. Traffic only recovers after those pods invalidate their cache entries and send an ARP request, at which point the ARP proxy MAC is returned and traffic works as expected.
The solution involves sending a GARP with the ARP proxy MAC over the management port of that switch after live migration, so that the clients' neighbour caches get updated. One issue is that the LSP of the migrated VM takes some time to be removed after the live migration succeeds, so the GARP has to be sent periodically until that LSP disappears.
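As a rough illustration of the approach described above, the following Python sketch builds a gratuitous ARP frame and resends it until the stale LSP is gone. It is not the code from the PoC: the MAC and IP values, the function names (`build_garp`, `announce_until_lsp_gone`) and the `send` / `lsp_exists` callbacks are all hypothetical placeholders, and sending one last GARP after the LSP is seen to be gone is an assumption made here, not something the fix is known to do.

```python
import ipaddress
import struct
import time

ARP_REQUEST = 1
ARP_REPLY = 2
BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


def mac_to_bytes(mac: str) -> bytes:
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in mac.split(":"))


def build_garp(sender_mac: str, ip: str, op: int = ARP_REQUEST) -> bytes:
    """Build an Ethernet + ARP frame announcing `ip` at `sender_mac`.

    In a gratuitous ARP the sender IP and target IP are the same address.
    Here `sender_mac` stands for the ARP proxy MAC (illustrative value only).
    """
    src = mac_to_bytes(sender_mac)
    addr = ipaddress.IPv4Address(ip).packed
    # Ethernet header: broadcast destination, source MAC, EtherType ARP.
    eth = mac_to_bytes(BROADCAST_MAC) + src + struct.pack("!H", 0x0806)
    # ARP header: Ethernet (1), IPv4 (0x0800), hlen 6, plen 4, operation.
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, op)
    # Sender MAC/IP, then target MAC (zeroed) and target IP (same as sender).
    arp += src + addr + bytes(6) + addr
    return eth + arp


def announce_until_lsp_gone(send, lsp_exists, frame,
                            interval=1.0, max_attempts=30, sleep=time.sleep):
    """Send `frame` every `interval` seconds while the old LSP still exists.

    `send` and `lsp_exists` are caller-supplied callbacks (hypothetical).
    One final GARP is sent after the LSP is observed gone (an assumption).
    Returns the number of frames sent.
    """
    sent = 0
    for _ in range(max_attempts):
        still_there = lsp_exists()
        send(frame)
        sent += 1
        if not still_there:
            break
        sleep(interval)
    return sent
```

In a real deployment `send` would write the frame to the management port (for example through a raw packet socket), and `lsp_exists` would query the OVN northbound database; both are left abstract here. The `op` parameter is exposed because the operation type (Request or Reply) affects how OVN handles the GARP.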
We have also found that OVN answers GARPs that use the Request operation type, so they are not broadcast to the ports. We have opened a bug for that:
https://issues.redhat.com/browse/FDP-626
The PoC to test this is here:
https://github.com/ovn-org/ovn-kubernetes/pull/4365