-
Bug
-
Resolution: Not a Bug
-
Major
-
None
-
4.13.z, 4.14.z, 4.15
-
None
-
Moderate
-
No
-
Rejected
-
False
-
Description of problem:
OCP 4.13 introduced a MetalLB feature that allows L2 advertisements on additional interfaces. This feature appears to have several limitations, which is why I am opening this ticket. From the tests I have run, this is not necessarily a bug; if that is the case, this ticket should be moved to documentation, because we should be clearer about what exactly customers can expect from L2Advertisement on additional interfaces.

Because of how OVN handles traffic for LoadBalancer services, even with routing via host enabled, MetalLB with L2 advertisements on additional interfaces that are not part of the main node network does not seem to work with the more complex network and routing setups that can be configured at the OS level. OCP 4.14 introduced new features to handle symmetric routing, but these seem to be limited to BGP. If that is true, we should definitely add a note informing customers, because not all customers run BGP in their network, and for many it is not worth deploying when a simpler method of advertising load balancer IPs exists.

With L2 advertisements on secondary interfaces, connections seem to work only over the same VLAN or directly connected routes. Otherwise, either the infrastructure needs a lot of work to achieve proper routing to and from the service, or the setup is simply not supported on our side. Even when using EgressService on OCP 4.14 to ensure that traffic uses the respective network and routing table, a traceroute to the LoadBalancer IP gets a response from br-ex. This assumes, of course, that everything on the node is configured for the secondary interfaces, as in the example in [1].

In packet captures, I see the traffic arrive at the node and get DNATed to the ClusterIP, but it then seems to get lost on its way out, and the connection gets stuck in TCP retransmissions.

[1] https://access.redhat.com/solutions/19596
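To illustrate why directly connected clients may work while routed clients do not, here is a minimal Python sketch of a plain longest-prefix route lookup. It is only a simplified model of a single Linux routing table, not of OVN-Kubernetes or policy routing. The interface name `ens4`, all prefixes, and the helper names `egress_interface` and `is_symmetric` are hypothetical and chosen for illustration.

```python
import ipaddress

# Hypothetical main routing table of a node: (destination prefix, interface).
# All prefixes and the secondary interface name "ens4" are assumptions.
ROUTES = [
    ("0.0.0.0/0", "br-ex"),        # default route via the primary interface
    ("192.168.10.0/24", "br-ex"),  # main node network
    ("10.20.30.0/24", "ens4"),     # secondary interface, directly connected
]


def egress_interface(dst, routes=ROUTES):
    """Return the interface selected by longest-prefix match for dst."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, iface in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    if best is None:
        raise LookupError(f"no route to {dst}")
    return best[1]


def is_symmetric(client_ip, ingress_iface, routes=ROUTES):
    """True if the reply to client_ip leaves via the interface it arrived on."""
    return egress_interface(client_ip, routes) == ingress_iface


if __name__ == "__main__":
    # Client on the directly connected secondary subnet.
    print(is_symmetric("10.20.30.50", "ens4"))
    # Client behind a router, reached the node via ens4.
    print(is_symmetric("172.16.5.9", "ens4"))
```

In this model, a client in the directly connected subnet gets its reply on the same interface, while a routed client's reply falls through to the default route on `br-ex`, which matches the asymmetric behavior described above.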
Version-Release number of selected component (if applicable):
OCP 4.13+ with OVN-Kubernetes
How reproducible:
Depending on the network implementation.