Bug
Resolution: Unresolved
Blocker
rhel-sst-network-fastdatapath
ssg_networking
While testing ingress traffic and its replies, I see that pings succeed from only one of the two gateways. In the broken case, the replies are ECMP-routed to the wrong gateway:
gateway 1 - 02:42:ac:12:00:06, fc00:f853:ccd:e793::6
gateway 2 - 02:42:ac:12:00:07, fc00:f853:ccd:e793::7
pod IP - fd00:10:244:1::4
Topology:
gw1/gw2 ------- GR_ovn-control-plane---ovn_cluster_router-ovn-control-plane--pod
Working case (ping from gw1 -> pod):
[root@ovn-control-plane ~]# tcpdump -i eth0 host fd00:10:244:1::4 -nneev
dropped privs to tcpdump
tcpdump: listening on eth0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
13:58:19.492221 02:42:ac:12:00:06 > 02:42:ac:12:00:03, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x41606, hlim 64, next-header ICMPv6 (58) payload length: 64) fc00:f853:ccd:e793::6 > fd00:10:244:1::4: [icmp6 sum ok] ICMP6, echo request, id 17, seq 1
13:58:19.493063 02:42:ac:12:00:03 > 02:42:ac:12:00:06, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x670a8, hlim 62, next-header ICMPv6 (58) payload length: 64) fd00:10:244:1::4 > fc00:f853:ccd:e793::6: [icmp6 sum ok] ICMP6, echo reply, id 17, seq 1
13:58:20.514560 02:42:ac:12:00:06 > 02:42:ac:12:00:03, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x41606, hlim 64, next-header ICMPv6 (58) payload length: 64) fc00:f853:ccd:e793::6 > fd00:10:244:1::4: [icmp6 sum ok] ICMP6, echo request, id 17, seq 2
13:58:20.514991 02:42:ac:12:00:03 > 02:42:ac:12:00:06, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x670a8, hlim 62, next-header ICMPv6 (58) payload length: 64) fd00:10:244:1::4 > fc00:f853:ccd:e793::6: [icmp6 sum ok] ICMP6, echo reply, id 17, seq 2
Broken case (ping from gw2 -> pod):
13:58:40.500606 02:42:ac:12:00:07 > 02:42:ac:12:00:03, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x951de, hlim 64, next-header ICMPv6 (58) payload length: 64) fc00:f853:ccd:e793::7 > fd00:10:244:1::4: [icmp6 sum ok] ICMP6, echo request, id 18, seq 1
13:58:40.501430 02:42:ac:12:00:03 > 02:42:ac:12:00:06, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x8cc40, hlim 62, next-header ICMPv6 (58) payload length: 64) fd00:10:244:1::4 > fc00:f853:ccd:e793::7: [icmp6 sum ok] ICMP6, echo reply, id 18, seq 1
13:58:41.505562 02:42:ac:12:00:07 > 02:42:ac:12:00:03, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x951de, hlim 64, next-header ICMPv6 (58) payload length: 64) fc00:f853:ccd:e793::7 > fd00:10:244:1::4: [icmp6 sum ok] ICMP6, echo request, id 18, seq 2
13:58:41.506026 02:42:ac:12:00:03 > 02:42:ac:12:00:06, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x8cc40, hlim 62, next-header ICMPv6 (58) payload length: 64) fd00:10:244:1::4 > fc00:f853:ccd:e793::7: [icmp6 sum ok] ICMP6, echo reply, id 18, seq 2
^ The replies to gw2 (fc00:f853:ccd:e793::7) are sent to the wrong MAC: gw1's 02:42:ac:12:00:06 instead of gw2's 02:42:ac:12:00:07.
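The symmetry check done by eye on the captures above can be sketched as a small script. This is only an illustration, not part of OVN or ovn-kubernetes: the names `PACKET_RE` and `find_asymmetric_replies` are hypothetical, and the regular expression assumes the one-packet-per-line `tcpdump -e -v` ICMPv6 format shown in this report. It pairs each echo reply with its request by (peer IP, id, seq) and reports replies whose destination MAC differs from the source MAC of the request.

```python
import re

# Assumption: packets look like the `tcpdump -nneev` lines quoted in this report.
MAC = r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}"
PACKET_RE = re.compile(
    rf"(?P<src_mac>{MAC}) > (?P<dst_mac>{MAC}), ethertype IPv6.*?\) "
    r"(?P<src_ip>[0-9a-f:]+) > (?P<dst_ip>[0-9a-f:]+): .*?"
    r"echo (?P<kind>request|reply), id (?P<id>\d+), seq (?P<seq>\d+)"
)


def find_asymmetric_replies(capture):
    """Return (peer_ip, id, seq, expected_mac, actual_mac) for each echo
    reply that was not sent back to the MAC its request came from."""
    requests = {}  # (requester IP, id, seq) -> source MAC of the request
    mismatches = []
    for m in PACKET_RE.finditer(capture):
        echo = (m["id"], m["seq"])
        if m["kind"] == "request":
            requests[(m["src_ip"],) + echo] = m["src_mac"]
        else:
            expected = requests.get((m["dst_ip"],) + echo)
            if expected is not None and expected != m["dst_mac"]:
                mismatches.append(
                    (m["dst_ip"], int(m["id"]), int(m["seq"]), expected, m["dst_mac"])
                )
    return mismatches


# First request/reply pair of the working (gw1) and broken (gw2) cases above.
SAMPLE_CAPTURE = """\
13:58:19.492221 02:42:ac:12:00:06 > 02:42:ac:12:00:03, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x41606, hlim 64, next-header ICMPv6 (58) payload length: 64) fc00:f853:ccd:e793::6 > fd00:10:244:1::4: [icmp6 sum ok] ICMP6, echo request, id 17, seq 1
13:58:19.493063 02:42:ac:12:00:03 > 02:42:ac:12:00:06, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x670a8, hlim 62, next-header ICMPv6 (58) payload length: 64) fd00:10:244:1::4 > fc00:f853:ccd:e793::6: [icmp6 sum ok] ICMP6, echo reply, id 17, seq 1
13:58:40.500606 02:42:ac:12:00:07 > 02:42:ac:12:00:03, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x951de, hlim 64, next-header ICMPv6 (58) payload length: 64) fc00:f853:ccd:e793::7 > fd00:10:244:1::4: [icmp6 sum ok] ICMP6, echo request, id 18, seq 1
13:58:40.501430 02:42:ac:12:00:03 > 02:42:ac:12:00:06, ethertype IPv6 (0x86dd), length 118: (flowlabel 0x8cc40, hlim 62, next-header ICMPv6 (58) payload length: 64) fd00:10:244:1::4 > fc00:f853:ccd:e793::7: [icmp6 sum ok] ICMP6, echo reply, id 18, seq 1
"""

if __name__ == "__main__":
    for peer, echo_id, seq, expected, actual in find_asymmetric_replies(SAMPLE_CAPTURE):
        print(f"reply to {peer} id={echo_id} seq={seq}: expected {expected}, got {actual}")
```

On the sample lines, only the gw2 exchange (id 18) is flagged; the gw1 exchange (id 17) is symmetric and produces no entry.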
[root@ovn-control-plane ~]# ovn-nbctl lr-route-list GR_ovn-control-plane
IPv6 Routes
Route Table <main>:
    fd00:10:244:1::4   fc00:f853:ccd:e793::6 src-ip rtoe-GR_ovn-control-plane ecmp ecmp-symmetric-reply
    fd00:10:244:1::4   fc00:f853:ccd:e793::7 src-ip rtoe-GR_ovn-control-plane ecmp ecmp-symmetric-reply
    fd69::/125         fd69::4 dst-ip rtoe-GR_ovn-control-plane
    fd00:10:244::/48   fd98::1 dst-ip
    ::/0               fc00:f853:ccd:e793::1 dst-ip rtoe-GR_ovn-control-plane
Seen in latest ovn-kubernetes upstream with:
[root@ovn-control-plane ~]# rpm -qa | grep ovn
ovn-23.09.0-100.fc39.x86_64
ovn-central-23.09.0-100.fc39.x86_64
ovn-host-23.09.0-100.fc39.x86_64
ovn-vtep-23.09.0-100.fc39.x86_64
The bug may exist in previous versions.
clones: FDP-358 Auto last hop behavior not working with ipv6 (ecmp-symmetric-reply) [Verified]