-
Bug
-
Resolution: Done-Errata
-
Critical
-
rhos-17.1.5
-
5
-
False
-
-
False
-
No Docs Impact
-
openstack-neutron-18.6.1-17.1.20250307151002.85ff760.el9ost
-
None
-
-
-
Approved
-
Neutron Sprint 10
-
1
-
Important
When BGP is configured with `expose_tenant_networks` enabled, the routes to the tenant network IPs of VM instances are exposed via BGP from the controller/networker node where the router's external gateway port is scheduled.
The test test_dvr_vip_failover fails with the new compose RHOS-17.1-RHEL-9-20250325.n.1 but passes with previous composes. The failure occurs at the end of the test (see test logs [1] and test code [2]), when it checks how traffic is routed towards a VIP attached to a VM instance without a FIP (all IPs involved are tenant IPs).
This compose uses neutron rpms from [3].
When the issue is reproduced, the ingress packets reach the compute node through the geneve tunnel, but the egress packets are routed out through the external NIC enp2s0 instead of the geneve tunnel (the VIP is 192.168.0.118 in this test):
[root@cmp-2-0 ~]# tcpdump -vnne -i any tcp port 22 and host 4.1.1.1 and host 192.168.0.118
tcpdump: data link type LINUX_SLL2
dropped privs to tcpdump
tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
15:56:03.033180 genev_sys_6081 P ifindex 10 fa:16:3e:b4:ff:1f ethertype IPv4 (0x0800), length 80: (tos 0x48, ttl 59, id 52140, offset 0, flags [DF], proto TCP (6), length 60)
    4.1.1.1.57278 > 192.168.0.118.22: Flags [S], cksum 0xec9e (correct), seq 2631689354, win 64240, options [mss 1460,sackOK,TS val 1635080095 ecr 0,nop,wscale 7], length 0
15:56:03.033393 tap65495fec-5e Out ifindex 16 fa:16:3e:b4:ff:1f ethertype IPv4 (0x0800), length 80: (tos 0x48, ttl 59, id 52140, offset 0, flags [DF], proto TCP (6), length 60)
    4.1.1.1.57278 > 192.168.0.118.22: Flags [S], cksum 0xec9e (correct), seq 2631689354, win 64240, options [mss 1460,sackOK,TS val 1635080095 ecr 0,nop,wscale 7], length 0
15:56:03.033525 tap65495fec-5e P ifindex 16 fa:16:3e:c0:5b:0a ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.0.118.22 > 4.1.1.1.57278: Flags [S.], cksum 0xc64e (incorrect -> 0xb447), seq 2506190744, ack 2631689355, win 27800, options [mss 1402,sackOK,TS val 515744545 ecr 1635080095,nop,wscale 7], length 0
15:56:03.033785 br-ex In ifindex 7 fa:16:3e:ad:c0:a7 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.0.118.22 > 4.1.1.1.57278: Flags [S.], cksum 0xb447 (correct), seq 2506190744, ack 2631689355, win 27800, options [mss 1402,sackOK,TS val 515744545 ecr 1635080095,nop,wscale 7], length 0
15:56:03.033805 enp2s0 Out ifindex 3 52:54:00:4a:71:85 ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 62, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.0.118.22 > 4.1.1.1.57278: Flags [S.], cksum 0xb447 (correct), seq 2506190744, ack 2631689355, win 27800, options [mss 1402,sackOK,TS val 515744545 ecr 1635080095,nop,wscale 7], length 0
The expected behavior is that the egress traffic is routed through the geneve tunnel as well (VIP is 192.168.0.38 in this case):
[root@cmp-3-0 ~]# tcpdump -vnne -i any tcp port 22 and host 4.1.1.1 and host 192.168.0.38
tcpdump: data link type LINUX_SLL2
dropped privs to tcpdump
tcpdump: listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
16:19:39.586687 genev_sys_6081 P ifindex 10 fa:16:3e:94:3e:3a ethertype IPv4 (0x0800), length 80: (tos 0x48, ttl 59, id 49298, offset 0, flags [DF], proto TCP (6), length 60)
    4.1.1.1.44992 > 192.168.0.38.22: Flags [S], cksum 0x291e (correct), seq 2989048609, win 64240, options [mss 1460,sackOK,TS val 1602902389 ecr 0,nop,wscale 7], length 0
16:19:39.586948 tap22d74330-65 Out ifindex 15 fa:16:3e:94:3e:3a ethertype IPv4 (0x0800), length 80: (tos 0x48, ttl 59, id 49298, offset 0, flags [DF], proto TCP (6), length 60)
    4.1.1.1.44992 > 192.168.0.38.22: Flags [S], cksum 0x291e (correct), seq 2989048609, win 64240, options [mss 1460,sackOK,TS val 1602902389 ecr 0,nop,wscale 7], length 0
16:19:39.587111 tap22d74330-65 P ifindex 15 fa:16:3e:a6:d7:ce ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.0.38.22 > 4.1.1.1.44992: Flags [S.], cksum 0xc5fe (incorrect -> 0x29d5), seq 406169368, ack 2989048610, win 27800, options [mss 1402,sackOK,TS val 2039436525 ecr 1602902389,nop,wscale 7], length 0
16:19:39.587326 genev_sys_6081 Out ifindex 10 fa:16:3e:23:c6:dc ethertype IPv4 (0x0800), length 80: (tos 0x0, ttl 63, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    192.168.0.38.22 > 4.1.1.1.44992: Flags [S.], cksum 0x29d5 (correct), seq 406169368, ack 2989048610, win 27800, options [mss 1402,sackOK,TS val 2039436525 ecr 1602902389,nop,wscale 7], length 0
It seems the following patch [4] is needed on 17.1 as well.
We have tested it, and the test passes with the patch applied.
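The comparison between the two captures above can be automated. The following is a minimal sketch, not part of the test suite: it assumes `tcpdump -e -i any` output in the LINUX_SLL2 format shown in this report, and the names `egress_interfaces` and `reply_uses_tunnel` are hypothetical helpers introduced only for illustration. It lists the interfaces on which packets sourced from the VIP leave the node ("Out") and checks that all of them are the geneve device.

```python
import re

# One capture record: timestamp, interface, direction flag, "ifindex N",
# then (possibly on the next line) "src_ip.src_port > dst_ip.dst_port:".
# Assumption: IPv4 only, LINUX_SLL2 layout as printed by `tcpdump -e -i any`.
RECORD = re.compile(
    r"\d{2}:\d{2}:\d{2}\.\d+\s+(?P<iface>\S+)\s+(?P<dir>In|Out|P|B|M)\s+ifindex\s+\d+"
    r".*?(?P<src>\d+\.\d+\.\d+\.\d+)\.\d+\s+>\s+(?P<dst>\d+\.\d+\.\d+\.\d+)\.\d+:",
    re.DOTALL,
)


def egress_interfaces(capture: str, vip: str) -> list[str]:
    """Return the interfaces on which packets sourced from `vip` are sent out."""
    return [
        m["iface"]
        for m in RECORD.finditer(capture)
        if m["dir"] == "Out" and m["src"] == vip
    ]


def reply_uses_tunnel(capture: str, vip: str, tunnel: str = "genev_sys_6081") -> bool:
    """True only if replies from `vip` were seen and all left via `tunnel`."""
    out = egress_interfaces(capture, vip)
    return bool(out) and all(iface == tunnel for iface in out)
```

Fed the first capture with VIP 192.168.0.118, `reply_uses_tunnel` would return False because the only "Out" record for the reply is on enp2s0; fed the second capture with VIP 192.168.0.38, it would return True.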
- links to
-
RHBA-2025:3478 Red Hat OpenStack Platform 17.1.5 bug fix and enhancement advisory