Bug
Resolution: Not a Bug
When deploying ovn-kubernetes upstream, we have a traffic path where packets destined to the pod IP arrive at the node and are forwarded into OVN, where they eventually make it into the pod:
external client --> eth0 --> br-ex --> GR --> ovn_cluster_router --> pod
When sending a ping to the pod IP, the reply comes back as usual with shared gateway mode. With local gateway mode, however, the reply path is asymmetric: the reply goes out via ovn-k8s-mp0 and is then routed by the kernel networking stack. This reply path is not working correctly with local gateway mode. The packet gets dropped in OVS (port 7 is the server pod; the reply here is an ICMP reply to the external client 172.18.0.5):
recirc_id(0xd5),in_port(7),skb_mark(0),ct_state(-new+est-rel+rpl-inv+trk-dnat),ct_mark(0/0xf),eth(src=0a:58:0a:f4:01:04,dst=0a:58:0a:f4:01:01),eth_type(0x0800),ipv4(src=10.244.1.4/255.255.255.252,dst=172.18.0.5,proto=1,ttl=64,frag=no), packets:288, bytes:28224, used:0.274s, actions:ct_clear,set(eth(src=0a:58:0a:f4:01:01,dst=2e:7b:c5:1b:99:1b)),set(ipv4(ttl=63)),ct(zone=15,nat),recirc(0xd6)
recirc_id(0xd6),in_port(7),ct_state(-new-est-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no), packets:288, bytes:28224, used:0.274s, actions:drop
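The key detail in the second flow is the conntrack state after recirculation. A minimal sketch (operating on the flow text pasted above, not a live datapath) makes the invalid flag explicit:

```shell
# Extract the ct_state from the drop flow quoted above. The +inv flag means
# conntrack classified the packet as invalid after the ct(zone=15,nat) step.
flow='recirc_id(0xd6),in_port(7),ct_state(-new-est-rpl+inv+trk),ct_mark(0/0x1),eth(),eth_type(0x0800),ipv4(frag=no), actions:drop'
ct_state=$(printf '%s\n' "$flow" | grep -o 'ct_state([^)]*)')
echo "$ct_state"
case "$ct_state" in
    *+inv*) echo "packet marked invalid by conntrack" ;;
esac
```

On a live node the same check could be run against `ovs-dpctl dump-flows` output instead of a pasted string.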
For some reason the packet is marked as invalid and dropped. When I look at the OpenFlow flows, I see this flow incrementing:
cookie=0xfeefd108, duration=19019.107s, table=79, n_packets=17992, n_bytes=1762952, priority=100,ip,reg14=0x1,metadata=0x4,dl_src=02:42:ac:12:00:05,nw_src=172.18.0.5 actions=drop
This is odd because the match criterion here is nw_src=172.18.0.5, while the reply packet is destined to 172.18.0.5, not sourced from it. Also, I cannot find a flow in the OVN database with the above cookie.
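For context on the cookie lookup: the OpenFlow cookie on br-int is normally the leading 32 bits of a Logical_Flow UUID in the southbound DB, so one way to hunt for it is to grep an `ovn-sbctl lflow-list` dump for the cookie value. A hedged sketch, where `lflows` is a stand-in for real lflow-list output (the line format below is illustrative, not exact):

```shell
# Look up an OpenFlow cookie in lflow-list-style text. On a live node the
# second argument would be "$(ovn-sbctl lflow-list)".
lookup_cookie() {
    # $1 = cookie like 0xfeefd108, $2 = lflow-list text
    printf '%s\n' "$2" | grep -i "uuid=0x${1#0x}" \
        || echo "no logical flow for cookie $1"
}

# Sample listing containing a uuid we did see in the ovn-trace output below.
lflows='uuid=0x55ca4101, table=0 (ls_in_check_port_sec), priority=50, match=(1)'
lookup_cookie 0xfeefd108 "$lflows"
```

A miss here, as in the report, suggests the flow was not installed from a current logical flow.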
Attempting TCP instead of ICMP makes no difference. I did an ovn-trace for a reply, and it thinks the packet should be sent to ovn-k8s-mp0 as expected:
[root@ovn-worker ~]# ovn-trace --ct trk,rpl --ct trk,rpl ovn-worker 'inport == "default_client" && eth.src == 0a:58:0a:f4:01:04 && eth.dst== 0a:58:0a:f4:01:01 && ip4 && ip.ttl==64 && ip4.src==10.244.1.4 && ip4.dst==172.18.0.5 && tcp && tcp.src == 5555 && tcp.dst ==31512'
# tcp,reg14=0x4,vlan_tci=0x0000,dl_src=0a:58:0a:f4:01:04,dl_dst=0a:58:0a:f4:01:01,nw_src=10.244.1.4,nw_dst=172.18.0.5,nw_tos=0,nw_ecn=0,nw_ttl=64,nw_frag=no,tp_src=5555,tp_dst=31512,tcp_flags=0

ingress(dp="ovn-worker", inport="default_client")
-------------------------------------------------
 0. ls_in_check_port_sec (northd.c:8691): 1, priority 50, uuid 55ca4101
    reg0[15] = check_in_port_sec(); next;
 4. ls_in_pre_acl (northd.c:5997): ip, priority 100, uuid 33376269
    reg0[0] = 1; next;
 5. ls_in_pre_lb (northd.c:6203): ip, priority 100, uuid dd2f4859
    reg0[2] = 1; next;
 6. ls_in_pre_stateful (northd.c:6231): reg0[2] == 1, priority 110, uuid 874ed2f7
    ct_lb_mark;

ct_lb_mark
----------
 7. ls_in_acl_hint (northd.c:6343): !ct.est, priority 3, uuid cd127c76
    reg0[9] = 1; next;
 8. ls_in_acl_eval (northd.c:6897): ip && !ct.est, priority 1, uuid be38d6a3
    reg0[1] = 1; next;
 9. ls_in_acl_action (northd.c:6725): reg8[30..31] == 0, priority 500, uuid bfa67f2b
    reg8[30..31] = 1; next(8);
 8. ls_in_acl_eval (northd.c:6897): ip && !ct.est, priority 1, uuid be38d6a3
    reg0[1] = 1; next;
 9. ls_in_acl_action (northd.c:6725): reg8[30..31] == 1, priority 500, uuid 0c7a9548
    reg8[30..31] = 2; next(8);
 8. ls_in_acl_eval (northd.c:6897): ip && !ct.est, priority 1, uuid be38d6a3
    reg0[1] = 1; next;
 9. ls_in_acl_action (northd.c:6714): 1, priority 0, uuid 44d6fced
    reg8[16] = 0; reg8[17] = 0; reg8[18] = 0; reg8[30..31] = 0; next;
15. ls_in_pre_hairpin (northd.c:7778): ip && ct.trk, priority 100, uuid 828bbde6
    reg0[6] = chk_lb_hairpin(); reg0[12] = chk_lb_hairpin_reply(); next;
19. ls_in_acl_after_lb_action (northd.c:6725): reg8[30..31] == 0, priority 500, uuid b66f9971
    reg8[30..31] = 1; next(18);
19. ls_in_acl_after_lb_action (northd.c:6725): reg8[30..31] == 1, priority 500, uuid f9d4de79
    reg8[30..31] = 2; next(18);
19. ls_in_acl_after_lb_action (northd.c:6714): 1, priority 0, uuid fa194d0b
    reg8[16] = 0; reg8[17] = 0; reg8[18] = 0; reg8[30..31] = 0; next;
20. ls_in_stateful (northd.c:7744): reg0[1] == 1 && reg0[13] == 0, priority 100, uuid 6530cf00
    ct_commit { ct_mark.blocked = 0; }; next;
27. ls_in_l2_lkup (northd.c:9564): eth.dst == { 0a:58:a9:fe:01:01, 0a:58:0a:f4:01:01 }, priority 50, uuid 3f1f3474
    outport = "stor-ovn-worker"; output;

egress(dp="ovn-worker", inport="default_client", outport="stor-ovn-worker")
---------------------------------------------------------------------------
 0. ls_out_pre_acl (northd.c:5856): ip && outport == "stor-ovn-worker", priority 110, uuid dc513d3d
    next;
 1. ls_out_pre_lb (northd.c:5856): ip && outport == "stor-ovn-worker", priority 110, uuid 3f6f9b01
    next;
 3. ls_out_acl_hint (northd.c:6343): !ct.est, priority 3, uuid d9ba06bf
    reg0[9] = 1; next;
 4. ls_out_acl_eval (northd.c:6899): ip && !ct.est, priority 1, uuid 786efa19
    reg0[1] = 1; next;
 5. ls_out_acl_action (northd.c:6725): reg8[30..31] == 0, priority 500, uuid 1775d2c5
    reg8[30..31] = 1; next(4);
 4. ls_out_acl_eval (northd.c:6899): ip && !ct.est, priority 1, uuid 786efa19
    reg0[1] = 1; next;
 5. ls_out_acl_action (northd.c:6725): reg8[30..31] == 1, priority 500, uuid babb34d1
    reg8[30..31] = 2; next(4);
 4. ls_out_acl_eval (northd.c:6899): ip && !ct.est, priority 1, uuid 786efa19
    reg0[1] = 1; next;
 5. ls_out_acl_action (northd.c:6714): 1, priority 0, uuid dc22dbff
    reg8[16] = 0; reg8[17] = 0; reg8[18] = 0; reg8[30..31] = 0; next;
 8. ls_out_stateful (northd.c:7749): reg0[1] == 1 && reg0[13] == 0, priority 100, uuid 2a0804c1
    ct_commit { ct_mark.blocked = 0; }; next;
 9. ls_out_check_port_sec (northd.c:5817): 1, priority 0, uuid 953bd0cd
    reg0[15] = check_out_port_sec(); next;
10. ls_out_apply_port_sec (northd.c:5824): 1, priority 0, uuid 650d998f
    output;
    /* output to "stor-ovn-worker", type "patch" */

ingress(dp="ovn_cluster_router", inport="rtos-ovn-worker")
----------------------------------------------------------
 0. lr_in_admission (northd.c:12103): eth.dst == { 0a:58:a9:fe:01:01, 0a:58:0a:f4:01:01 } && inport == "rtos-ovn-worker" && is_chassis_resident("cr-rtos-ovn-worker"), priority 50, uuid 6f057d03
    xreg0[0..47] = 0a:58:0a:f4:01:01; next;
 1. lr_in_lookup_neighbor (northd.c:12291): 1, priority 0, uuid af244947
    reg9[2] = 1; next;
 2. lr_in_learn_neighbor (northd.c:12301): reg9[2] == 1 || reg9[3] == 0, priority 100, uuid f44f3d31
    mac_cache_use; next;
12. lr_in_ip_routing_pre (northd.c:12544): 1, priority 0, uuid f82f476a
    reg7 = 0; next;
13. lr_in_ip_routing (northd.c:10811): ip4.src == 10.244.1.0/24, priority 72, uuid 394c6edc
    ip.ttl--; reg8[0..15] = 0; reg0 = 10.244.1.2; reg1 = 10.244.1.1; eth.src = 0a:58:0a:f4:01:01; outport = "rtos-ovn-worker"; flags.loopback = 1; next;
14. lr_in_ip_routing_ecmp (northd.c:12602): reg8[0..15] == 0, priority 150, uuid c14c9ac9
    next;
15. lr_in_policy (northd.c:12779): 1, priority 0, uuid 5607adfa
    reg8[0..15] = 0; next;
16. lr_in_policy_ecmp (northd.c:12782): reg8[0..15] == 0, priority 150, uuid 727b506a
    next;
17. lr_in_arp_resolve (northd.c:12968): outport == "rtos-ovn-worker" && reg0 == 10.244.1.2, priority 100, uuid 89ec408c
    eth.dst = 2e:7b:c5:1b:99:1b; next;
20. lr_in_gw_redirect (northd.c:13301): outport == "rtos-ovn-worker", priority 50, uuid 41c57670
    outport = "cr-rtos-ovn-worker"; next;
21. lr_in_arp_request (northd.c:13457): 1, priority 0, uuid de051889
    output;
    /* Replacing type "chassisredirect" outport "cr-rtos-ovn-worker" with distributed port "rtos-ovn-worker". */

egress(dp="ovn_cluster_router", inport="rtos-ovn-worker", outport="rtos-ovn-worker")
------------------------------------------------------------------------------------
 0. lr_out_chk_dnat_local (northd.c:14922): 1, priority 0, uuid ed20416d
    reg9[4] = 0; next;
 6. lr_out_delivery (northd.c:13506): outport == "rtos-ovn-worker", priority 100, uuid 2c22ef43
    output;
    /* output to "rtos-ovn-worker", type "patch" */

ingress(dp="ovn-worker", inport="stor-ovn-worker")
--------------------------------------------------
 0. ls_in_check_port_sec (northd.c:8691): 1, priority 50, uuid 55ca4101
    reg0[15] = check_in_port_sec(); next;
 4. ls_in_pre_acl (northd.c:5853): ip && inport == "stor-ovn-worker", priority 110, uuid 0d2022d5
    next;
 5. ls_in_pre_lb (northd.c:5853): ip && inport == "stor-ovn-worker", priority 110, uuid f5b2ab6f
    next;
 7. ls_in_acl_hint (northd.c:6343): !ct.est, priority 3, uuid cd127c76
    reg0[9] = 1; next;
 8. ls_in_acl_eval (northd.c:6897): ip && !ct.est, priority 1, uuid be38d6a3
    reg0[1] = 1; next;
 9. ls_in_acl_action (northd.c:6725): reg8[30..31] == 0, priority 500, uuid bfa67f2b
    reg8[30..31] = 1; next(8);
 8. ls_in_acl_eval (northd.c:6897): ip && !ct.est, priority 1, uuid be38d6a3
    reg0[1] = 1; next;
 9. ls_in_acl_action (northd.c:6725): reg8[30..31] == 1, priority 500, uuid 0c7a9548
    reg8[30..31] = 2; next(8);
 8. ls_in_acl_eval (northd.c:6897): ip && !ct.est, priority 1, uuid be38d6a3
    reg0[1] = 1; next;
 9. ls_in_acl_action (northd.c:6714): 1, priority 0, uuid 44d6fced
    reg8[16] = 0; reg8[17] = 0; reg8[18] = 0; reg8[30..31] = 0; next;
15. ls_in_pre_hairpin (northd.c:7778): ip && ct.trk, priority 100, uuid 828bbde6
    reg0[6] = chk_lb_hairpin(); reg0[12] = chk_lb_hairpin_reply(); next;
19. ls_in_acl_after_lb_action (northd.c:6725): reg8[30..31] == 0, priority 500, uuid b66f9971
    reg8[30..31] = 1; next(18);
19. ls_in_acl_after_lb_action (northd.c:6725): reg8[30..31] == 1, priority 500, uuid f9d4de79
    reg8[30..31] = 2; next(18);
19. ls_in_acl_after_lb_action (northd.c:6714): 1, priority 0, uuid fa194d0b
    reg8[16] = 0; reg8[17] = 0; reg8[18] = 0; reg8[30..31] = 0; next;
20. ls_in_stateful (northd.c:7744): reg0[1] == 1 && reg0[13] == 0, priority 100, uuid 6530cf00
    ct_commit { ct_mark.blocked = 0; }; next;
27. ls_in_l2_lkup (northd.c:9487): eth.dst == 2e:7b:c5:1b:99:1b, priority 50, uuid e6db67e6
    outport = "k8s-ovn-worker"; output;

egress(dp="ovn-worker", inport="stor-ovn-worker", outport="k8s-ovn-worker")
---------------------------------------------------------------------------
 0. ls_out_pre_acl (northd.c:6000): ip, priority 100, uuid 7902348e
    reg0[0] = 1; next;
 1. ls_out_pre_lb (northd.c:6206): ip, priority 100, uuid 96562b29
    reg0[2] = 1; next;
 2. ls_out_pre_stateful (northd.c:6235): reg0[2] == 1, priority 110, uuid a9050ff5
    ct_lb_mark;

ct_lb_mark
----------
 3. ls_out_acl_hint (northd.c:6343): !ct.est, priority 3, uuid d9ba06bf
    reg0[9] = 1; next;
 4. ls_out_acl_eval (northd.c:6899): ip && !ct.est, priority 1, uuid 786efa19
    reg0[1] = 1; next;
 5. ls_out_acl_action (northd.c:6725): reg8[30..31] == 0, priority 500, uuid 1775d2c5
    reg8[30..31] = 1; next(4);
 4. ls_out_acl_eval (northd.c:6899): ip && !ct.est, priority 1, uuid 786efa19
    reg0[1] = 1; next;
 5. ls_out_acl_action (northd.c:6725): reg8[30..31] == 1, priority 500, uuid babb34d1
    reg8[30..31] = 2; next(4);
 4. ls_out_acl_eval (northd.c:6899): ip && !ct.est, priority 1, uuid 786efa19
    reg0[1] = 1; next;
 5. ls_out_acl_action (northd.c:6714): 1, priority 0, uuid dc22dbff
    reg8[16] = 0; reg8[17] = 0; reg8[18] = 0; reg8[30..31] = 0; next;
 8. ls_out_stateful (northd.c:7749): reg0[1] == 1 && reg0[13] == 0, priority 100, uuid 2a0804c1
    ct_commit { ct_mark.blocked = 0; }; next;
 9. ls_out_check_port_sec (northd.c:5817): 1, priority 0, uuid 953bd0cd
    reg0[15] = check_out_port_sec(); next;
10. ls_out_apply_port_sec (northd.c:5824): 1, priority 0, uuid 650d998f
    output;
    /* output to "k8s-ovn-worker", type "" */
An ofproto/trace shows the same final output to k8s-ovn-worker. Conntrack:
root@ovn-worker:/# conntrack -L | grep icmp
icmp 1 29 src=172.18.0.5 dst=10.244.1.4 type=8 code=0 id=8 src=10.244.1.4 dst=172.18.0.5 type=0 code=0 id=8 mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=20 use=1
icmp 1 29 src=172.18.0.5 dst=10.244.1.4 type=8 code=0 id=8 [UNREPLIED] src=10.244.1.4 dst=172.18.0.5 type=0 code=0 id=8 mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=2 use=1
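A quick way to spot which zone never saw the reply is to filter the dump for [UNREPLIED] entries and print their zones. This sketch runs on the (abbreviated) conntrack lines pasted above; on a live node the input would come from `conntrack -L` directly:

```shell
# Two abbreviated conntrack entries from above: the zone 20 entry is fully
# established, the zone 2 entry is still [UNREPLIED].
ct_dump='icmp 1 29 src=172.18.0.5 dst=10.244.1.4 type=8 code=0 id=8 src=10.244.1.4 dst=172.18.0.5 type=0 code=0 id=8 mark=0 zone=20 use=1
icmp 1 29 src=172.18.0.5 dst=10.244.1.4 type=8 code=0 id=8 [UNREPLIED] src=10.244.1.4 dst=172.18.0.5 type=0 code=0 id=8 mark=0 zone=2 use=1'
unreplied_zones=$(printf '%s\n' "$ct_dump" | grep '\[UNREPLIED\]' | grep -o 'zone=[0-9]*')
echo "$unreplied_zones"
```

Here the [UNREPLIED] entry sits in zone 2, the GR_ovn-worker_dnat zone: the reply never made it back through that zone's conntrack, which is consistent with it being marked invalid and dropped.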
The conntrack zones are:
GR_ovn-worker_dnat 2
default_client 20
default_client here is actually the server pod.
Steps to reproduce:
1. Check out the latest ovn-kubernetes master.
2. ./kind.sh -ic -ds -gm local
3. docker run --network=kind --privileged -it fedora /bin/bash
4. Create a pod.
5. In the Docker container, add a route for the pod subnet via the node where the pod is running.
6. Ping from the Docker container to the pod and observe the reply being dropped.
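Steps 5 and 6 can be sketched as follows from inside the Fedora container. POD_SUBNET and POD_IP match the traces above; NODE_IP is a placeholder for the kind node's address on the 172.18.0.0/16 docker network (look it up with `docker inspect` on the node container) and is purely an assumption here:

```shell
POD_SUBNET=10.244.1.0/24
NODE_IP=172.18.0.4     # hypothetical; substitute the real node address
POD_IP=10.244.1.4

# Printed rather than executed, since the real commands need root inside the
# container and a live kind cluster.
echo "ip route add $POD_SUBNET via $NODE_IP"   # step 5
echo "ping -c 3 $POD_IP"                       # step 6
```

With local gateway mode, the ping replies are the packets observed being dropped in OVS above.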