-
Bug
-
Resolution: Done-Errata
-
Normal
-
None
While working on a fragmentation test case I ran into an apparent OVN issue.
Disclaimer: different excerpts have different details (IPs, ports) because they correspond to different test runs but can be extrapolated to the same issue
Reaching out to a node port from outside the cluster ended up with an incorrect NAT sequence:
udp 17 27 src=172.18.0.5 dst=172.18.0.4 sport=44809 dport=31791 [UNREPLIED] src=10.244.1.11 dst=172.18.0.5 sport=90 dport=44809 mark=10 secctx=system_u:object_r:unlabeled_t:s0 zone=2 use=1 udp 17 27 src=172.18.0.5 dst=10.244.1.11 sport=44809 dport=90 src=10.244.1.11 dst=172.18.0.5 sport=90 dport=29698 mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=2 use=1 udp 17 27 src=172.18.0.5 dst=10.244.1.11 sport=29698 dport=90 src=10.244.1.11 dst=100.64.0.3 sport=90 dport=29698 mark=0 secctx=system_u:object_r:unlabeled_t:s0 use=1 udp 17 27 src=100.64.0.3 dst=10.244.1.11 sport=29698 dport=90 src=10.244.1.11 dst=100.64.0.3 sport=90 dport=29698 mark=0 secctx=system_u:object_r:unlabeled_t:s0 zone=19 use=1 udp 17 27 src=10.244.1.11 dst=172.18.0.5 sport=90 dport=44809 [UNREPLIED] src=172.18.0.5 dst=10.244.1.11 sport=44809 dport=90 mark=1 secctx=system_u:object_r:unlabeled_t:s0 zone=64000 use=1
The request is committed twice to conntrack zone 2 (the GR zone) and as a result the source port is replaced without any existing conflict on that source port. The reply however, does not go through conntrack zone 2 twice as the request so it ends up not being unDNATed and the packet egresses with an incorrect source IP.
This just happens the first time the test case is executed, successive times succeed, and is only after a while that the problem happens again.
Observing ovn-controller logs, there is a packet in for the request, which triggers an ARP sequence to learn a MAC address, before the original request is re-injected with a packet out:
2024-03-01T13:11:59.022Z|01342|vconn(ovn_pinctrl0)|DBG|unix:/var/run/openvswitch/br-int.mgmt: received: NXT_PACKET_IN2 (OF1.5) (xid=0x0): table_id=29 cookie=0x6c50f9f6 total_len=1268 ct_state=new|trk|snat|dnat,ct_zone=2,ct_mark=0xa,ct_nw_src=172.18.0.5,ct_nw_dst=172.18.0.2,ct_nw_proto=17,ct_tp_src=48588,ct_tp_dst=30189,ip,reg0=0x64400001,reg1=0x64400003,reg9=0x4,reg10=0x9,reg11=0x2,reg13=0x3,reg14=0x1,reg15=0x2,metadata=0x4,in_port=4 (via action) data_len=1268 (unbuffered) userdata=00.00.00.00.00.00.00.00.00.19.00.10.80.00.06.06.ff.ff.ff.ff.ff.ff.00.00.00.1c.00.18.00.20.00.40.00.00.00.00.00.01.de.10.80.00.2c.04.00.00.00.00.00.1c.00.18.00.20.00.60.00.00.00.00.00.01.de.10.80.00.2e.04.00.00.00.00.00.19.00.10.80.00.2a.02.00.01.00.00.00.00.00.00.ff.ff.00.10.00.00.23.20.00.0e.ff.f8.25.00.00.00 udp,vlan_tci=0x0000,dl_src=0a:58:64:40:00:03,dl_dst=00:00:00:00:00:00,nw_src=172.18.0.5,nw_dst=10.244.1.9,nw_tos=0,nw_ecn=0,nw_ttl=63,nw_frag=no,tp_src=48588,tp_dst=90 udp_csum:be5f 2024-03-01T13:11:59.023Z|01343|vconn(ovn_pinctrl0)|DBG|unix:/var/run/openvswitch/br-int.mgmt: sent (Success): OFPT_PACKET_OUT (OF1.5) (xid=0x2414): in_port=CONTROLLER actions=set_field:0x64400001->reg0,set_field:0x64400003->reg1,set_field:0x4->reg9,set_field:0x9->reg10,set_field:0x2->reg11,set_field:0x3->reg13,set_field:0x1->reg14,set_field:0x2->reg15,set_field:0x4->metadata,set_field:ff:ff:ff:ff:ff:ff->eth_dst,move:NXM_NX_XXREG0[64..95]->NXM_OF_ARP_SPA[],move:NXM_NX_XXREG0[96..127]->NXM_OF_ARP_TPA[],set_field:1->arp_op,resubmit(,37) data_len=42 arp,vlan_tci=0x0000,dl_src=0a:58:64:40:00:03,dl_dst=00:00:00:00:00:00,arp_spa=172.18.0.5,arp_tpa=10.244.1.9,arp_op=1,arp_sha=0a:58:64:40:00:03,arp_tha=00:00:00:00:00:00 2024-03-01T13:11:59.023Z|01344|vconn(ovn_pinctrl0)|DBG|unix:/var/run/openvswitch/br-int.mgmt: received: NXT_PACKET_IN2 (OF1.5) (xid=0x0): table_id=10 cookie=0x7622d0f9 total_len=42 reg0=0x64400001,reg1=0x64400003,reg9=0x8,reg11=0x2,reg13=0x7,reg14=0x2,metadata=0x4,in_port=ANY (via action) data_len=42 (unbuffered) userdata=00.00.00.01.00.00.00.00 arp,vlan_tci=0x0000,dl_src=0a:58:64:40:00:01,dl_dst=0a:58:64:40:00:03,arp_spa=100.64.0.1,arp_tpa=100.64.0.3,arp_op=2,arp_sha=0a:58:64:40:00:01,arp_tha=0a:58:64:40:00:03 2024-03-01T13:11:59.024Z|22136|vconn|DBG|unix:/var/run/openvswitch/br-int.mgmt: sent (Success): OFPT_BUNDLE_CONTROL (OF1.5) (xid=0x2415): bundle_id=0x5b2 type=OPEN_REQUEST flags=atomic ordered 2024-03-01T13:11:59.024Z|01345|vconn(ovn_pinctrl0)|DBG|unix:/var/run/openvswitch/br-int.mgmt: sent (Success): OFPT_PACKET_OUT (OF1.5) (xid=0x2418): in_port=CONTROLLER actions=set_field:0x64400001->reg0,set_field:0x64400003->reg1,set_field:0x4->reg9,set_field:0x9->reg10,set_field:0x2->reg11,set_field:0x3->reg13,set_field:0x1->reg14,set_field:0x2->reg15,set_field:0x4->metadata,set_field:0->pkt_mark,resubmit(CONTROLLER,37) data_len=1268 udp,vlan_tci=0x0000,dl_src=0a:58:64:40:00:03,dl_dst=0a:58:64:40:00:01,nw_src=172.18.0.5,nw_dst=10.244.1.9,nw_tos=0,nw_ecn=0,nw_ttl=63,nw_frag=no,tp_src=48588,tp_dst=90 udp_csum:be5f
This packet in is only observed when it fails, but not when it doesn't. So there might be some kind of problem with how that request is being re-injected in the pipeline.
Additionally, this might be easier to observe after the recent introduction of mac aging threshold since those ARPs might be triggered with more frequency.
However, all is speculation right now. When trying to trace the pipeline, we ran into another potential OVS trace issue that is preventing us to properly troubleshoot this.
An initial trace shows the packet in
❯ kubectl exec -ti -n ovn-kubernetes ovnkube-node-9bn6g -c ovnkube-controller -- bash -c "ovs-appctl ofproto/trace breth0 in_port=eth0,udp,dl_src=02:42:ac:12:00:05,dl_dst=02:42:ac:12:00:02,nw_src=172.18.0.5,nw_dst=172.18.0.2,nw_ttl=64,udp_src=33455,udp_dst=30501,dp_hash=1 --ct-next trk,new --ct-next trk,new --ct-next trk,new,dnat" ... Megaflow: recirc_id=0x54,ct_state=+new-est-rel-rpl-inv+trk+dnat,ct_mark=0/0x1,eth,udp,in_port=4,dl_src=02:42:ac:12:00:05,dl_dst=02:42:ac:12:00:02,nw_dst=10.244.1.0/24,nw_ttl=64,nw_frag=no Datapath actions: set(eth(src=0a:58:64:40:00:03,dst=00:00:00:00:00:00)),set(ipv4(ttl=63)),check_pkt_len(size=1414,gt(sample(sample=100.0%,actions(meter(3),userspace(pid=4294967295,controller(reason=1,dont_send=1,continuation=0,recirc_id=85,rule_cookie=0x5893bb55,controller_id=0,max_len=65535))))),le(sample(sample=100.0%,actions(meter(1),userspace(pid=4294967295,controller(reason=1,dont_send=1,continuation=0,recirc_id=86,rule_cookie=0x6c50f9f6,controller_id=0,max_len=65535))))))
but then OVS fails on us when trying to trace the re-injection:
❯ kubectl exec -ti -n ovn-kubernetes ovnkube-node-9bn6g -c ovnkube-controller -- ovs-appctl ofproto/trace-packet-out br-int --ct-next trk,new,dnat "in_port=controller,udp,vlan_tci=0x0000,dl_src=0a:58:64:40:00:03,dl_dst=0a:58:64:40:00:01,nw_src=172.18.0.5,nw_dst=10.244.1.9,nw_tos=0,nw_ecn=0,nw_ttl=63,nw_frag=no,udp_src=48588,udp_dst=90" "set_field:0x64400001->reg0,set_field:0x64400003->reg1,set_field:0x4->reg9,set_field:0x9->reg10,set_field:0x2->reg11,set_field:0x3->reg13,set_field:0x1->reg14,set_field:0x2->reg15,set_field:0x4->metadata,set_field:0->pkt_mark,resubmit(controller,37)" Flow: udp,in_port=CONTROLLER,vlan_tci=0x0000,dl_src=0a:58:64:40:00:03,dl_dst=0a:58:64:40:00:01,nw_src=172.18.0.5,nw_dst=10.244.1.9,nw_tos=0,nw_ecn=0,nw_ttl=63,nw_frag=no,tp_src=48588,tp_dst=90 ... thaw >>>> Recirculation conflict (actions)! <<<<
Dumping the datapath flows is not helping either, no way to trace what actually happens after re-injection
recirc_id(0),in_port(6),eth(src=0a:58:0a:f4:01:0e),eth_type(0x86dd),ipv6(frag=no),key32(00 00/00 00), packets:6, bytes:496, used:0.755s, actions:drop recirc_id(0),in_port(3),skb_mark(0),ct_state(-new-est-rpl-trk-dnat),eth(src=02:42:ac:12:00:05,dst=33:33:ff:12:00:05),eth_type(0x86dd),ipv6(src=::,dst=ff02::1:ff12:5,proto=58,hlimit=255,frag=no),key32(00 00/00 00),icmpv6(type=135,code=0),nd(target=fe80::42:acff:fe12:5,sll=00:00:00:00:00:00), packets:0, bytes:0, used:never, actions:4 recirc_id(0xd8),dp_hash(0xf/0xf),in_port(3),eth(),eth_type(0x0800),ipv4(frag=no), packets:0, bytes:0, used:never, actions:ct(commit,zone=2,mark=0xa/0xa,nat(dst=10.244.1.14:90)),recirc(0xd9) recirc_id(0xe4),in_port(6),ct_state(-new+est-rel+rpl-inv+trk+dnat),ct_mark(0/0x1),eth(src=02:42:ac:12:00:02,dst=02:42:ac:12:00:05),eth_type(0x0800),ipv4(src=10.244.1.0/255.255.255.0,dst=172.0.0.0/252.0.0.0,proto=17,frag=no),udp(src=0/0xc000), packets:0, bytes:0, used:never, actions:ct_clear,ct(commit,zone=64000,mark=0x1/0xffffffff),3 recirc_id(0),in_port(4),skb_mark(0),eth(src=02:42:ac:12:00:02,dst=02:42:ac:12:00:05),eth_type(0x0806), packets:0, bytes:0, used:never, actions:3 recirc_id(0),in_port(6),ct_state(-new-est-rel-rpl-inv-trk-dnat),ct_mark(0/0xf),eth(src=0a:58:0a:f4:01:0e,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=10.244.1.14,tip=10.244.1.1,op=1/0xff,sha=0a:58:0a:f4:01:0e,tha=00:00:00:00:00:00), packets:0, bytes:0, used:never, actions:userspace(pid=4294967295,slow_path(action)) recirc_id(0),in_port(3),skb_mark(0),ct_state(-new-est-rel-rpl-inv-trk-dnat),ct_mark(0/0xf),eth(src=02:42:ac:12:00:05,dst=33:33:00:00:00:02),eth_type(0x86dd),ipv6(src=fe80::/ffc0::,dst=ff02::2,proto=58,hlimit=255,frag=no),key32(00 00/00 00),icmpv6(type=133,code=0), packets:0, bytes:0, used:never, actions:4 recirc_id(0),in_port(3),skb_mark(0),ct_state(-new-est-rpl-trk-dnat),eth(src=02:42:ac:12:00:05,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(src=fe80::/ffc0::,dst=ff02::16,proto=58,hlimit=1,frag=no),key32(00 00/00 00),icmpv6(type=143), packets:1, bytes:110, used:1.331s, actions:4 recirc_id(0x31),in_port(3),ct_state(-est-rel+trk),ct_mark(0),eth(dst=02:42:ac:12:00:02),eth_type(0x0800),ipv4(proto=6,frag=no), packets:5, bytes:849, used:0.046s, flags:SP., actions:4 recirc_id(0x31),in_port(3),ct_state(+est+trk),ct_mark(0x2),eth(),eth_type(0x0800),ipv4(frag=no), packets:74359, bytes:11775341, used:0.031s, flags:SFPR., actions:4 recirc_id(0),in_port(3),skb_mark(0),eth(dst=02:42:ac:12:00:02),eth_type(0x0800),ipv4(proto=6,frag=no),tcp(dst=32768/0x8000), packets:1056, bytes:839062, used:0.523s, flags:SP., actions:ct(zone=64000,nat),recirc(0x31) recirc_id(0),in_port(3),skb_mark(0),ct_state(-new-est-rpl-trk-dnat),eth(src=02:42:ac:12:00:05,dst=33:33:00:00:00:16),eth_type(0x86dd),ipv6(src=::/ffc0::,dst=ff02::16,proto=58,hlimit=1,frag=no),key32(00 00/00 00),icmpv6(type=143), packets:1, bytes:110, used:2.739s, actions:4 recirc_id(0),in_port(3),skb_mark(0),ct_state(-new-est-rel-rpl-inv-trk-dnat),ct_mark(0/0x1),eth(src=02:42:ac:12:00:05,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=172.18.0.5,tip=172.18.0.2,op=1/0xff,sha=02:42:ac:12:00:05,tha=00:00:00:00:00:00), packets:0, bytes:0, used:never, actions:userspace(pid=4294967295,slow_path(action)) recirc_id(0),in_port(4),skb_mark(0),eth(),eth_type(0x0800),ipv4(dst=172.0.0.0/252.0.0.0,proto=6,frag=no), packets:104957, bytes:544825288, used:0.031s, flags:SFPR., actions:ct(commit,zone=64000,mark=0x2/0xffffffff),3 recirc_id(0xe3),in_port(6),ct_state(-new+est-rel+rpl-inv+trk+dnat),ct_mark(0/0xf),eth(src=0a:58:64:40:00:01,dst=0a:58:64:40:00:03),eth_type(0x0800),ipv4(dst=172.18.0.5,proto=17,ttl=63,frag=no), packets:0, bytes:0, used:never, actions:set(eth(src=02:42:ac:12:00:02,dst=02:42:ac:12:00:05)),set(ipv4(ttl=62)),ct(zone=2,nat),recirc(0xe4) recirc_id(0),in_port(6),eth(src=0a:58:0a:f4:01:0e,dst=0a:58:0a:f4:01:01),eth_type(0x0800),ipv4(src=10.244.1.14,dst=64.0.0.0/192.0.0.0,proto=17,frag=no),udp(src=0/0xfe00), packets:0, bytes:0, used:never, actions:ct(zone=19,nat),recirc(0xe1) recirc_id(0xe1),in_port(6),ct_state(-new+est-rel+rpl-inv+trk-dnat),ct_mark(0/0xf),eth(src=0a:58:0a:f4:01:0e,dst=0a:58:0a:f4:01:01),eth_type(0x0800),ipv4(src=10.244.1.8/255.255.255.248,dst=100.64.0.3,proto=17,ttl=64,frag=no),udp(src=0/0xfe00), packets:0, bytes:0, used:never, actions:ct_clear,set(eth(src=0a:58:64:40:00:01,dst=0a:58:64:40:00:03)),set(ipv4(ttl=63)),check_pkt_len(size=1414,gt(sample(sample=100.0%,actions(meter(4),userspace(pid=4294967295,controller(reason=1,dont_send=0,continuation=0,recirc_id=226,rule_cookie=0x7c33bd67,controller_id=0,max_len=65535))))),le(ct(nat),recirc(0xe3))) recirc_id(0xd6),in_port(3),eth(),eth_type(0x0800),ipv4(dst=172.18.0.2,frag=no), packets:0, bytes:0, used:never, actions:ct(zone=2,nat),recirc(0xd7) recirc_id(0),in_port(3),skb_mark(0),ct_state(-new-est-rpl-trk-dnat),eth(src=02:42:ac:12:00:05,dst=ff:ff:ff:ff:ff:ff),eth_type(0x0806),arp(sip=172.18.0.5,tip=172.18.0.1,op=1/0xff,sha=02:42:ac:12:00:05), packets:0, bytes:0, used:never, actions:4 recirc_id(0),in_port(3),ct_state(-new-est-rpl-trk-dnat),eth(src=02:42:ac:12:00:05,dst=02:42:ac:12:00:02),eth_type(0x0800),ipv4(src=172.18.0.5,dst=172.18.0.2,proto=17,ttl=64,frag=no),udp(dst=30261), packets:0, bytes:0, used:never, actions:ct(nat),recirc(0xd6) recirc_id(0),in_port(3),skb_mark(0),eth(dst=02:42:ac:12:00:02),eth_type(0x0800),ipv4(proto=6,frag=no),tcp(dst=8192/0xe000), packets:67, bytes:9683, used:0.031s, flags:SP., actions:ct(zone=64000,nat),recirc(0x31) recirc_id(0xd9),in_port(3),ct_state(+new-est-rel-rpl-inv+trk+dnat),ct_mark(0/0x1),eth(src=02:42:ac:12:00:05,dst=02:42:ac:12:00:02),eth_type(0x0800),ipv4(dst=10.244.1.0/255.255.255.0,proto=17,ttl=64,frag=no), packets:0, bytes:0, used:never, actions:set(eth(src=0a:58:64:40:00:03,dst=0a:58:64:40:00:01)),set(ipv4(ttl=63)),check_pkt_len(size=1414,gt(sample(sample=100.0%,actions(meter(4),userspace(pid=4294967295,controller(reason=1,dont_send=0,continuation=0,recirc_id=229,rule_cookie=0x5893bb55,controller_id=0,max_len=65535))))),le(ct(zone=2,nat),recirc(0xe6))) recirc_id(0xd7),in_port(3),ct_state(+new-rel+trk),eth(),eth_type(0x0800),ipv4(dst=172.18.0.2,proto=17,frag=no),udp(dst=30261), packets:0, bytes:0, used:never, actions:hash(l4(0)),recirc(0xd8)
I have opened an issue to OVS because of the trace problem. You will find DBs attached there
https://issues.redhat.com/browse/FDP-438
- is cloned by
-
FDP-583 CLONE [ovn24.03 fast-datapath-rhel-9] - Incorrect DNAT sequence from OVN-K gateway router nodeport load balancer
- Closed
- links to
-
RHBA-2024:130932 ovn23.09 bug fix and enhancement update