-
Bug
-
Resolution: Obsolete
-
Major
-
None
-
None
-
1
-
False
-
-
False
-
rhel-sst-network-fastdatapath
-
-
-
ssg_networking
When defining services of type LoadBalancer or NodePort that also have session affinity set and externalTrafficPolicy=Local (ETP=local), OVN generates these flows:

table=7 (lr_in_dnat ), priority=150 , match=(reg9[6] == 1 && ct.new && ip4 && reg4 == 10.128.2.32 && reg8[0..15] == 10001), action=(reg0 = 172.30.80.237; flags.force_snat_for_lb = 1; ct_lb_mark(backends=10.128.2.32:10001; force_snat);)
table=7 (lr_in_dnat ), priority=150 , match=(reg9[6] == 1 && ct.new && ip4 && reg4 == 10.128.2.32 && reg8[0..15] == 10001), action=(reg0 = 34.171.140.123; flags.skip_snat_for_lb = 1; ct_lb_mark(backends=10.128.2.32:10001; skip_snat);)
These two flows are at the same priority and have the exact same match expression, so there is a 50% chance that either one takes effect, which prevents the skip_snat versus force_snat logic from working correctly. For LoadBalancer and NodePort VIPs skip_snat should always apply, and for ClusterIP VIPs force_snat should always apply; with identical matches this cannot be guaranteed.
Attaching the NBDB from a cluster stuck in this state, since it is pretty clear there is a logical flow problem here.
Reproducer:
- Create LBs:
LoadBalancer type service VIP LBs:

_uuid               : 826f89aa-eb1b-4262-8eb5-1bf938fda258
external_ids        : {"k8s.ovn.org/kind"=Service, "k8s.ovn.org/owner"="test-udp/udp-server-service"}
health_check        : []
ip_port_mappings    : {}
name                : "Service_test-udp/udp-server-service_UDP_node_local_router_pdiak-01-18-2024-86ztk-worker-a-dfk6l"
options             : {affinity_timeout="120", event="false", hairpin_snat_ip="169.254.169.5 fd69::5", neighbor_responder=none, reject="true", skip_snat="true"}
protocol            : udp
selection_fields    : []
vips                : {"34.171.140.123:10001"="10.128.2.32:10001"}
============
_uuid               : 02b7544e-8969-4a4a-96c8-d4a27caa5d24
external_ids        : {"k8s.ovn.org/kind"=Service, "k8s.ovn.org/owner"="test-udp/udp-server-service"}
health_check        : []
ip_port_mappings    : {}
name                : "Service_test-udp/udp-server-service_UDP_node_switch_pdiak-01-18-2024-86ztk-worker-a-dfk6l"
options             : {affinity_timeout="120", event="false", hairpin_snat_ip="169.254.169.5 fd69::5", neighbor_responder=none, reject="true", skip_snat="false"}
protocol            : udp
selection_fields    : []
vips                : {"34.171.140.123:10001"="10.128.2.32:10001"}
=====
ClusterIP type service VIP LB for the same service:

_uuid               : 5c801402-a614-4702-adf0-eca61018b859
external_ids        : {"k8s.ovn.org/kind"=Service, "k8s.ovn.org/owner"="test-udp/udp-server-service"}
health_check        : []
ip_port_mappings    : {}
name                : "Service_test-udp/udp-server-service_UDP_cluster"
options             : {affinity_timeout="120", event="false", hairpin_snat_ip="169.254.169.5 fd69::5", neighbor_responder=none, reject="true", skip_snat="false"}
protocol            : udp
selection_fields    : []
vips                : {"172.30.80.237:10001"="10.128.2.32:10001"}
This will cause flows with the same match at the same priority, which skews the SNAT logic when ETP=local and a session affinity timeout are both set (see the reg4 matches above).
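For reference, the service shape that produces these load balancers can be sketched as the following Kubernetes manifest. This is an assumption reconstructed from the attached LB records (namespace test-udp, UDP port 10001, affinity_timeout 120); the pod selector is hypothetical since the real pod labels are not in this report:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: udp-server-service
  namespace: test-udp
spec:
  type: LoadBalancer
  # ETP=local: only node-local endpoints receive traffic
  externalTrafficPolicy: Local
  # Session affinity with a timeout maps to affinity_timeout="120" in the OVN LB options
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 120
  selector:
    app: udp-server   # hypothetical selector, not taken from the report
  ports:
    - protocol: UDP
      port: 10001
      targetPort: 10001
```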
- clones
-
FDP-290 Session Affinity + ETP=Local does not work for services of type NodePorts and LoadBalancers
- Dev Complete