-
Bug
-
Resolution: Unresolved
-
Normal
-
None
-
4.18.z, 4.19.0, 4.20.0
-
Quality / Stability / Reliability
-
False
-
-
None
-
Important
-
No
-
None
-
None
-
Rejected
Description of problem:
When testing a nodePort service with ETP=Cluster in LGW mode, a UDN pod cannot access its own nodePort service.
# oc get pod -n blue -o wide --show-labels
NAME            READY   STATUS    RESTARTS   AGE   IP            NODE       NOMINATED NODE   READINESS GATES   LABELS
test-rc-8m5jw   1/1     Running   1          27h   10.129.2.8    worker-2   <none>           <none>            name=client
test-rc-krdjc   1/1     Running   1          27h   10.131.0.21   worker-1   <none>           <none>            name=test-pods

# oc get svc hello-pod -n blue -o yaml
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2025-04-24T07:20:52Z"
  labels:
    name: hello-pod
  name: hello-pod
  namespace: blue
  resourceVersion: "127354"
  uid: df4f173e-823c-473a-ada4-c73f33395375
spec:
  clusterIP: 172.30.34.214
  clusterIPs:
  - 172.30.34.214
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    nodePort: 32705
    port: 27017
    protocol: TCP
    targetPort: 8080
  selector:
    name: test-pods
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}
This works in SGW mode but fails after switching to LGW mode.
# oc rsh -n blue test-rc-8m5jw
~ $ curl 192.168.111.20:32705
^C
~ $ curl 192.168.111.20:32705
curl: (56) Recv failure: Connection reset by peer
tcpdump from client node:
sh-5.1# tcpdump -i any -nn port 32705
tcpdump: data link type LINUX_SLL2
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
10:09:21.881927 c9bf4e28689f2_3 P IP 20.100.1.4.46378 > 192.168.111.20.32705: Flags [S], seq 690069817, win 65280, options [mss 1360,sackOK,TS val 2475200319 ecr 0,nop,wscale 7], length 0
10:09:21.882635 ovn-k8s-mp1 In IP 169.254.0.12.46378 > 192.168.111.20.32705: Flags [S], seq 690069817, win 65280, options [mss 1360,sackOK,TS val 2475200319 ecr 0,nop,wscale 7], length 0
10:09:21.882684 br-ex Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [S], seq 690069817, win 65280, options [mss 1360,sackOK,TS val 2475200319 ecr 0,nop,wscale 7], length 0
10:09:21.882886 enp2s0 Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [S], seq 690069817, win 65280, options [mss 1360,sackOK,TS val 2475200319 ecr 0,nop,wscale 7], length 0
10:09:21.885908 enp2s0 In IP 192.168.111.20.32705 > 192.168.111.25.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493850769 ecr 2475200319,nop,wscale 7], length 0
10:09:21.885914 br-ex In IP 192.168.111.20.32705 > 192.168.111.25.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493850769 ecr 2475200319,nop,wscale 7], length 0
10:09:21.885928 ovn-k8s-mp1 Out IP 192.168.111.20.32705 > 169.254.0.12.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493850769 ecr 2475200319,nop,wscale 7], length 0
10:09:21.886423 c9bf4e28689f2_3 Out IP 192.168.111.20.32705 > 20.100.1.4.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493850769 ecr 2475200319,nop,wscale 7], length 0
10:09:21.886458 c9bf4e28689f2_3 P IP 20.100.1.4.46378 > 192.168.111.20.32705: Flags [.], ack 1, win 510, options [nop,nop,TS val 2475200324 ecr 2493850769], length 0
10:09:21.886505 c9bf4e28689f2_3 P IP 20.100.1.4.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200324 ecr 2493850769], length 84
10:09:21.886749 ovn-k8s-mp1 In IP 169.254.0.12.46378 > 192.168.111.20.32705: Flags [.], ack 1, win 510, options [nop,nop,TS val 2475200324 ecr 2493850769], length 0
10:09:21.886768 br-ex Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [.], ack 1, win 510, options [nop,nop,TS val 2475200324 ecr 2493850769], length 0
10:09:21.886780 ovn-k8s-mp1 In IP 169.254.0.12.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200324 ecr 2493850769], length 84
10:09:21.886798 br-ex Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200324 ecr 2493850769], length 84
10:09:21.886941 enp2s0 Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [.], ack 1, win 510, options [nop,nop,TS val 2475200324 ecr 2493850769], length 0
10:09:21.886964 enp2s0 Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200324 ecr 2493850769], length 84
10:09:22.092081 c9bf4e28689f2_3 P IP 20.100.1.4.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200530 ecr 2493850769], length 84
10:09:22.092129 ovn-k8s-mp1 In IP 169.254.0.12.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200530 ecr 2493850769], length 84
10:09:22.092161 br-ex Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200530 ecr 2493850769], length 84
10:09:22.092172 enp2s0 Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200530 ecr 2493850769], length 84
10:09:22.300071 c9bf4e28689f2_3 P IP 20.100.1.4.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200738 ecr 2493850769], length 84
10:09:22.300125 ovn-k8s-mp1 In IP 169.254.0.12.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200738 ecr 2493850769], length 84
10:09:22.300151 br-ex Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200738 ecr 2493850769], length 84
10:09:22.300160 enp2s0 Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475200738 ecr 2493850769], length 84
10:09:22.716074 c9bf4e28689f2_3 P IP 20.100.1.4.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475201154 ecr 2493850769], length 84
10:09:22.716132 ovn-k8s-mp1 In IP 169.254.0.12.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475201154 ecr 2493850769], length 84
10:09:22.716156 br-ex Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475201154 ecr 2493850769], length 84
10:09:22.716166 enp2s0 Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [P.], seq 1:85, ack 1, win 510, options [nop,nop,TS val 2475201154 ecr 2493850769], length 84
10:09:22.886119 enp2s0 In IP 192.168.111.20.32705 > 192.168.111.25.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493851770 ecr 2475200319,nop,wscale 7], length 0
10:09:22.886133 br-ex In IP 192.168.111.20.32705 > 192.168.111.25.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493851770 ecr 2475200319,nop,wscale 7], length 0
10:09:22.886156 ovn-k8s-mp1 Out IP 192.168.111.20.32705 > 169.254.0.12.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493851770 ecr 2475200319,nop,wscale 7], length 0
10:09:22.886202 c9bf4e28689f2_3 Out IP 192.168.111.20.32705 > 20.100.1.4.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493851770 ecr 2475200319,nop,wscale 7], length 0
10:09:22.886218 c9bf4e28689f2_3 P IP 20.100.1.4.46378 > 192.168.111.20.32705: Flags [.], ack 1, win 510, options [nop,nop,TS val 2475201324 ecr 2493850769], length 0
10:09:22.886249 ovn-k8s-mp1 In IP 169.254.0.12.46378 > 192.168.111.20.32705: Flags [.], ack 1, win 510, options [nop,nop,TS val 2475201324 ecr 2493850769], length 0
tcpdump from server pod:
sh-5.1# tcpdump -i 08d49b3a853fc_3 -nn
dropped privs to tcpdump
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on 08d49b3a853fc_3, link-type EN10MB (Ethernet), snapshot length 262144 bytes
10:09:21.884614 IP 100.65.0.3.46378 > 20.100.5.5.8080: Flags [S], seq 690069817, win 65280, options [mss 1360,sackOK,TS val 2475200319 ecr 0,nop,wscale 7], length 0
10:09:21.884665 IP 20.100.5.5.8080 > 100.65.0.3.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493850769 ecr 2475200319,nop,wscale 7], length 0
10:09:22.885497 IP 20.100.5.5.8080 > 100.65.0.3.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493851770 ecr 2475200319,nop,wscale 7], length 0
10:09:24.933487 IP 20.100.5.5.8080 > 100.65.0.3.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493853818 ecr 2475200319,nop,wscale 7], length 0
10:09:26.982417 ARP, Request who-has 20.100.5.1 tell 20.100.5.5, length 28
10:09:26.983132 ARP, Reply 20.100.5.1 is-at 0a:58:14:64:05:01, length 28
10:09:28.965464 IP 20.100.5.5.8080 > 100.65.0.3.46378: Flags [S.], seq 535529498, ack 690069818, win 64704, options [mss 1360,sackOK,TS val 2493857850 ecr 2475200319,nop,wscale 7], length 0
The tcpdump output above suggests the following sequence:
client --> sends SYN
server --> replies with SYN+ACK
client --> receives the SYN+ACK and sends an ACK
server --> never receives the ACK and keeps retransmitting the SYN+ACK
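The handshake reading above can be checked mechanically. The following is a minimal sketch, not part of the original report, that groups `tcpdump -i any -nn` lines by TCP flags and lists the interfaces each segment type crossed. The names `parse_line`, `interfaces_by_flags`, and `SAMPLE` are hypothetical helpers introduced for illustration; the sample lines are taken from the client-node capture with the TCP options trimmed for brevity, and the regular expression assumes IPv4 output in the format shown in this report.

```python
import re
from collections import defaultdict

# Matches lines printed by `tcpdump -i any -nn`, for example:
# 10:09:21.881927 c9bf4e28689f2_3 P IP 20.100.1.4.46378 > 192.168.111.20.32705: Flags [S], ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<iface>\S+)\s+(?P<dir>In|Out|P)\s+IP\s+"
    r"(?P<src>\S+)\s+>\s+(?P<dst>[^:\s]+):\s+Flags\s+\[(?P<flags>[^\]]+)\]"
)


def parse_line(line):
    """Return a dict with ts, iface, dir, src, dst, flags, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def interfaces_by_flags(lines):
    """Map each flag string (e.g. 'S', 'S.', '.') to the ordered, de-duplicated (iface, dir) hops."""
    seen = defaultdict(list)
    for line in lines:
        rec = parse_line(line)
        if rec is None:
            continue  # skip tcpdump banner lines and anything else that is not a packet
        hop = (rec["iface"], rec["dir"])
        if hop not in seen[rec["flags"]]:
            seen[rec["flags"]].append(hop)
    return dict(seen)


# Lines from the client-node capture in this report (TCP options trimmed).
SAMPLE = [
    "10:09:21.881927 c9bf4e28689f2_3 P IP 20.100.1.4.46378 > 192.168.111.20.32705: Flags [S], seq 690069817, length 0",
    "10:09:21.882635 ovn-k8s-mp1 In IP 169.254.0.12.46378 > 192.168.111.20.32705: Flags [S], seq 690069817, length 0",
    "10:09:21.882684 br-ex Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [S], seq 690069817, length 0",
    "10:09:21.882886 enp2s0 Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [S], seq 690069817, length 0",
    "10:09:21.885908 enp2s0 In IP 192.168.111.20.32705 > 192.168.111.25.46378: Flags [S.], seq 535529498, ack 690069818, length 0",
    "10:09:21.885914 br-ex In IP 192.168.111.20.32705 > 192.168.111.25.46378: Flags [S.], seq 535529498, ack 690069818, length 0",
    "10:09:21.885928 ovn-k8s-mp1 Out IP 192.168.111.20.32705 > 169.254.0.12.46378: Flags [S.], seq 535529498, ack 690069818, length 0",
    "10:09:21.886423 c9bf4e28689f2_3 Out IP 192.168.111.20.32705 > 20.100.1.4.46378: Flags [S.], seq 535529498, ack 690069818, length 0",
    "10:09:21.886458 c9bf4e28689f2_3 P IP 20.100.1.4.46378 > 192.168.111.20.32705: Flags [.], ack 1, win 510, length 0",
    "10:09:21.886749 ovn-k8s-mp1 In IP 169.254.0.12.46378 > 192.168.111.20.32705: Flags [.], ack 1, win 510, length 0",
    "10:09:21.886768 br-ex Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [.], ack 1, win 510, length 0",
    "10:09:21.886941 enp2s0 Out IP 192.168.111.25.46378 > 192.168.111.20.32705: Flags [.], ack 1, win 510, length 0",
]

if __name__ == "__main__":
    for flags, hops in interfaces_by_flags(SAMPLE).items():
        print(flags, "->", " | ".join(f"{iface} {direction}" for iface, direction in hops))
```

On these sample lines the ACK (`.`) appears on the same four hops as the SYN, ending at `enp2s0 Out`. That is consistent with the ACK leaving the client node, so the loss would have to occur somewhere after that point. This is only a reading aid for the capture, not a root-cause analysis.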
Version-Release number of selected component (if applicable):
How reproducible:
Steps to Reproduce:
1. Create a CUDN, a namespace, and the client and server pods.
2. Create a NodePort service with ETP=Cluster.
3. Switch the cluster to LGW mode.
4. From the UDN client pod, run curl <master-node-IP>:<nodePort>.
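Step 4 can also be scripted so that repeated runs give a comparable result in SGW and LGW mode. The sketch below is an illustration only, using the Python standard library. `probe_nodeport` is a hypothetical helper, and it assumes Python is available wherever the probe runs, which may not be true for the client pod image used in this report.

```python
import socket


def probe_nodeport(host, port, timeout=5.0):
    """Send a plain HTTP GET to host:port and classify the outcome.

    Returns one of: 'ok', 'closed', 'reset', 'refused', 'timeout'.
    """
    request = b"GET / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n"
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            data = sock.recv(4096)
            # An empty read means the peer closed the connection without replying.
            return "ok" if data else "closed"
    except ConnectionResetError:
        return "reset"
    except ConnectionRefusedError:
        return "refused"
    except (socket.timeout, TimeoutError):
        return "timeout"
```

With the values from this report, the call would be `probe_nodeport("192.168.111.20", 32705)`. Given the curl behaviour shown earlier, a failing run would presumably report `reset` or `timeout`, and a working SGW run `ok`.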
Actual results:
nodePort service cannot be accessed.
Expected results:
Additional info:
SGW mode works with the same configuration.