-
Bug
-
Resolution: Done-Errata
-
Major
-
4.14
-
Quality / Stability / Reliability
-
False
-
-
None
-
Moderate
-
No
-
None
-
Rejected
-
SDN Sprint 243, SDN Sprint 245, SDN Sprint 246, SDN Sprint 247, SDN Sprint 248, SDN Sprint 249
-
6
-
Done
-
Bug Fix
Description of problem:
[Multi-NIC] Egress traffic connection times out after the label is removed from another pod in the same namespace
Version-Release number of selected component (if applicable):
4.14.0-0.nightly-2023-10-08-024357
How reproducible:
Always
Steps to Reproduce:
1. Label one node as the egress node.
2. Create an EgressIP object. The egress IP is assigned to a secondary interface of the egress node.
# oc get egressip -o yaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"k8s.ovn.org/v1","kind":"EgressIP","metadata":{"annotations":{},"name":"egressip-66293"},"spec":{"egressIPs":["172.22.0.190"],"namespaceSelector":{"matchLabels":{"org":"qe"}},"podSelector":{"matchLabels":{"color":"pink"}}}}
    creationTimestamp: "2023-10-08T07:28:04Z"
    generation: 2
    name: egressip-66293
    resourceVersion: "461590"
    uid: f1ca3483-63f1-4f31-99b0-e6a55161c285
  spec:
    egressIPs:
    - 172.22.0.190
    namespaceSelector:
      matchLabels:
        org: qe
    podSelector:
      matchLabels:
        color: pink
  status:
    items:
    - egressIP: 172.22.0.190
      node: worker-0
kind: List
metadata:
  resourceVersion: ""
3. Create a namespace and two pods in it.
% oc get pods -n hrw -o wide
NAME         READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
hello-pod    1/1     Running   0          6m46s   10.129.2.7    worker-1   <none>           <none>
hello-pod1   1/1     Running   0          6s      10.131.0.14   worker-0   <none>           <none>
4. Add label org=qe to namespace hrw
# oc get ns hrw --show-labels
NAME   STATUS   AGE   LABELS
hrw    Active   21m   kubernetes.io/metadata.name=hrw,org=qe,pod-security.kubernetes.io/audit-version=v1.24,pod-security.kubernetes.io/audit=restricted,pod-security.kubernetes.io/warn-version=v1.24,pod-security.kubernetes.io/warn=restricted
5. At this point, accessing the external endpoint from both pods succeeds.
% oc rsh -n hrw hello-pod
~ $ curl 172.22.0.1 --connect-timeout 5
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL was not found on this server.</p>
</body></html>
~ $ exit
% oc rsh -n hrw hello-pod1
~ $ curl 172.22.0.1 --connect-timeout 5
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>404 Not Found</title>
</head><body>
<h1>Not Found</h1>
<p>The requested URL was not found on this server.</p>
</body></html>
6. Add label color=pink to both pods
% oc label pod hello-pod color=pink -n hrw
pod/hello-pod labeled
% oc label pod hello-pod1 color=pink -n hrw
pod/hello-pod1 labeled
7. Both pods can still access the external endpoint.
8. Remove label color=pink from pod hello-pod
% oc label pod hello-pod color- -n hrw
pod/hello-pod unlabeled
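The steps above can be sketched as one script. This is a rough outline, not the reporter's actual procedure: the `run` helper, the `repro` function, and the `NODE`, `NS`, and `DRY_RUN` variables are hypothetical names introduced here. The report does not show the command used in step 1, so the `k8s.ovn.org/egress-assignable` node label is assumed. Step 3 is left out because the pod definitions are not included in the report. By default the script only prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction steps. NODE, NS, DRY_RUN, run() and repro()
# are illustrative names, not part of the original report.
NODE="${NODE:-worker-0}"
NS="${NS:-hrw}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

repro() {
    # Step 1 (assumed command): mark the node as egress-assignable.
    run oc label node "$NODE" k8s.ovn.org/egress-assignable=

    # Step 2: create the EgressIP object (spec taken from the report).
    run oc apply -f - <<'EOF'
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip-66293
spec:
  egressIPs:
  - 172.22.0.190
  namespaceSelector:
    matchLabels:
      org: qe
  podSelector:
    matchLabels:
      color: pink
EOF

    # Step 3 (namespace and pods) is not shown in the report; create them first.

    # Step 4: make the namespace match the namespaceSelector.
    run oc label namespace "$NS" org=qe

    # Step 6: make both pods match the podSelector.
    run oc label pod hello-pod color=pink -n "$NS"
    run oc label pod hello-pod1 color=pink -n "$NS"

    # Step 8: remove the label from one pod only.
    run oc label pod hello-pod color- -n "$NS"

    # Check: the pod that keeps the label should still reach the endpoint.
    run oc exec -n "$NS" hello-pod1 -- curl 172.22.0.1 --connect-timeout 5
}

repro
```

Running it with `DRY_RUN=0` against a test cluster executes the commands. On an affected build, the final `curl` from hello-pod1 is the call that times out.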
Actual results:
Accessing the external endpoint from the pod that keeps the label times out:
% oc rsh -n hrw hello-pod1
~ $ curl 172.22.0.1 --connect-timeout 5
curl: (28) Connection timeout after 5000 ms
~ $ curl 172.22.0.1 --connect-timeout 5
curl: (28) Connection timeout after 5000 ms
Note: the label was removed from hello-pod, but the timeout occurs on the other pod, hello-pod1, which should still use the egress IP and be able to access the external endpoint.
Expected results:
hello-pod1, which keeps the label, should still be able to access the external endpoint.
Additional info:
Links to: RHEA-2024:0041, OpenShift Container Platform 4.16.z bug fix update