  1. OpenShift Bugs
  2. OCPBUGS-20220

[Multi-NIC] Egress traffic connection times out after removing another pod's label


    • Moderate
    • No
    • SDN Sprint 243, SDN Sprint 245, SDN Sprint 246, SDN Sprint 247, SDN Sprint 248, SDN Sprint 249
    • 6
    • Rejected
    • False

      Description of problem:

      [Multi-NIC] Egress traffic from a pod times out after the label is removed from another pod in the same namespace.
      
      

      Version-Release number of selected component (if applicable):

      4.14.0-0.nightly-2023-10-08-024357
      
      
      

      How reproducible:

      Always
      
      

      Steps to Reproduce:

      1. Label one node as the egress node.
      2. Create an EgressIP object. The egress IP is assigned to a secondary interface of the egress node.
      # oc get egressip -o yaml
      apiVersion: v1
      items:
      - apiVersion: k8s.ovn.org/v1
        kind: EgressIP
        metadata:
          annotations:
            kubectl.kubernetes.io/last-applied-configuration: |
              {"apiVersion":"k8s.ovn.org/v1","kind":"EgressIP","metadata":{"annotations":{},"name":"egressip-66293"},"spec":{"egressIPs":["172.22.0.190"],"namespaceSelector":{"matchLabels":{"org":"qe"}},"podSelector":{"matchLabels":{"color":"pink"}}}}
          creationTimestamp: "2023-10-08T07:28:04Z"
          generation: 2
          name: egressip-66293
          resourceVersion: "461590"
          uid: f1ca3483-63f1-4f31-99b0-e6a55161c285
        spec:
          egressIPs:
          - 172.22.0.190
          namespaceSelector:
            matchLabels:
              org: qe
          podSelector:
            matchLabels:
              color: pink
        status:
          items:
          - egressIP: 172.22.0.190
            node: worker-0
      kind: List
      metadata:
        resourceVersion: ""
      3. Create a namespace and two pods in it.
      % oc get pods -n hrw -o wide
      NAME         READY   STATUS    RESTARTS   AGE     IP            NODE       NOMINATED NODE   READINESS GATES
      hello-pod    1/1     Running   0          6m46s   10.129.2.7    worker-1   <none>           <none>
      hello-pod1   1/1     Running   0          6s      10.131.0.14   worker-0   <none>           <none>
      
      4. Add label org=qe to namespace hrw
      # oc get ns hrw --show-labels
      NAME   STATUS   AGE   LABELS
      hrw    Active   21m   kubernetes.io/metadata.name=hrw,org=qe,pod-security.kubernetes.io/audit-version=v1.24,pod-security.kubernetes.io/audit=restricted,pod-security.kubernetes.io/warn-version=v1.24,pod-security.kubernetes.io/warn=restricted
      
      5. At this point, access to the external endpoint succeeds from both pods.
      % oc rsh -n hrw hello-pod 
      ~ $ curl 172.22.0.1 --connect-timeout 5
      <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
      <html><head>
      <title>404 Not Found</title>
      </head><body>
      <h1>Not Found</h1>
      <p>The requested URL was not found on this server.</p>
      </body></html>
      ~ $ exit
       % oc rsh -n hrw hello-pod1 
      ~ $ curl 172.22.0.1 --connect-timeout 5
      <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
      <html><head>
      <title>404 Not Found</title>
      </head><body>
      <h1>Not Found</h1>
      <p>The requested URL was not found on this server.</p>
      </body></html>
      
      6. Add label color=pink to both pods
       % oc label pod hello-pod color=pink -n hrw
      pod/hello-pod labeled
       % oc label pod hello-pod1 color=pink -n hrw
      pod/hello-pod1 labeled
      
      7. Both pods can still access the external endpoint.
      8. Remove label color=pink from pod hello-pod
      % oc label pod hello-pod color- -n hrw     
      pod/hello-pod unlabeled
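
Steps 4 to 8 can be replayed with a small script. This is a minimal sketch, assuming the namespace, the two pods, and the EgressIP object from steps 1 to 3 already exist. The run helper and the DRY_RUN variable are hypothetical names introduced here so the commands can be previewed without touching a cluster; they are not part of oc. The sketch uses oc exec instead of the interactive oc rsh session shown above.

```shell
#!/bin/sh
# Sketch: replay steps 4-8 of the reproduction.
# Assumes namespace hrw, pods hello-pod and hello-pod1, and the EgressIP
# object already exist (steps 1-3). run and DRY_RUN are hypothetical helpers.
NS=hrw
TARGET=172.22.0.1

# Print the command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reproduce() {
    # Step 4: make the namespace match the EgressIP namespaceSelector.
    run oc label ns "$NS" org=qe
    # Step 6: make both pods match the EgressIP podSelector.
    run oc label pod hello-pod color=pink -n "$NS"
    run oc label pod hello-pod1 color=pink -n "$NS"
    # Step 8: remove the label from hello-pod only.
    run oc label pod hello-pod color- -n "$NS"
    # hello-pod1 still carries color=pink, so this request is expected to succeed.
    run oc exec -n "$NS" hello-pod1 -- curl "$TARGET" --connect-timeout 5
}

reproduce
```

By default the script only prints the commands. Setting DRY_RUN=0 in the environment runs them against the current cluster context.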
      
      
      
      

      Actual results:

      
      Accessing the external endpoint from the pod that keeps the label (hello-pod1) fails with a connection timeout:
       % oc rsh -n hrw hello-pod1            
      ~ $ curl 172.22.0.1 --connect-timeout 5
      curl: (28) Connection timeout after 5000 ms
      ~ $ 
      ~ $ 
      ~ $ curl 172.22.0.1 --connect-timeout 5
      curl: (28) Connection timeout after 5000 ms
      
      Note: the label was removed from hello-pod, but the timeout occurs on the other pod, hello-pod1, which still has the label. hello-pod1 should still use the egress IP and be able to reach the external endpoint.
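
To tell a one-off timeout from a persistent failure, the request from the pod that kept the label can be repeated and the failures counted. This is a minimal sketch; probe and count_failures are hypothetical helper names, and the retry count of 3 is an arbitrary default rather than anything taken from this report. An HTTP 404 response counts as success here, as in step 5, because it shows the endpoint is reachable.

```shell
#!/bin/sh
# Sketch: repeat the request from the pod that kept color=pink and count failures.
# probe and count_failures are hypothetical helper names, not part of oc.
NS=hrw
POD=hello-pod1
TARGET=172.22.0.1

# One request from inside the pod. curl exits non-zero on a connect timeout;
# a 404 response still exits 0 because -f is not used.
probe() {
    oc exec -n "$NS" "$POD" -- curl -s -o /dev/null --connect-timeout 5 "$TARGET"
}

# Run probe N times (default 3) and print how many attempts failed.
count_failures() {
    attempts=${1:-3}
    failed=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        probe || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed"
}
```

After step 8, calling count_failures 3 and getting 3 would match the behaviour reported here, while 0 is the expected result.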
      
      

      Expected results:

      The pod that keeps the label should still be able to access the external endpoint.
      
      

      Additional info:

      
      

            mkennell@redhat.com Martin Kennelly
            huirwang Huiran Wang
            Huiran Wang Huiran Wang