  1. OpenShift Bugs
  2. OCPBUGS-70108

Duplicate SNAT rules are created when IP addresses are added to a node interface


      Description of problem:

      1. The customer upgraded the cluster to 4.16.54 but still hits the issue: after deleting the EgressIP (EIP), rebuilding the OVN-Kubernetes DB, and re-applying the EIP, the nbdb shows stale SNAT rules.
      2. Comparing two clusters and reproducing the issue internally showed that the customer had enabled "OS to iDRAC Pass-through" on the worker nodes (Dell PowerEdge R750, certified). This gives each node a NIC such as "enp0s20f0u14u3". That NIC is periodically assigned an IPv6 address, and a stale SNAT rule is added along with each IP assignment.
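
To confirm which addresses the pass-through NIC currently holds, the one-line-per-address output of `ip -6 -o addr show` can be filtered by interface. The sketch below is illustrative only: the function name global_v6_addrs is hypothetical, and it assumes the usual `ip -o` field layout (index, interface, family, address, "scope", scope name).

```shell
# Print the global-scope IPv6 addresses of one interface.
# Reads `ip -6 -o addr show` output on stdin so it can be checked offline.
# Assumed field layout: "<idx>: <ifname> inet6 <addr>/<len> scope <scope> ..."
global_v6_addrs() {
  awk -v ifname="$1" '
    $2 == ifname && $3 == "inet6" && $5 == "scope" && $6 == "global" {
      print $4
    }'
}
```

On a node this would be used as `ip -6 -o addr show | global_v6_addrs enp0s20f0u14u3`; running it repeatedly should show whether the address on that NIC changes over time.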

      Version-Release number of selected component (if applicable):

       

      4.16.54, 4.18.29

       

      How reproducible:

      Always

       

      Steps to Reproduce:

      [From 4.16.54]

      1. Assign multiple EgressIPs to the cluster

      2. Check the nbdb for duplicate SNAT rules in the GR_<NODENAME> router.

      $ for i in $nodeList; do   echo -e "--node: $i---\n";   pod=$(oc get po -n openshift-ovn-kubernetes  -o wide --no-headers| grep "$i" | awk '{print $1}');   oc -n openshift-ovn-kubernetes exec -it $pod -c nbdb ovn-nbctl lr-nat-list GR_$i | tail -n +2 | awk '{print $3}' | sort | uniq -c | awk '$1 > 1'; done
      
      --node: gcgs-s5t6q-worker-0---
      
      kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
      --node: gcgs-s5t6q-worker-1---
      
      kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.

       

      3. Manually assign an IPv6 address to the worker node interface

      sh-5.1# ip -6 addr add fde1:53ba:e9a0:de11:d8b9:744:9a9d:6fab/64 dev enp3s0

       

      4. Check again; the duplicate SNAT rules now appear.

      $ for i in $nodeList; do   echo -e "--node: $i---\n";   pod=$(oc get po -n openshift-ovn-kubernetes  -o wide --no-headers| grep "$i" | awk '{print $1}');   oc -n openshift-ovn-kubernetes exec -it $pod -c nbdb ovn-nbctl lr-nat-list GR_$i | tail -n +2 | awk '{print $3}' | sort | uniq -c | awk '$1 > 1'; done
      --node: gcgs-s5t6q-worker-0---
      
      kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
      --node: gcgs-s5t6q-worker-1---
      
      kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
            2 10.128.2.13
            2 10.128.2.14
            2 10.128.2.15
            2 10.128.2.16
            2 10.128.2.17
            2 10.128.2.4
      # oc -n openshift-ovn-kubernetes exec -it $pod -c nbdb ovn-nbctl lr-nat-list GR_$i | grep 10.128.2.13
      kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
      snat                                   192.168.0.212                       10.128.2.13
      snat                                   192.168.0.248 <<<This is node IP    10.128.2.13

      5. After deleting the IP address, the duplicate SNAT rules are still present.

      $ for i in $nodeList; do   echo -e "--node: $i---\n";   pod=$(oc get po -n openshift-ovn-kubernetes  -o wide --no-headers| grep "$i" | awk '{print $1}');   oc -n openshift-ovn-kubernetes exec -it $pod -c nbdb ovn-nbctl lr-nat-list GR_$i | tail -n +2 | awk '{print $3}' | sort | uniq -c | awk '$1 > 1'; done
      --node: gcgs-s5t6q-worker-0---
      
      kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
      --node: gcgs-s5t6q-worker-1---
      
      kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
            2 10.128.2.13
            2 10.128.2.14
            2 10.128.2.15
            2 10.128.2.16
            2 10.128.2.17
            2 10.128.2.4 

      6. Only rebuilding the OVN-Kubernetes DB restores the correct rules.
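
The check used in the steps above counts repeated logical IPs but does not show which external IPs are involved. A minimal sketch that reports both is given here; the function name find_dup_snat is hypothetical, and it assumes each snat row has the three whitespace-separated fields seen in this report (type, external IP, logical IP).

```shell
# Read `ovn-nbctl lr-nat-list` rows on stdin and print every logical IP
# that has more than one snat row, followed by its external IPs.
# Assumed row layout (as in this report): "snat <external_ip> <logical_ip>"
find_dup_snat() {
  awk '
    $1 == "snat" && NF >= 3 {
      count[$3]++
      ext[$3] = (ext[$3] == "" ? $2 : ext[$3] "," $2)
    }
    END {
      for (ip in count)
        if (count[ip] > 1) print ip, ext[ip]
    }' | sort
}
```

It could be fed per node with `oc -n openshift-ovn-kubernetes exec "$pod" -c nbdb -- ovn-nbctl lr-nat-list "GR_$i" | find_dup_snat`; placing `--` before the command also avoids the kubectl deprecation warning seen in the transcripts.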

      [From 4.18.29] 

      The issue can also be reproduced in 4.18.29.
      Steps to reproduce:

      1. Create 10 namespaces

      $ for i in {1..10}; do   echo "---
      apiVersion: v1
      kind: Namespace
      metadata:
        name: eip$i
        labels:
          env: eip$i" >> all_namespaces.yaml; done
      oc apply -f all_namespaces.yaml

      2. Create 5 pods in each namespace, 50 pods in total.

      for i in {1..10}; do oc apply -f deploy.yaml -n eip$i; done

      3. Run the script below:

      https://docs.google.com/document/d/1lhrnoiRiv2SS1GUHQnIPdKUrHLiSFTRaYNJrQdZw5NQ/edit?usp=drive_link
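
The linked script is not reproduced here. For orientation only, the sketch below generates the Namespace manifests from step 1 and a matching EgressIP object per namespace. The function names and the egressip-<N> object name are hypothetical, the egress address is passed in by the caller, and the EgressIP layout assumes the k8s.ovn.org/v1 API with a namespaceSelector on the env label.

```shell
# Print a Namespace manifest labelled env=eip<N>, as in step 1.
gen_namespace() {
  cat <<EOF
---
apiVersion: v1
kind: Namespace
metadata:
  name: eip$1
  labels:
    env: eip$1
EOF
}

# Print an EgressIP manifest selecting namespace eip<N>.
# $1 = namespace index, $2 = egress IP chosen by the caller (assumption).
gen_egressip() {
  cat <<EOF
---
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip-$1
spec:
  egressIPs:
  - $2
  namespaceSelector:
    matchLabels:
      env: eip$1
EOF
}
```

Writing the result with a single redirect, for example `for i in $(seq 1 10); do gen_namespace "$i"; done > all_namespaces.yaml`, keeps the file from accumulating duplicate documents when the loop is rerun, which appending with `>>` would do.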

       

      Actual results:

      Duplicate SNAT rules

      Expected results:

      No duplicate SNAT rules, even when an IP address is assigned to the NIC

       

              huirwang Huiran Wang
              rhn-support-xingli Xingbin Li
              Anurag Saxena Anurag Saxena