OpenShift Bugs / OCPBUGS-45891

EgressIPs are not assigned to nodes following node scale down / scale up operation


    • Important

      Description of problem:

      EgressIPs are not assigned to nodes after a node scale-down / scale-up operation. The worker nodes are AWS m6a.16xlarge EC2 instances, each of which can host up to 49 IPv4 EgressIPs, as the node annotation shows:
      Annotations:        cloud.network.openshift.io/egress-ipconfig:
                            [{"interface":"eni-xxxx","ifaddr":{"ipv4":"10.x.x.0/18"},"capacity":{"ipv4":49,"ipv6":50}}]
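As a rough capacity check, the minimum number of nodes needed to hold the EgressIPs is a ceiling division. This is only a sketch: it assumes the IPv4 capacity of 49 from the annotation above applies to every worker and that all 76 EgressIPs created in the reproduction steps are IPv4. `min_nodes` is a hypothetical helper name, not part of any OpenShift tooling.

```shell
# Hypothetical helper: smallest number of nodes that can hold a given number
# of EgressIPs when every node has the same capacity (ceiling division).
min_nodes() {
  local eips=$1 capacity=$2
  echo $(( (eips + capacity - 1) / capacity ))
}

# 76 EgressIPs, 49 IPv4 slots per node
min_nodes 76 49
```

This prints 2. The report does not state the replica count of the machineset, but if it has at least two workers, per-node capacity alone would not explain the unassigned EgressIPs.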

       

      Version-Release number of selected component (if applicable):

          EgressIPs with OVN-Kubernetes

       

      Steps to Reproduce:

      1. Create a worker machineset using the m6a.16xlarge instance type.
      2. Create 76 EgressIP objects (10.0.40.5 through 10.0.40.80):
      
      for n in $(seq 5 80); do
      cat <<EOF | oc apply -f -
      apiVersion: k8s.ovn.org/v1
      kind: EgressIP
      metadata:
        name: egress-10.0.40.${n}
      spec:
        egressIPs:
        - 10.0.40.${n}
        namespaceSelector:
          matchLabels:
            env: eip
      EOF
      done
      
      3. Scale the machineset down to 0, then scale it back up to the previous number of replicas.
           

      Actual results:

       1. For a while there are cloudprivateipconfigs in "CloudResponsePending" state, but eventually most succeed:
      
      $ oc get cloudprivateipconfigs -o yaml | yq '.items[].status.conditions[].reason' | sort | uniq -c
           76 CloudResponseSuccess
      
      2. However, 3 EgressIPs remain unassigned:
      
      $ oc get eip --no-headers|awk '$3 == "" {print}'
      egress-10.0.40.10   10.0.40.10                                                  
      egress-10.0.40.13   10.0.40.13                                                  
      egress-10.0.40.15   10.0.40.15
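The check above can be wrapped in a small helper, for example to poll after the scale-up until nothing is left unassigned. This is a sketch under stated assumptions: `unassigned_eips` is a hypothetical name, it reads `oc get eip --no-headers` output on stdin, and it assumes an unassigned row has an empty third column, as in the listing above. The assigned row and the node name `node-a` in the example are made up for illustration.

```shell
# Hypothetical helper: print the names of EgressIP objects with no assigned node.
# Reads `oc get eip --no-headers` output on stdin. An unassigned row only has
# the name and address columns populated, so field 3 is empty.
unassigned_eips() {
  awk 'NF > 0 && $3 == "" { print $1 }'
}

# Example with one assigned row (illustrative) and one unassigned row.
printf '%s\n' \
  'egress-10.0.40.9    10.0.40.9    node-a   10.0.40.9' \
  'egress-10.0.40.10   10.0.40.10' | unassigned_eips
```

The example prints `egress-10.0.40.10`. Against a live cluster the same filter would be fed with `oc get eip --no-headers | unassigned_eips`.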
      
      3. cloud-network-config-controller hits "context deadline exceeded" for 10.0.40.15, which is not assigned (see above):
      
      $ oc logs -n openshift-cloud-network-config-controller deploy/cloud-network-config-controller |grep 40.15
      I1206 13:20:18.154263       1 controller.go:182] Assigning key: 10.0.40.15 to cloud-private-ip-config workqueue
      I1206 13:20:18.240415       1 controller.go:182] Assigning key: ip-10-0-44-127.eu-central-1.compute.internal to node workqueue
      I1206 13:20:20.695067       1 cloudprivateipconfig_controller.go:439] Added IP address to node: "ip-10-0-32-145.eu-central-1.compute.internal" for CloudPrivateIPConfig: "10.0.40.15"
      I1206 13:20:22.342670       1 controller.go:160] Dropping key '10.0.40.15' from the cloud-private-ip-config workqueue
      E1206 13:20:49.140656       1 controller.go:165] error syncing '10.0.40.15': Get "https://api-int.bverschu.emea.aws.cee.support:6443/apis/cloud.network.openshift.io/v1/cloudprivateipconfigs/10.0.40.15": context deadline exceeded, requeuing in cloud-private-ip-config workqueue
      E1206 13:21:10.340015       1 controller.go:165] error syncing '10.0.40.68': error updating CloudPrivateIPConfig: "10.0.40.68", err: Put "https://api-int.bverschu.emea.aws.cee.support:6443/apis/cloud.network.openshift.io/v1/cloudprivateipconfigs/10.0.40.68/status": context deadline exceeded, requeuing in cloud-private-ip-config workqueue
      E1206 13:21:14.539786       1 controller.go:165] error syncing '10.0.40.15': Get "https://api-int.bverschu.emea.aws.cee.support:6443/apis/cloud.network.openshift.io/v1/cloudprivateipconfigs/10.0.40.15": context deadline exceeded, requeuing in cloud-private-ip-config workqueue
      I1206 13:21:39.339112       1 controller.go:160] Dropping key '10.0.40.15' from the cloud-private-ip-config workqueue   
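To see which keys are affected without grepping one address at a time, the controller log can be tallied by key. This is a sketch, not part of the controller or of `oc`: `deadline_errors_by_key` is a hypothetical helper, and it assumes the `error syncing '<key>': ... context deadline exceeded` line format shown in the excerpt above.

```shell
# Hypothetical helper: count "context deadline exceeded" sync errors per key.
# Reads controller log lines on stdin. Splitting on single quotes makes the
# second field the key from "error syncing '<key>'".
deadline_errors_by_key() {
  awk -F"'" '/error syncing/ && /context deadline exceeded/ { n[$2]++ }
             END { for (k in n) print n[k], k }' | sort -k2
}

# Usage against a live cluster (same log source as in the report):
# oc logs -n openshift-cloud-network-config-controller \
#   deploy/cloud-network-config-controller | deadline_errors_by_key
```

Applied to the excerpt above, this reports 2 errors for 10.0.40.15 and 1 for 10.0.40.68. The 10.0.40.68 line appears in the excerpt only because the unescaped pattern `40.15` also matches its timestamp `13:21:10.340015`; `grep -F 10.0.40.15` would match the address alone.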

      Expected results:

      All EgressIPs are assigned to nodes.    

      Additional info:

          

              sdn-team-bot sdn-team bot
              rhn-support-ekasprzy Emmanuel Kasprzyk
              Anurag Saxena Anurag Saxena