OCPBUGS-3942

Whereabouts CNI times out while iterating exclude range [backport 4.11]


    • Bug
    • Resolution: Done
    • Normal
    • None
    • 4.10.z
    • Networking / multus
    • None
    • Important
    • Rejected
    • False

      Description of problem:

      When a pod is created with an additional network whose `spec.config.ipam.exclude` range is set, every address within the excluded range is still iterated while searching for a suitable IP candidate. As a result, pod creation times out when large exclude ranges are used.
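
The difference between stepping through an excluded range and jumping over it can be sketched as follows. This is a minimal illustration in Python, not the whereabouts implementation (which is written in Go); the function name `first_free_ip` and its parameters are hypothetical, and treating the last address of the range as allocatable is a simplifying assumption.

```python
import ipaddress


def first_free_ip(range_cidr, exclude_cidrs, reserved=()):
    """Return the first usable address in range_cidr, or None if none is left.

    Hypothetical sketch: an excluded subnet is skipped in a single jump
    instead of testing each address inside it.
    """
    network = ipaddress.ip_network(range_cidr)
    excludes = [ipaddress.ip_network(c) for c in exclude_cidrs]
    taken = {ipaddress.ip_address(a) for a in reserved}

    last = network.broadcast_address
    candidate = network.network_address + 1  # skip the network address itself

    while candidate <= last:
        # Find an excluded subnet that contains the current candidate, if any.
        hit = next((e for e in excludes if candidate in e), None)
        if hit is not None:
            if hit.broadcast_address >= last:
                return None  # the exclusion covers the rest of the range
            # One jump past the whole excluded subnet.
            candidate = hit.broadcast_address + 1
            continue
        if candidate not in taken:
            return candidate
        if candidate == last:
            return None
        candidate += 1
    return None


if __name__ == "__main__":
    print(first_free_ip("fd43:01f1:3daa:0baa::/64", ["fd43:01f1:3daa:0baa::/100"]))
```

With the range and exclude values from the reproducer, the loop body runs only twice before returning, regardless of how many addresses the excluded subnet contains.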

      Version-Release number of selected component (if applicable):

       

      How reproducible:

      100% with large exclude ranges

      Steps to Reproduce:

      1. Create a NetworkAttachmentDefinition with a large exclude range:

      $ cat <<EOF | oc apply -f -
      apiVersion: k8s.cni.cncf.io/v1
      kind: NetworkAttachmentDefinition
      metadata:
        name: nad-w-excludes
      spec:
        config: |-
          {
            "cniVersion": "0.3.1",
            "name": "macvlan-net",
            "type": "macvlan",
            "master": "ens3",
            "mode": "bridge",
            "ipam": {
               "type": "whereabouts",
               "range": "fd43:01f1:3daa:0baa::/64",
               "exclude": [ "fd43:01f1:3daa:0baa::/100" ],
               "log_file": "/tmp/whereabouts.log",
               "log_level" : "debug"
            }
          }
      EOF
      2. Create a pod with the network attached:

      $ cat <<EOF | oc apply -f -
      apiVersion: v1
      kind: Pod
      metadata:
        name: pod-with-exclude-range
        annotations:
          k8s.v1.cni.cncf.io/networks: nad-w-excludes
      spec:
        containers:
        - name: pod-1
          image: openshift/hello-openshift
      EOF
      
      3. After a while, check the pod status, the event log, and the whereabouts logs:
      
      $ oc get pods
      NAME                        READY   STATUS              RESTARTS   AGE
      pod-with-exclude-range      0/1     ContainerCreating   0          2m23s
      
      $ oc get events
      <...>
      6m39s       Normal    Scheduled                                    pod/pod-with-exclude-range                   Successfully assigned default/pod-with-exclude-range to <worker-node>
      6m37s       Normal    AddedInterface                               pod/pod-with-exclude-range                   Add eth0 [10.129.2.49/23] from openshift-sdn
      2m39s       Warning   FailedCreatePodSandBox                       pod/pod-with-exclude-range                   Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded
      
      $ oc debug node/<worker-node> -- tail /host/tmp/whereabouts.log
      Starting pod/<worker-node>-debug ...
      To use host binaries, run `chroot /host`
      2022-10-27T14:14:50Z [debug] Finished leader election
      2022-10-27T14:14:50Z [debug] IPManagement: {fd43:1f1:3daa:baa::1 ffffffffffffffff0000000000000000} , <nil>
      2022-10-27T14:14:59Z [debug] Used defaults from parsed flat file config @ /etc/kubernetes/cni/net.d/whereabouts.d/whereabouts.conf
      2022-10-27T14:14:59Z [debug] ADD - IPAM configuration successfully read: {Name:macvlan-net Type:whereabouts Routes:[] Datastore:kubernetes Addresses:[] OmitRanges:[fd43:01f1:3daa:0baa::/80] DNS: {Nameservers:[] Domain: Search:[] Options:[]} Range:fd43:1f1:3daa:baa::/64 RangeStart:fd43:1f1:3daa:baa:: RangeEnd:<nil> GatewayStr: EtcdHost: EtcdUsername: EtcdPassword:********* EtcdKeyFile: EtcdCertFile: EtcdCACertFile: LeaderLeaseDuration:1500 LeaderRenewDeadline:1000 LeaderRetryPeriod:500 LogFile:/tmp/whereabouts.log LogLevel:debug OverlappingRanges:true SleepForRace:0 Gateway:<nil> Kubernetes: {KubeConfigPath:/etc/kubernetes/cni/net.d/whereabouts.d/whereabouts.kubeconfig K8sAPIRoot:} ConfigurationPath:PodName:pod-with-exclude-range PodNamespace:default} 
      2022-10-27T14:14:59Z [debug] Beginning IPAM for ContainerID: f4ffd0e07d6c1a2b6ffb0fa29910c795258792bb1a1710ff66f6b48fab37af82
      2022-10-27T14:14:59Z [debug] Started leader election
      2022-10-27T14:14:59Z [debug] OnStartedLeading() called
      2022-10-27T14:14:59Z [debug] Elected as leader, do processing
      2022-10-27T14:14:59Z [debug] IPManagement - mode: 0 / containerID:f4ffd0e07d6c1a2b6ffb0fa29910c795258792bb1a1710ff66f6b48fab37af82 / podRef: default/pod-with-exclude-range
      2022-10-27T14:14:59Z [debug] IterateForAssignment input >> ip: fd43:1f1:3daa:baa:: | ipnet: {fd43:1f1:3daa:baa:: ffffffffffffffff0000000000000000} | first IP: fd43:1f1:3daa:baa::1 | last IP: fd43:1f1:3daa:baa:ffff:ffff:ffff:ffff
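
To put the size of the excluded range in perspective, the number of addresses it contains can be computed directly. The report shows two prefix lengths (/100 in the NetworkAttachmentDefinition, /80 in the logged OmitRanges), so the sketch below evaluates both. The checking rate is a purely hypothetical figure chosen for illustration, not a measurement of whereabouts.

```python
import ipaddress

# Both prefix lengths appear in the report: /100 in the
# NetworkAttachmentDefinition, /80 in the logged OmitRanges.
EXCLUDES = ["fd43:01f1:3daa:0baa::/100", "fd43:01f1:3daa:0baa::/80"]

# Hypothetical rate, for illustration only; not measured.
ASSUMED_CHECKS_PER_SECOND = 1_000_000


def excluded_addresses(cidr):
    """Number of addresses contained in the given CIDR block."""
    return ipaddress.ip_network(cidr).num_addresses


def seconds_to_step_through(cidr, checks_per_second=ASSUMED_CHECKS_PER_SECOND):
    """Time to visit every address in the block at the assumed rate."""
    return excluded_addresses(cidr) / checks_per_second


if __name__ == "__main__":
    for cidr in EXCLUDES:
        count = excluded_addresses(cidr)
        secs = seconds_to_step_through(cidr)
        print(f"{cidr}: {count} addresses, about {secs:.0f} s at the assumed rate")
```

Even at the assumed rate, the /100 block alone takes minutes to step through, which is on the order of the roughly four minutes that pass between the AddedInterface and FailedCreatePodSandBox events above.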

      Actual results:

      Failed to create pod sandbox: rpc error: code = DeadlineExceeded desc = context deadline exceeded

      Expected results:

      The additional network is attached to the pod.

      Additional info:

            nsimha@redhat.com Nikhil Simha (Inactive)
            rhn-support-bverschu Bram Verschueren
            Weibin Liang Weibin Liang