OCPBUGS-11828

IP conflict while recreating Pod with fixed name


      This bug is a backport clone of [Bugzilla Bug 1983056](https://bugzilla.redhat.com/show_bug.cgi?id=1983056). The following is the description of the original bug:

      Description of problem:

      During the upgrade from 4.5.40 to 4.6.31, the CNI keeps restarting because it is unable to plug the provided VIF, which is already in use by another Pod.

      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service [-] Error when processing addNetwork request. CNI Params: {'CNI_IFNAME': 'eth0', 'CNI_NETNS': '/var/run/netns/0420f2a3-d2fe-40e6-86f0-9a38a17c933a', 'CNI_PATH': '/opt/multus/bin:/var/lib/cni/bin:/usr/libexec/cni', 'CNI_COMMAND': 'ADD', 'CNI_CONTAINERID': '73eee9240ae6bcfec8b539fa2b12c8e82f51f8a95f29aaaedc95e4e05f7cb734', 'CNI_ARGS': 'IgnoreUnknown=true;K8S_POD_NAMESPACE=openshift-monitoring;K8S_POD_NAME=prometheus-k8s-0;K8S_POD_INFRA_CONTAINER_ID=73eee9240ae6bcfec8b539fa2b12c8e82f51f8a95f29aaaedc95e4e05f7cb734'}: pyroute2.netlink.exceptions.NetlinkError: (17, 'File exists')
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service Traceback (most recent call last):
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/cni/daemon/service.py", line 82, in add
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service vif = self.plugin.add(params)
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/cni/plugins/k8s_cni_registry.py", line 75, in add
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service vifs = self._do_work(params, b_base.connect, timeout)
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/cni/plugins/k8s_cni_registry.py", line 184, in _do_work
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service container_id=params.CNI_CONTAINERID)
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/cni/binding/base.py", line 156, in connect
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service driver.connect(vif, ifname, netns, container_id)
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/cni/binding/nested.py", line 126, in connect
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service iface.net_ns_fd = utils.convert_netns(netns)
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/pyroute2/ipdb/transactional.py", line 209, in __exit__
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service self.commit()
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/pyroute2/ipdb/interfaces.py", line 650, in commit
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service raise newif
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/pyroute2/ipdb/interfaces.py", line 589, in commit
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service self.nl.link('add', **request)
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/pyroute2/iproute/linux.py", line 1163, in link
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service msg_flags=msg_flags)
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 373, in nlm_request
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service return tuple(self._genlm_request(*argv, **kwarg))
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 864, in nlm_request
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service callback=callback):
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 376, in get
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service return tuple(self._genlm_get(*argv, **kwarg))
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service File "/usr/lib/python3.6/site-packages/pyroute2/netlink/nlsocket.py", line 701, in get
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service raise msg['header']['error']
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service pyroute2.netlink.exceptions.NetlinkError: (17, 'File exists')
      2021-07-16 10:55:02.580 232 ERROR kuryr_kubernetes.cni.daemon.service
      2021-07-16 10:55:02.585 232 INFO werkzeug [-] 127.0.0.1 - - [16/Jul/2021 10:55:02] "POST /addNetwork HTTP/1.1" 500 -
      2021-07-16 10:55:02.656 251 INFO os_vif [-] Successfully unplugged vif VIFVlanNested(active=True,address=fa:16:3e:c1:cd:25,has_traffic_filtering=False,id=88bdb7f9-65e6-4c54-83d1-73341876da08,network=Network(cc5c0761-5f89-42b8-a4fc-0d829eba818d),plugin='noop',port_profile=<?>,preserve_on_delete=False,vif_name='tap88bdb7f9-65',vlan_id=2482)

      The prometheus-k8s-0 Pod is annotated with the same VIF (IP 10.128.9.175) that the alertmanager-main-2 Pod is actually using, while alertmanager-main-2 is using an IP different from the one specified in its own kuryr-vif annotation (10.128.9.238):

      [stack@undercloud-0 ~]$ oc get po prometheus-k8s-0 -n openshift-monitoring -o yaml
      apiVersion: v1
      kind: Pod
      metadata:
      annotations:
      openshift.io/scc: anyuid
      openstack.org/kuryr-pod-label: '{"app": "prometheus", "controller-revision-hash": "prometheus-k8s-5949f47544", "prometheus": "k8s", "statefulset.kubernetes.io/pod-name": "prometheus-k8s-0"}'
      openstack.org/kuryr-vif: '{"versioned_object.changes": ["default_vif"], "versioned_object.data":
      {"additional_vifs": {}, "default_vif": {"versioned_object.changes": ["has_traffic_filtering",
      "plugin", "active", "vif_name", "preserve_on_delete", "network", "id", "address",
      "vlan_id"], "versioned_object.data": {"active": true, "address": "fa:16:3e:c1:cd:25",
      "has_traffic_filtering": false, "id": "88bdb7f9-65e6-4c54-83d1-73341876da08",
      "network": {"versioned_object.changes": ["mtu", "multi_host", "subnets", "label",
      "id", "should_provide_bridge", "should_provide_vlan"], "versioned_object.data":
      {"id": "cc5c0761-5f89-42b8-a4fc-0d829eba818d", "label": "ns/openshift-monitoring-net",
      "mtu": 1442, "multi_host": false, "should_provide_bridge": false, "should_provide_vlan":
      false, "subnets": {"versioned_object.changes": ["objects"], "versioned_object.data":
      {"objects": [{"versioned_object.changes": ["ips", "gateway", "routes", "cidr",
      "dns"], "versioned_object.data": {"cidr": "10.128.8.0/23", "dns": [], "gateway":
      "10.128.8.1", "ips": {"versioned_object.changes": ["objects"], "versioned_object.data":
      {"objects": [{"versioned_object.changes": ["address"], "versioned_object.data":
      {"address": "10.128.9.175"}, "versioned_object.name": "FixedIP", "versioned_object.namespace":
      "os_vif", "versioned_object.version": "1.0"}]}, "versioned_object.name": "FixedIPList",
      "versioned_object.namespace": "os_vif", "versioned_object.version": "1.0"},
      "routes": {"versioned_object.changes": ["objects"], "versioned_object.data":
      {"objects": []}, "versioned_object.name": "RouteList", "versioned_object.namespace":
      "os_vif", "versioned_object.version": "1.0"}}, "versioned_object.name": "Subnet",
      "versioned_object.namespace": "os_vif", "versioned_object.version": "1.0"}]},
      "versioned_object.name": "SubnetList", "versioned_object.namespace": "os_vif",
      "versioned_object.version": "1.0"}}, "versioned_object.name": "Network", "versioned_object.namespace":
      "os_vif", "versioned_object.version": "1.1"}, "plugin": "noop", "preserve_on_delete":
      false, "vif_name": "tap88bdb7f9-65", "vlan_id": 2482}, "versioned_object.name":
      "VIFVlanNested", "versioned_object.namespace": "os_vif", "versioned_object.version":
      "1.0"}}, "versioned_object.name": "PodState", "versioned_object.namespace":
      "os_vif", "versioned_object.version": "1.0"}'
      creationTimestamp: "2021-07-15T12:24:52Z"
      generateName: prometheus-k8s-
      labels:
      app: prometheus
      controller-revision-hash: prometheus-k8s-5949f47544
      prometheus: k8s
      statefulset.kubernetes.io/pod-name: prometheus-k8s-0
      name: prometheus-k8s-0
      namespace: openshift-monitoring
      ownerReferences:
      - apiVersion: apps/v1
        blockOwnerDeletion: true
        controller: true
        kind: StatefulSet
        name: prometheus-k8s
        uid: 08334f30-2552-499b-9245-e4f61fe92a76
      resourceVersion: "112100"
      selfLink: /api/v1/namespaces/openshift-monitoring/pods/prometheus-k8s-0
      uid: 1087ca8f-9f00-486a-8471-60956e9c27a4

      [stack@undercloud-0 ~]$ oc get po -A -o wide |grep 10.128.9.175
      openshift-monitoring alertmanager-main-2 5/5 Running 0 22h 10.128.9.175 ostest-f57bt-worker-vprrk <none> <none>
      [stack@undercloud-0 ~]$ oc get po alertmanager-main-2 -n openshift-monitoring -o yaml
      apiVersion: v1
      kind: Pod
      metadata:
      annotations:
      k8s.v1.cni.cncf.io/network-status: |-
      [{
      "name": "kuryr",
      "interface": "eth0",
      "ips": [
      "10.128.9.175"
      ],
      "mac": "fa:16:3e:c1:cd:25",
      "default": true,
      "dns": {}
      }]
      k8s.v1.cni.cncf.io/networks-status: |-
      [{
      "name": "kuryr",
      "interface": "eth0",
      "ips": [
      "10.128.9.175"
      ],
      "mac": "fa:16:3e:c1:cd:25",
      "default": true,
      "dns": {}
      }]
      openshift.io/scc: anyuid
      openstack.org/kuryr-pod-label: '{"alertmanager": "main", "app": "alertmanager", "controller-revision-hash": "alertmanager-main-5548759bbd", "statefulset.kubernetes.io/pod-name": "alertmanager-main-2"}'
      openstack.org/kuryr-vif: '{"versioned_object.changes": ["default_vif"], "versioned_object.data":
      {"additional_vifs": {}, "default_vif": {"versioned_object.changes": ["active",
      "has_traffic_filtering", "network", "address", "id", "preserve_on_delete", "vlan_id",
      "plugin", "vif_name"], "versioned_object.data": {"active": true, "address":
      "fa:16:3e:77:a3:12", "has_traffic_filtering": false, "id": "f6dd52db-40e1-4339-a7e6-1e2bd2f6f772",
      "network": {"versioned_object.changes": ["multi_host", "label", "should_provide_vlan",
      "should_provide_bridge", "mtu", "id", "subnets"], "versioned_object.data": {"id":
      "cc5c0761-5f89-42b8-a4fc-0d829eba818d", "label": "ns/openshift-monitoring-net",
      "mtu": 1442, "multi_host": false, "should_provide_bridge": false, "should_provide_vlan":
      false, "subnets": {"versioned_object.changes": ["objects"], "versioned_object.data":
      {"objects": [{"versioned_object.changes": ["routes", "dns", "cidr", "gateway",
      "ips"], "versioned_object.data": {"cidr": "10.128.8.0/23", "dns": [], "gateway":
      "10.128.8.1", "ips": {"versioned_object.changes": ["objects"], "versioned_object.data":
      {"objects": [{"versioned_object.changes": ["address"], "versioned_object.data":
      {"address": "10.128.9.238"}, "versioned_object.name": "FixedIP", "versioned_object.namespace":
      "os_vif", "versioned_object.version": "1.0"}]}, "versioned_object.name": "FixedIPList",
      "versioned_object.namespace": "os_vif", "versioned_object.version": "1.0"},
      "routes": {"versioned_object.changes": ["objects"], "versioned_object.data":
      {"objects": []}, "versioned_object.name": "RouteList", "versioned_object.namespace":
      "os_vif", "versioned_object.version": "1.0"}}, "versioned_object.name": "Subnet",
      "versioned_object.namespace": "os_vif", "versioned_object.version": "1.0"}]},
      "versioned_object.name": "SubnetList", "versioned_object.namespace": "os_vif",
      "versioned_object.version": "1.0"}}, "versioned_object.name": "Network", "versioned_object.namespace":
      "os_vif", "versioned_object.version": "1.1"}, "plugin": "noop", "preserve_on_delete":
      false, "vif_name": "tapf6dd52db-40", "vlan_id": 3914}, "versioned_object.name":
      "VIFVlanNested", "versioned_object.namespace": "os_vif", "versioned_object.version":
      "1.0"}}, "versioned_object.name": "PodState", "versioned_object.namespace":
      "os_vif", "versioned_object.version": "1.0"}'
      creationTimestamp: "2021-07-15T12:23:41Z"
      generateName: alertmanager-main-
      labels:
      alertmanager: main
      app: alertmanager
      controller-revision-hash: alertmanager-main-5548759bbd
      statefulset.kubernetes.io/pod-name: alertmanager-main-2
      name: alertmanager-main-2
      namespace: openshift-monitoring

      (shiftstack) [stack@undercloud-0 ~]$ openstack port list |grep 10.128.9.175

      88bdb7f9-65e6-4c54-83d1-73341876da08   fa:16:3e:c1:cd:25 ip_address='10.128.9.175', subnet_id='a4ee6044-8ddd-4dbf-bcd3-22f95ec4ce16' ACTIVE

      (shiftstack) [stack@undercloud-0 ~]$ openstack port list |grep 10.128.9.238

      f6dd52db-40e1-4339-a7e6-1e2bd2f6f772   fa:16:3e:77:a3:12 ip_address='10.128.9.238', subnet_id='a4ee6044-8ddd-4dbf-bcd3-22f95ec4ce16' ACTIVE

      (shiftstack) [stack@undercloud-0 ~]$ oc get co
      NAME                                       VERSION   AVAILABLE   PROGRESSING   DEGRADED   SINCE
      authentication                             4.6.31    True        False         False      22h
      cloud-credential                           4.6.31    True        False         False      26h
      cluster-autoscaler                         4.6.31    True        False         False      25h
      config-operator                            4.6.31    True        False         False      25h
      console                                    4.6.31    True        False         False      22h
      csi-snapshot-controller                    4.6.31    True        False         False      25h
      dns                                        4.5.40    True        False         False      25h
      etcd                                       4.6.31    True        False         False      25h
      image-registry                             4.6.31    True        False         False      25h
      ingress                                    4.6.31    True        False         False      22h
      insights                                   4.6.31    True        False         False      25h
      kube-apiserver                             4.6.31    True        False         False      25h
      kube-controller-manager                    4.6.31    True        False         False      25h
      kube-scheduler                             4.6.31    True        False         False      25h
      kube-storage-version-migrator              4.6.31    True        False         False      25h
      machine-api                                4.6.31    True        False         False      25h
      machine-approver                           4.6.31    True        False         False      25h
      machine-config                             4.5.40    True        False         False      23h
      marketplace                                4.6.31    True        False         False      22h
      monitoring                                 4.5.40    False       True          True       22h
      network                                    4.5.40    True        True          False      25h
      node-tuning                                4.6.31    True        False         False      22h
      openshift-apiserver                        4.6.31    True        False         False      25h
      openshift-controller-manager               4.6.31    True        False         False      22h
      openshift-samples                          4.6.31    True        False         False      22h
      operator-lifecycle-manager                 4.6.31    True        False         False      25h
      operator-lifecycle-manager-catalog         4.6.31    True        False         False      25h
      operator-lifecycle-manager-packageserver   4.6.31    True        False         False      22h
      service-ca                                 4.6.31    True        False         False      25h
      storage                                    4.6.31    True        False         False      22h

      (shiftstack) [stack@undercloud-0 ~]$ oc get po -A -o wide |grep 10.128.9.238 |wc -l
      0

      The same issue could also occur on 3.11, as it likewise relies on Pod annotations.
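
The manual comparison above (the IP in the `openstack.org/kuryr-vif` annotation versus the IP reported in `k8s.v1.cni.cncf.io/network-status`) could be automated along the following lines. This is only a minimal sketch, not part of Kuryr: the function names are hypothetical, the annotation keys and nesting are taken from the Pod output shown in this report, and the input is assumed to be the list of Pod objects from `oc get po -A -o json`.

```python
import json

# Annotation keys as they appear in the Pod YAML in this report.
VIF_ANNOTATION = "openstack.org/kuryr-vif"
STATUS_ANNOTATION = "k8s.v1.cni.cncf.io/network-status"


def _data(obj):
    """Unwrap the payload of a serialized versioned object."""
    return obj["versioned_object.data"]


def annotated_ips(pod):
    """IPs recorded for the default VIF in the kuryr-vif annotation."""
    raw = pod["metadata"].get("annotations", {}).get(VIF_ANNOTATION)
    if raw is None:
        return set()
    # PodState -> default_vif -> network -> subnets -> ips -> address
    vif = _data(_data(json.loads(raw))["default_vif"])
    subnets = _data(_data(vif["network"])["subnets"])["objects"]
    return {
        _data(ip)["address"]
        for subnet in subnets
        for ip in _data(_data(subnet)["ips"])["objects"]
    }


def reported_ips(pod):
    """IPs the CNI reported in the network-status annotation."""
    raw = pod["metadata"].get("annotations", {}).get(STATUS_ANNOTATION)
    if raw is None:
        return set()
    return {ip for net in json.loads(raw) for ip in net.get("ips", [])}


def find_mismatches(pods):
    """Map Pod name -> (annotated, reported) where the two IP sets differ."""
    mismatches = {}
    for pod in pods:
        annotated, reported = annotated_ips(pod), reported_ips(pod)
        # Pods missing either annotation are skipped, not flagged.
        if annotated and reported and annotated != reported:
            mismatches[pod["metadata"]["name"]] = (annotated, reported)
    return mismatches
```

Given the saved output of `oc get po -A -o json`, passing its `items` list to `find_mismatches` would be expected to flag a Pod such as alertmanager-main-2 here, whose annotation records 10.128.9.238 while the interface reports 10.128.9.175.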

      Version-Release number of selected component (if applicable):

      Red Hat OpenStack Platform release 16.1.6 GA
      How reproducible:

      Steps to Reproduce:
      1.
      2.
      3.

      Actual results:

      Expected results:

      Additional info:

            mdulko Michał Dulko
            openshift-crt-jira-prow OpenShift Prow Bot
            Ramón Lobillo Ramón Lobillo