-
Bug
-
Resolution: Duplicate
-
Major
-
None
-
4.12.z
-
None
-
Moderate
-
No
-
False
-
Description of problem:
We have noticed that when a pod that consumes additional networks through NetworkAttachmentDefinitions is deleted, the following warning event is emitted and the pod is deleted anyway:
48m Warning IPAddressGarbageCollectionFailed pod/helloworld-74bc99864b-98x2f failed to garbage collect addresses for pod bug-address-garbage-collection/helloworld-74bc99864b-98x2f
The whereabouts-reconciler pod logs also contain errors showing that the reconciler is unable to clean up the addresses.
Version-Release number of selected component (if applicable):
4.12.x
How reproducible: Every time
We can reproduce this issue with the steps below:
1. Create a net-attach-def (NetworkAttachmentDefinition) with 5 IPs in its range.
2. Confirm that the whereabouts-reconciler pods are available in the openshift-multus namespace.
3. Create a Deployment with 2 replicas that uses the same net-attach-def.
4. Restart one of the pods and check the logs of the whereabouts-reconciler pod on the same node.
5. The reconciler pod logs show the error messages below.
6. Although these errors do not cause a functional problem, they are misleading.
~~~
[quickcluster@upi-0 nadtesting]$ cat nad.yaml
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: macvlan-net-attach1
spec:
config: '{
"cniVersion": "0.3.1",
"type": "macvlan",
"master": "br-ex",
"mode": "bridge",
"ipam":
}'
~~~
~~~
[quickcluster@upi-0 nadtesting]$ oc get pods -n nadtesting -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
deployment1-0 1/1 Running 0 79s 10.128.2.96 worker-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
deployment1-1 1/1 Running 0 78s 10.129.2.10 worker-0.ketanl.lab.psi.pnq2.redhat.com <none> <none>
[quickcluster@upi-0 nadtesting]$ oc delete pod deployment1-1 -n nadtesting
pod "deployment1-1" deleted
~~~
~~~
[quickcluster@upi-0 nadtesting]$ oc get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
multus-5phgg 1/1 Running 5 (20h ago) 6d3h 10.74.210.135 master-1.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-5st4p 1/1 Running 1 6d3h 10.74.208.133 worker-1.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-97mfn 1/1 Running 16 (20h ago) 6d3h 10.74.212.72 master-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-additional-cni-plugins-4svgp 1/1 Running 1 6d3h 10.74.212.72 master-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-additional-cni-plugins-krsnn 1/1 Running 1 6d3h 10.74.208.133 worker-1.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-additional-cni-plugins-qww45 1/1 Running 2 6d3h 10.74.209.119 worker-0.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-additional-cni-plugins-rlsgw 1/1 Running 1 6d3h 10.74.212.93 worker-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-additional-cni-plugins-s8z72 1/1 Running 1 6d3h 10.74.210.230 master-0.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-additional-cni-plugins-wfdwt 1/1 Running 1 6d3h 10.74.210.135 master-1.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-admission-controller-c7c5656f6-6xgrv 2/2 Running 0 20h 10.130.0.56 master-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-admission-controller-c7c5656f6-89nnh 2/2 Running 0 20h 10.130.0.55 master-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-ffvj9 1/1 Running 1 6d3h 10.74.212.93 worker-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-nf2j8 1/1 Running 1 6d3h 10.74.210.230 master-0.ketanl.lab.psi.pnq2.redhat.com <none> <none>
multus-tt2p8 1/1 Running 2 6d3h 10.74.209.119 worker-0.ketanl.lab.psi.pnq2.redhat.com <none> <none>
network-metrics-daemon-622bg 2/2 Running 2 6d3h 10.128.2.3 worker-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
network-metrics-daemon-68wqh 2/2 Running 4 6d3h 10.129.2.4 worker-0.ketanl.lab.psi.pnq2.redhat.com <none> <none>
network-metrics-daemon-bbh82 2/2 Running 2 6d3h 10.131.0.4 worker-1.ketanl.lab.psi.pnq2.redhat.com <none> <none>
network-metrics-daemon-kn5dd 2/2 Running 2 6d3h 10.129.0.4 master-1.ketanl.lab.psi.pnq2.redhat.com <none> <none>
network-metrics-daemon-rj9x5 2/2 Running 2 6d3h 10.130.0.3 master-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
network-metrics-daemon-sl7lk 2/2 Running 2 6d3h 10.128.0.4 master-0.ketanl.lab.psi.pnq2.redhat.com <none> <none>
whereabouts-reconciler-5mg5z 1/1 Running 0 2m49s 10.74.208.133 worker-1.ketanl.lab.psi.pnq2.redhat.com <none> <none>
whereabouts-reconciler-8kfdr 1/1 Running 0 2m49s 10.74.210.135 master-1.ketanl.lab.psi.pnq2.redhat.com <none> <none>
whereabouts-reconciler-gcrpl 1/1 Running 0 2m49s 10.74.210.230 master-0.ketanl.lab.psi.pnq2.redhat.com <none> <none>
whereabouts-reconciler-hlhpc 1/1 Running 0 2m49s 10.74.212.93 worker-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
whereabouts-reconciler-j9pkk 1/1 Running 0 2m49s 10.74.212.72 master-2.ketanl.lab.psi.pnq2.redhat.com <none> <none>
*whereabouts-reconciler-xsrbw 1/1 Running 0 2m49s 10.74.209.119 worker-0.ketanl.lab.psi.pnq2.redhat.com <none> <none>*
~~~
~~~
[quickcluster@upi-0 nadtesting]$ oc describe ippools.whereabouts.cni.cncf.io 172.17.20.0-24
Name: 172.17.20.0-24
Namespace: openshift-multus
Labels: <none>
Annotations: <none>
API Version: whereabouts.cni.cncf.io/v1alpha1
Kind: IPPool
Metadata:
Creation Timestamp: 2024-01-29T10:04:14Z
Generation: 3
Resource Version: 2851669
UID: 02da6e7d-683b-4ffb-8984-22ac5fb622e2
Spec:
Allocations:
11:
Id: 3786105b18879187206480002bcc09ab54c355b99046a21463e2e951b900f837
Podref: nadtesting/deployment1-0
12:
Id: 96f5e5e6a35d992011f3f5fcad225e6e18051eceee1636ee37e0390552c56194
Podref: nadtesting/deployment1-1
Range: 172.17.20.0/24
Events: <none>
~~~
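For cross-checking the output above against the reconciler log: the pool name here is the range with `/` replaced by `-`, and allocation key 12 lines up with the address 172.17.20.12 reported for deployment1-1, which suggests the keys are offsets from the network address. A minimal Python sketch under that assumption (IPv4 only; the helper names `ippool_name` and `allocation_ip` are hypothetical and not part of whereabouts):

```python
import ipaddress


def ippool_name(cidr: str) -> str:
    """Return the pool name as it appears in this report: range with '/' -> '-'.

    Assumption: IPv4 ranges only; other address families may be named differently.
    """
    net = ipaddress.ip_network(cidr, strict=True)
    return f"{net.network_address}-{net.prefixlen}"


def allocation_ip(cidr: str, offset: int) -> str:
    """Map an allocation key to an address, assuming the key is an offset
    from the network address of the range (an inference from this report)."""
    net = ipaddress.ip_network(cidr, strict=True)
    return str(net.network_address + offset)


if __name__ == "__main__":
    print(ippool_name("172.17.20.0/24"))
    print(allocation_ip("172.17.20.0/24", 12))
```

With the values from this report, the name matches the pool the reconciler says it cannot find, and key 12 matches the IP listed in the pod's network status.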
~~~
[quickcluster@upi-0 nadtesting]$ oc logs whereabouts-reconciler-xsrbw
2024-01-29T10:03:03Z [debug] Filtering pods with filter key 'spec.nodeName' and filter value 'worker-0.ketanl.lab.psi.pnq2.redhat.com'
2024-01-29T10:03:03Z [verbose] pod controller created
2024-01-29T10:03:03Z [verbose] Starting informer factories ...
2024-01-29T10:03:03Z [verbose] Informer factories started
2024-01-29T10:03:03Z [verbose] starting network controller
2024-01-29T10:06:08Z [verbose] deleted pod [nadtesting/deployment1-1]
2024-01-29T10:06:08Z [verbose] skipped net-attach-def for default network
2024-01-29T10:06:08Z [debug] pod's network status: {Name:nadtesting/macvlan-net-attach1 Interface:net1 IPs:[172.17.20.12] Mac:26:c4:95:37:a4:d8 Default:false DNS:
DeviceInfo:<nil>}
2024-01-29T10:06:08Z [verbose] the NAD's config: {{ "cniVersion": "0.3.1", "type": "macvlan", "master": "br-ex", "mode": "bridge", "ipam":
}}
2024-01-29T10:06:08Z [debug] Used defaults from parsed flat file config @ /host/etc/cni/net.d/whereabouts.d/whereabouts.conf
2024-01-29T10:06:08Z [verbose] result of garbage collecting pods: failed to get the IPPool data: ippool.whereabouts.cni.cncf.io "172.17.20.0-24" not found
2024-01-29T10:06:08Z [verbose] re-queuing IP address reconciliation request for pod nadtesting/deployment1-1; retry #: 0
2024-01-29T10:06:08Z [verbose] skipped net-attach-def for default network
2024-01-29T10:06:08Z [debug] pod's network status: {Name:nadtesting/macvlan-net-attach1 Interface:net1 IPs:[172.17.20.12] Mac:26:c4:95:37:a4:d8 Default:false DNS:
DeviceInfo:<nil>}
2024-01-29T10:06:08Z [verbose] the NAD's config: {{ "cniVersion": "0.3.1", "type": "macvlan", "master": "br-ex", "mode": "bridge", "ipam":
}}
2024-01-29T10:06:08Z [debug] Used defaults from parsed flat file config @ /host/etc/cni/net.d/whereabouts.d/whereabouts.conf
2024-01-29T10:06:08Z [verbose] result of garbage collecting pods: failed to get the IPPool data: ippool.whereabouts.cni.cncf.io "172.17.20.0-24" not found
2024-01-29T10:06:08Z [verbose] re-queuing IP address reconciliation request for pod nadtesting/deployment1-1; retry #: 1
2024-01-29T10:06:08Z [verbose] skipped net-attach-def for default network
2024-01-29T10:06:08Z [debug] pod's network status: {Name:nadtesting/macvlan-net-attach1 Interface:net1 IPs:[172.17.20.12] Mac:26:c4:95:37:a4:d8 Default:false DNS:
DeviceInfo:<nil>}
2024-01-29T10:06:08Z [verbose] the NAD's config: {{ "cniVersion": "0.3.1", "type": "macvlan", "master": "br-ex", "mode": "bridge", "ipam":
}}
2024-01-29T10:06:08Z [debug] Used defaults from parsed flat file config @ /host/etc/cni/net.d/whereabouts.d/whereabouts.conf
2024-01-29T10:06:08Z [verbose] result of garbage collecting pods: failed to get the IPPool data: ippool.whereabouts.cni.cncf.io "172.17.20.0-24" not found
2024-01-29T10:06:08Z [verbose] re-queuing IP address reconciliation request for pod nadtesting/deployment1-1; retry #: 2
2024-01-29T10:06:08Z [verbose] skipped net-attach-def for default network
2024-01-29T10:06:08Z [debug] pod's network status: {Name:nadtesting/macvlan-net-attach1 Interface:net1 IPs:[172.17.20.12] Mac:26:c4:95:37:a4:d8 Default:false DNS:
DeviceInfo:<nil>}
2024-01-29T10:06:08Z [verbose] the NAD's config: {{ "cniVersion": "0.3.1", "type": "macvlan", "master": "br-ex", "mode": "bridge", "ipam":
}}
2024-01-29T10:06:08Z [debug] Used defaults from parsed flat file config @ /host/etc/cni/net.d/whereabouts.d/whereabouts.conf
2024-01-29T10:06:08Z [verbose] result of garbage collecting pods: failed to get the IPPool data: ippool.whereabouts.cni.cncf.io "172.17.20.0-24" not found
2024-01-29T10:06:08Z [error] dropping pod [nadtesting/deployment1-1] deletion out of the queue - could not reconcile IP: failed to get the IPPool data: ippool.whereabouts.cni.cncf.io "172.17.20.0-24" not found
2024-01-29T10:06:08Z [verbose] Event(v1.ObjectReference
): type: 'Warning' reason: 'IPAddressGarbageCollectionFailed' failed to garbage collect addresses for pod nadtesting/deployment1-1
~~~
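To condense a log like the one above, the re-queue and drop lines can be counted per pod. A rough Python sketch, assuming the log line formats shown in this report stay the same; `summarize_reconciler_log` is a hypothetical helper, not a whereabouts tool:

```python
import re

# Patterns copied from the log lines in this report (format is an assumption
# for any other whereabouts version).
RETRY_RE = re.compile(
    r"re-queuing IP address reconciliation request for pod ([^;\s]+); retry #: (\d+)"
)
DROP_RE = re.compile(r"\[error\] dropping pod \[([^\]]+)\] deletion out of the queue")


def summarize_reconciler_log(log_text: str) -> dict:
    """Return {pod: {"retries": int, "dropped": bool}} for a reconciler log."""
    summary = {}
    for line in log_text.splitlines():
        retry = RETRY_RE.search(line)
        if retry:
            entry = summary.setdefault(retry.group(1), {"retries": 0, "dropped": False})
            entry["retries"] += 1
            continue
        drop = DROP_RE.search(line)
        if drop:
            entry = summary.setdefault(drop.group(1), {"retries": 0, "dropped": False})
            entry["dropped"] = True
    return summary
```

Fed the output of `oc logs whereabouts-reconciler-xsrbw` from this report, it would show three re-queues (retry # 0 to 2) followed by a drop for nadtesting/deployment1-1.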
Expected results:
Additional info:
- is duplicated by OCPBUGS-23199 Whereabouts reconciler errors with "IPPool not found" on pod deletion although the IPPool exists (Closed)