- Bug
- Resolution: Duplicate
- Undefined
- None
- 4.18
- Critical
- No
- Proposed
- False
Description of problem: The ovnkube pod on a newly scaled node crashes, leaving the node stuck in NOTREADY state
Version-Release number of selected component (if applicable): Cluster bot build
build 4.18, openshift/ovn-kubernetes#2314,openshift/api#1997
How reproducible: Always
Steps to Reproduce:
1. Build a cluster bot build
2. Launch cluster on that build
3. Create a couple of namespaces and NADs inside them
4. Scale a node on the cluster
Actual results: The ovnkube pod on the newly scaled node crashes, leaving the node stuck in NOTREADY state
Expected results: The node should reach READY state with no ovn pods in CrashLoopBackOff (CLBO)
Additional info: Excerpt from `oc logs ovnkube-node -c ovnkube-controller` on the scaled node
I1015 18:35:03.063817 4850 reflector.go:359] Caches populated for *v1.NetworkAttachmentDefinition from github.com/k8snetworkplumbingwg/network-attachment-definition-client/pkg/client/informers/externalversions/factory.go:117
I1015 18:35:03.089307 4850 reflector.go:296] Starting reflector *v1.UserDefinedNetwork (0s) from github.com/openshift/ovn-kubernetes/go-controller/pkg/crd/userdefinednetwork/v1/apis/informers/externalversions/factory.go:140
I1015 18:35:03.089353 4850 reflector.go:332] Listing and watching *v1.UserDefinedNetwork from github.com/openshift/ovn-kubernetes/go-controller/pkg/crd/userdefinednetwork/v1/apis/informers/externalversions/factory.go:140
I1015 18:35:03.091039 4850 reflector.go:359] Caches populated for *v1.UserDefinedNetwork from github.com/openshift/ovn-kubernetes/go-controller/pkg/crd/userdefinednetwork/v1/apis/informers/externalversions/factory.go:140
I1015 18:35:03.161951 4850 controller.go:103] Adding controller [node-network-controller-manager NAD controller] event handlers
I1015 18:35:03.161999 4850 shared_informer.go:313] Waiting for caches to sync for [node-network-controller-manager NAD controller]
I1015 18:35:03.162006 4850 shared_informer.go:320] Caches are synced for [node-network-controller-manager NAD controller]
I1015 18:35:03.162362 4850 controller.go:127] Starting controller [node-network-controller-manager NAD controller] with 1 workers
I1015 18:35:03.163086 4850 network_manager.go:215] [node-network-controller-manager network manager]: syncing all networks
I1015 18:35:03.163179 4850 network_manager.go:153] [node-network-controller-manager network manager]: finished syncing network l3-network-e2e-test-networking-udn-zmbjr, took 11.612µs
I1015 18:35:03.163258 4850 network_attach_def_controller.go:149] [node-network-controller-manager NAD controller]: shutting down
I1015 18:35:03.163725 4850 network_attach_def_controller.go:183] [node-network-controller-manager NAD controller]: finished syncing NAD e2e-test-networking-udn-s28lk/l3-network-e2e-test-networking-udn-s28lk, took 1.260204ms
I1015 18:35:03.164667 4850 network_attach_def_controller.go:183] [node-network-controller-manager NAD controller]: finished syncing NAD e2e-test-networking-udn-zmbjr/l3-network-e2e-test-networking-udn-zmbjr, took 908.775µs
panic: close of closed channel

goroutine 118 [running]:
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/controller.(*controller[...]).stop(0x29f4740)
    /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/controller/controller.go:142 +0x25
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/controller.Stop(...)
    /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/controller/controller.go:313
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-attach-def-controller.(*networkManagerImpl).Stop(0xc0008606c0)
    /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/network-attach-def-controller/network_manager.go:81 +0x4f
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-attach-def-controller.(*NetAttachDefinitionController).Stop(0xc0004f23f0)
    /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/network-attach-def-controller/network_attach_def_controller.go:151 +0x153
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-controller-manager.(*nodeNetworkControllerManager).Stop(0xc0004f2360)
    /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/network-controller-manager/node_network_controller_manager.go:246 +0x4a
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-controller-manager.(*nodeNetworkControllerManager).Start.func1()
    /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/network-controller-manager/node_network_controller_manager.go:175 +0x25
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-controller-manager.(*nodeNetworkControllerManager).Start(0xc0004f2360, {0x29d4ea8, 0xc00023dae0})
    /go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/network-controller-manager/node_network_controller_manager.go:197 +0x344
main.runOvnKube.func4()
    /go/src/github.com/openshift/ovn-kubernetes/go-controller/cmd/ovnkube/ovnkube.go:560 +0x2ea
created by main.runOvnKube in goroutine 1
    /go/src/github.com/openshift/ovn-kubernetes/go-controller/cmd/ovnkube/ovnkube.go:536 +0x58f
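For context on the panic itself: in Go, `close` on an already-closed channel always panics, so the trace is consistent with a stop path being reached more than once for the same controller. The sketch below is not the ovn-kubernetes code and not necessarily how this was fixed; `worker`, `newWorker`, `Stop`, and `Stopped` are hypothetical names used only to illustrate one common way of making a stop call idempotent with `sync.Once`.

```go
package main

import (
	"fmt"
	"sync"
)

// worker is a hypothetical stand-in for a controller that owns a stop channel.
type worker struct {
	stopCh   chan struct{}
	stopOnce sync.Once
}

func newWorker() *worker {
	return &worker{stopCh: make(chan struct{})}
}

// Stop closes stopCh at most once, so repeated or concurrent calls are safe.
func (w *worker) Stop() {
	w.stopOnce.Do(func() { close(w.stopCh) })
}

// Stopped reports whether Stop has been called at least once.
func (w *worker) Stopped() bool {
	select {
	case <-w.stopCh:
		return true
	default:
		return false
	}
}

func main() {
	w := newWorker()
	w.Stop()
	w.Stop() // a bare close(w.stopCh) here would panic: close of closed channel
	fmt.Println("stopped:", w.Stopped())
}
```

Whether a guard like this is appropriate depends on why the stop path ran twice; if the second call points to a start/stop ordering problem, fixing the ordering may be preferable to masking it.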