Bug
Resolution: Duplicate
4.13.z
Quality / Stability / Reliability
NHE Sprint 234
Description of problem:
When newNodeProxyHealthzServer() returns an error, ovnkube-node panics with SIGSEGV (segmentation violation) while trying to stop the DefaultNodeNetworkController.
Version-Release number of selected component (if applicable):
How reproducible:
Every time an error occurs in the health check. In a DPU-mode cluster, this happens every time on the pods generated by the DPU-Network-operator.
Steps to Reproduce:
1. Create an error in newNodeProxyHealthzServer (e.g. remove the POD_NAME variable, or deploy a DPU-mode cluster)
2. Inspect the logs of the failing ovnkube-node
Actual results:
I0405 20:21:34.671364 2481771 default_node_network_controller.go:123] Enable node proxy healthz server on 0.0.0.0:10256
I0405 20:21:34.671640 2481771 metrics.go:506] Stopping metrics server 127.0.0.1:29103
I0405 20:21:34.671760 2481771 reflector.go:227] Stopping reflector *v1.Pod (0s) from k8s.io/client-go/informers/factory.go:150
I0405 20:21:34.671764 2481771 metrics.go:502] Metrics server has stopped serving at address "127.0.0.1:29103"
I0405 20:21:34.671923 2481771 reflector.go:227] Stopping reflector *v1.Service (0s) from k8s.io/client-go/informers/factory.go:150
I0405 20:21:34.672033 2481771 reflector.go:227] Stopping reflector *v1.EndpointSlice (0s) from k8s.io/client-go/informers/factory.go:150
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x14f1580]
goroutine 1 [running]:
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-controller-manager.(*nodeNetworkControllerManager).Stop(0x4000445800)
/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/network-controller-manager/node_network_controller_manager.go:177 +0x50
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-controller-manager.(*nodeNetworkControllerManager).Start.func1()
/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/network-controller-manager/node_network_controller_manager.go:142 +0x2c
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-controller-manager.(*nodeNetworkControllerManager).Start(0x4000445800, {0x1d745e8, 0x400061fb00})
/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/network-controller-manager/node_network_controller_manager.go:156 +0x2b8
main.runOvnKube({0x1d745e8, 0x400061fb00}, 0x4000137b60, 0x40000b3380, {0x1d731b8, 0x400013bdc0})
/go/src/github.com/openshift/ovn-kubernetes/go-controller/cmd/ovnkube/ovnkube.go:508 +0x988
main.startOvnKube(0x400005bc40, 0x40000924c0)
/go/src/github.com/openshift/ovn-kubernetes/go-controller/cmd/ovnkube/ovnkube.go:302 +0x734
Expected results:
I0405 16:59:13.854176 3365188 network_attach_def_controller.go:167] Shutting down node-network-controller-manager NAD controller
E0405 16:59:13.854211 3365188 ovnkube.go:383] Failed to start ovnkube node network controller manager: could not create node proxy healthz server: found empty env variable POD_NAME
F0405 16:59:13.854389 3365188 ovnkube.go:134] could not create node proxy healthz server: found empty env variable POD_NAME
Additional info:
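The trace suggests that Stop() is called from a cleanup path in Start() before the node network controller has been assigned, so Stop() dereferences a nil pointer. The Go sketch below illustrates that pattern and one possible guard. It is not the actual ovn-kubernetes code: the types manager and nodeController, the function newController, and the nil check are hypothetical stand-ins chosen for illustration, and the real fix may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// nodeController is a hypothetical stand-in for DefaultNodeNetworkController.
type nodeController struct{ stopped bool }

// Stop writes to a field, so calling it on a nil receiver would panic.
func (c *nodeController) Stop() { c.stopped = true }

// manager is a hypothetical stand-in for nodeNetworkControllerManager.
type manager struct {
	controller *nodeController // stays nil when construction fails
}

// newController mimics a constructor that fails when POD_NAME is empty.
func newController(podName string) (*nodeController, error) {
	if podName == "" {
		return nil, errors.New("found empty env variable POD_NAME")
	}
	return &nodeController{}, nil
}

// Start runs Stop as cleanup on failure, similar to the deferred call in the trace.
func (m *manager) Start(podName string) (err error) {
	defer func() {
		if err != nil {
			m.Stop()
		}
	}()
	c, err := newController(podName)
	if err != nil {
		return fmt.Errorf("could not create node proxy healthz server: %w", err)
	}
	m.controller = c
	return nil
}

// Stop guards against a controller that was never created.
// Without this check, the cleanup path above would dereference nil.
func (m *manager) Stop() {
	if m.controller == nil {
		return
	}
	m.controller.Stop()
}

func main() {
	m := &manager{}
	if err := m.Start(""); err != nil {
		fmt.Println("start failed without panic:", err)
	}
}
```

With the guard in place, the failed start surfaces as an ordinary error, which matches the shape of the expected results above; removing the nil check in this sketch reproduces a nil pointer dereference in the cleanup path.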