OpenShift Bugs / OCPBUGS-10953

ovnkube-node does not close up correctly


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • None
    • 4.13.0
    • None
    • Moderate
    • No
    • SDN Sprint 234
    • 1
    • Rejected
    • False

      This is a clone of issue OCPBUGS-10889. The following is the description of the original issue:

      When there is an error initializing the healthz server, ovnkube-node panics and the process hangs instead of shutting down.

      This is due to an invalid interface reference to the network controller, which is initialized to a nil implementation instead of a nil interface. The interface value is therefore non-nil, and the panic happens when Stop() attempts to use it. As the panic unwinds, the deferred calls of that goroutine run, including a wait on the metrics server wait group. That wait blocks forever because the main context has not been canceled.
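The difference between a nil implementation and a nil interface can be illustrated with a minimal sketch. The names `controller`, `nodeController`, `newTypedNil`, and `stopPanics` are hypothetical and are not taken from the ovn-kubernetes code. The sketch only assumes that an interface holding a nil pointer compares as non-nil, so a nil check does not prevent the call.

```go
package main

import "fmt"

// controller is a hypothetical stand-in for the network controller interface.
type controller interface {
	Stop()
}

// nodeController is a hypothetical implementation. Stop reads a field,
// so calling it on a nil receiver dereferences a nil pointer.
type nodeController struct {
	name string
}

func (c *nodeController) Stop() {
	fmt.Println("stopping", c.name) // panics when c is nil
}

// newTypedNil returns an interface value that wraps a nil *nodeController.
func newTypedNil() controller {
	var impl *nodeController // nil implementation
	return impl              // the interface carries a type, so it is != nil
}

// stopPanics calls Stop behind a nil check and reports whether it panicked.
func stopPanics(c controller) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	if c != nil { // a typed nil slips past this guard
		c.Stop()
	}
	return false
}

func main() {
	var nilIface controller // a true nil interface
	typedNil := newTypedNil()

	fmt.Println(nilIface == nil)      // true
	fmt.Println(typedNil == nil)      // false
	fmt.Println(stopPanics(nilIface)) // false
	fmt.Println(stopPanics(typedNil)) // true
}
```

Returning a plain `nil` from a constructor whose return type is the interface, rather than a nil pointer of the concrete type, is one common way to keep such a guard meaningful.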

      This is a stack trace showcasing the problem:

      runtime.gopark(proc.go:364)
      runtime.goparkunlock(proc.go:369)
      runtime.semacquire1(sema.go:150)
      sync.runtime_Semacquire(sema.go:62)
      sync.(*WaitGroup).Wait(waitgroup.go:139)
      main.startOvnKube.func5(ovnkube.go:293)
      runtime.gopanic(panic.go:890)
      runtime.panicmem(panic.go:260)
      runtime.sigpanic(signal_unix.go:835)
      github.com/ovn-org/ovn-kubernetes/go-controller/pkg/node.(*DefaultNodeNetworkController).Stop(default_node_network_controller.go:785)
      github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-controller-manager.(*nodeNetworkControllerManager).Stop(node_network_controller_manager.go:163)
      github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-controller-manager.(*nodeNetworkControllerManager).Start.func1(node_network_controller_manager.go:129)
      github.com/ovn-org/ovn-kubernetes/go-controller/pkg/network-controller-manager.(*nodeNetworkControllerManager).Start(node_network_controller_manager.go:142)
      main.runOvnKube(ovnkube.go:504)
      main.startOvnKube(ovnkube.go:297)
      main.main.func1(ovnkube.go:112)
      github.com/urfave/cli/v2.(*App).RunContext(app.go:315)
      main.main(ovnkube.go:136)
      runtime.main(proc.go:250)
      runtime.goexit(asm_amd64.s:1594)
      runtime.newproc(<autogenerated>:1)
      

            jcaamano@redhat.com Jaime Caamaño Ruiz
            openshift-crt-jira-prow OpenShift Prow Bot
            Anurag Saxena Anurag Saxena