OpenShift Bugs / OCPBUGS-60992

nmstate-handler in CrashLoopBackOff - failed to create listener: listen tcp :8089: bind: address already in use

      Description of problem:

    The nmstate-handler pod fails to start on one node, with the following errors logged:
      {"level":"info","ts":"2025-08-28T08:30:22.253Z","msg":"Shutdown signal received, waiting for all workers to finish","controller":"NodeNetworkState"}
      {"level":"info","ts":"2025-08-28T08:30:22.253Z","msg":"Shutdown signal received, waiting for all workers to finish","controller":"nodenetworkconfigurationenactment","controllerGroup":"nmstate.io","controllerKind":"NodeNetworkConfigurationEnactment"}
      {"level":"info","ts":"2025-08-28T08:30:22.253Z","msg":"All workers finished","controller":"nodenetworkconfigurationenactment","controllerGroup":"nmstate.io","controllerKind":"NodeNetworkConfigurationEnactment"}
      {"level":"info","ts":"2025-08-28T08:30:22.253Z","msg":"All workers finished","controller":"NodeNetworkConfigurationPolicy"}
      {"level":"info","ts":"2025-08-28T08:30:22.253Z","msg":"All workers finished","controller":"NodeNetworkState"}
      {"level":"info","ts":"2025-08-28T08:30:22.253Z","msg":"Stopping and waiting for caches"}
      {"level":"error","ts":"2025-08-28T08:30:22.255Z","logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"Timeout: failed waiting for *v1.NodeNetworkConfigurationPolicy Informer to sync","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1.1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:76\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:53\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:54\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33\nsigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:64"}
      {"level":"error","ts":"2025-08-28T08:30:22.255Z","logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"Timeout: failed waiting for *v1beta1.NodeNetworkConfigurationEnactment Informer to sync","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1.1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:76\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:53\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:54\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33\nsigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:64"}
      {"level":"error","ts":"2025-08-28T08:30:22.255Z","logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"Timeout: failed waiting for *v1.Node Informer to sync","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1.1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:76\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:53\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:54\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33\nsigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:64"}
      {"level":"error","ts":"2025-08-28T08:30:22.255Z","logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"Timeout: failed waiting for *v1.Node Informer to sync","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1.1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:76\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:53\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:54\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33\nsigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:64"}
      {"level":"error","ts":"2025-08-28T08:30:22.255Z","logger":"controller-runtime.source.EventHandler","msg":"failed to get informer from cache","error":"Timeout: failed waiting for *v1beta1.NodeNetworkState Informer to sync","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1.1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:76\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:53\nk8s.io/apimachinery/pkg/util/wait.loopConditionUntilContext\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/loop.go:54\nk8s.io/apimachinery/pkg/util/wait.PollUntilContextCancel\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/k8s.io/apimachinery/pkg/util/wait/poll.go:33\nsigs.k8s.io/controller-runtime/pkg/internal/source.(*Kind[...]).Start.func1\n\t/go/src/github.com/openshift/kubernetes-nmstate/vendor/sigs.k8s.io/controller-runtime/pkg/internal/source/kind.go:64"}
      {"level":"info","ts":"2025-08-28T08:30:22.256Z","msg":"Stopping and waiting for webhooks"}
      {"level":"info","ts":"2025-08-28T08:30:22.256Z","msg":"Stopping and waiting for HTTP servers"}
      {"level":"info","ts":"2025-08-28T08:30:22.256Z","msg":"Wait completed, proceeding to shutdown the manager"}
      {"level":"error","ts":"2025-08-28T08:30:22.256Z","logger":"setup","msg":"problem running manager","error":"failed to start metrics server: failed to create listener: listen tcp :8089: bind: address already in use","stacktrace":"main.mainHandler\n\t/go/src/github.com/openshift/kubernetes-nmstate/cmd/handler/main.go:180\nmain.main\n\t/go/src/github.com/openshift/kubernetes-nmstate/cmd/handler/main.go:89\nruntime.main\n\t/usr/lib/golang/src/runtime/proc.go:283"}
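The final error is an ordinary TCP bind conflict: the controller-runtime metrics server cannot bind :8089 because another process already holds that port. A minimal Go sketch (not the operator's code) reproduces the same failure mode:

```go
package main

import (
	"fmt"
	"net"
)

// tryBindTwice binds an ephemeral TCP port, then attempts a second bind on
// the exact same address; the second Listen fails the same way the metrics
// server does here ("bind: address already in use").
func tryBindTwice() error {
	first, err := net.Listen("tcp", "127.0.0.1:0") // kernel picks a free port
	if err != nil {
		return err
	}
	defer first.Close()

	second, err := net.Listen("tcp", first.Addr().String()) // port is now taken
	if err == nil {
		second.Close()
	}
	return err
}

func main() {
	fmt.Println(tryBindTwice())
}
```

The listener itself is not at fault; whichever process binds the port first wins, so the outcome depends on startup ordering on the node.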
      

      Version-Release number of selected component (if applicable):

      kubernetes-nmstate-operator.4.20.0-202508070512
      OCP 4.20.0-ec.6

      How reproducible:

    100% so far during cluster deployment

      Steps to Reproduce:

    1. Deploy and configure a bare-metal dual-stack cluster with the GitOps ZTP method, which has policies to configure networking on the cluster nodes

      Actual results:

    The policy responsible for configuring networking fails because the nmstate-handler pod fails to start

      Expected results:

          nmstate-handler pods are running on all relevant nodes

      Additional info:

    On the node where the pod fails to run, port 8089 is already in use by an 'ironic' process:
      
      sudo ss -anp | grep 8089
      u_str ESTAB   0  0                    * 848089    * 848088  users:(("ovn-controller",pid=89848,fd=23))
      u_str ESTAB   0  0                    * 848088    * 848089  users:(("ovn-controller",pid=89848,fd=22))
      u_str ESTAB   0  0 /run/systemd/private 808929    * 787028  users:(("systemd",pid=1,fd=18))
      u_str ESTAB   0  0                    * 787028    * 808929  users:(("ovnkube",pid=90273,fd=16))
      tcp   LISTEN  0  5              [::1]:8089     [::]:*       users:(("ironic",pid=25213,fd=6))
      

              bnemec@redhat.com Benjamin Nemec
              yprokule@redhat.com Yurii Prokulevych
              Ross Brattain