OpenShift Bugs / OCPBUGS-63695

Panic in dpu-host mode during startup

    • Bug
    • Resolution: Unresolved
    • 4.20.0
    • Quality / Stability / Reliability
    • In Progress
    • Release Note Not Required

      What happened?

      #5373 causes a panic in dpu-host mode during startup: it tries to remove the flows that drop GARP, which is not possible in that mode.

      What did you expect to happen?

      ovnkube should start in dpu-host mode without panicking.

      How can we reproduce it (as minimally and precisely as possible)?

      Deploy the ovnkube-node-dpu-host daemonset; it panics during startup.

      Anything else we need to know?

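      Stack trace below. The panic is a nil pointer dereference raised in (*addressManager).ListAddresses, called from (*gateway).Reconcile immediately after "Removing flows to drop GARP" is logged: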

      2025-09-04T08:34:09.798843553-07:00 stderr F I0904 15:34:09.798785 1377054 ovs.go:160] Exec(1): /usr/bin/ovs-vsctl --timeout=15 --if-exists del-br br-ext
      2025-09-04T08:34:09.803443835-07:00 stderr F I0904 15:34:09.803392 1377054 ovs.go:163] Exec(1): stdout: ""
      2025-09-04T08:34:09.803455425-07:00 stderr F I0904 15:34:09.803406 1377054 ovs.go:164] Exec(1): stderr: "ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)\n"
      2025-09-04T08:34:09.803458904-07:00 stderr F I0904 15:34:09.803420 1377054 ovs.go:166] Exec(1): err: exit status 1
      2025-09-04T08:34:09.803466794-07:00 stderr F E0904 15:34:09.803432 1377054 default_node_network_controller.go:1272] Deletion of bridge br-ext failed: exit status 1 (ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
      2025-09-04T08:34:09.803469724-07:00 stderr F )
      2025-09-04T08:34:09.803472504-07:00 stderr F I0904 15:34:09.803444 1377054 ovs.go:160] Exec(2): /usr/bin/ovs-vsctl --timeout=15 --if-exists del-port br-int int
      2025-09-04T08:34:09.80727876-07:00 stderr F I0904 15:34:09.807196 1377054 ovs.go:163] Exec(2): stdout: ""
      2025-09-04T08:34:09.80729692-07:00 stderr F I0904 15:34:09.807215 1377054 ovs.go:164] Exec(2): stderr: "ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)\n"
      2025-09-04T08:34:09.807302339-07:00 stderr F I0904 15:34:09.807225 1377054 ovs.go:166] Exec(2): err: exit status 1
      2025-09-04T08:34:09.807306389-07:00 stderr F E0904 15:34:09.807238 1377054 default_node_network_controller.go:1276] Deletion of port int on br-int failed: exit status 1 (ovs-vsctl: unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
      2025-09-04T08:34:09.807310439-07:00 stderr F )
      2025-09-04T08:34:09.828248385-07:00 stderr F I0904 15:34:09.828131 1377054 default_node_network_controller.go:1372] Egress IP for secondary host network is disabled
      2025-09-04T08:34:09.828266885-07:00 stderr F I0904 15:34:09.828158 1377054 link_network_manager.go:116] Link manager is running
      2025-09-04T08:34:09.828272175-07:00 stderr F I0904 15:34:09.828169 1377054 default_node_network_controller.go:1383] Default node network controller initialized and ready.
      2025-09-04T08:34:09.828276155-07:00 stderr F I0904 15:34:09.828178 1377054 node_controller_manager.go:243] Removing flows to drop GARP
      2025-09-04T08:34:09.828280314-07:00 stderr F I0904 15:34:09.828185 1377054 gateway.go:497] Reconciling gateway with updates
      2025-09-04T08:34:09.83148651-07:00 stderr F panic: runtime error: invalid memory address or nil pointer dereference
      2025-09-04T08:34:09.83150425-07:00 stderr F [signal SIGSEGV: segmentation violation code=0x1 addr=0x80 pc=0x2454360]
      2025-09-04T08:34:09.8315093-07:00 stderr F
      2025-09-04T08:34:09.83151396-07:00 stderr F goroutine 12621 [running]:
      2025-09-04T08:34:09.83151799-07:00 stderr F github.com/ovn-org/ovn-kubernetes/go-controller/pkg/node.(*addressManager).ListAddresses(0x4e7d0e0?)
      2025-09-04T08:34:09.831522-07:00 stderr F /builds/sdn/upstream_cicd/ovn-kubernetes/go-controller/pkg/node/node_ip_handler_linux.go:113 +0x60
      2025-09-04T08:34:09.83152595-07:00 stderr F github.com/ovn-org/ovn-kubernetes/go-controller/pkg/node.(*gateway).Reconcile(0xc000acebe0)
      2025-09-04T08:34:09.83152969-07:00 stderr F /builds/sdn/upstream_cicd/ovn-kubernetes/go-controller/pkg/node/gateway.go:498 +0x85
      2025-09-04T08:34:09.831533779-07:00 stderr F github.com/ovn-org/ovn-kubernetes/go-controller/pkg/controllermanager.(*NodeControllerManager).Start(0xc00647c000, {0x3337df0, 0xc000ada6e0}, 0x0)
      2025-09-04T08:34:09.831538189-07:00 stderr F /builds/sdn/upstream_cicd/ovn-kubernetes/go-controller/pkg/controllermanager/node_controller_manager.go:245 +0x7fa
      2025-09-04T08:34:09.831544739-07:00 stderr F main.runOvnKube.func4()
      2025-09-04T08:34:09.831550589-07:00 stderr F /builds/sdn/upstream_cicd/ovn-kubernetes/go-controller/cmd/ovnkube/ovnkube.go:584 +0x465
      2025-09-04T08:34:09.831567329-07:00 stderr F created by main.runOvnKube in goroutine 1
      2025-09-04T08:34:09.831572739-07:00 stderr F /builds/sdn/upstream_cicd/ovn-kubernetes/go-controller/cmd/ovnkube/ovnkube.go:549 +0x695
      2025-09-04T08:34:10.359721194-07:00 stdout F info: Waiting for process_ready ovnkube to come up, waiting 5s ...
      2025-09-04T08:34:15.402480539-07:00 stdout F info: Waiting for process_ready ovnkube to come up, waiting 5s ...
      2025-09-04T08:34:20.446151697-07:00 stdout F error: process_ready ovnkube did not come up, exiting
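
To illustrate the failure mode in the trace above, here is a minimal Go sketch of a method being called through a nil pointer, plus one possible guard. It is not the ovn-kubernetes code and not necessarily the fix chosen for this bug: the struct fields, the `nodeIPManager` field name, and the assumption that the address manager is simply unset in dpu-host mode are all hypothetical and are inferred only from the function names in the trace.

```go
package main

import (
	"fmt"
	"sync"
)

// addressManager is a simplified, hypothetical stand-in for the type named in
// the stack trace. The real struct has different fields.
type addressManager struct {
	mu        sync.Mutex
	addresses []string
}

// ListAddresses dereferences its receiver, so calling it on a nil
// *addressManager panics with a nil pointer dereference.
func (c *addressManager) ListAddresses() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	// Return a copy so callers cannot mutate the internal slice.
	return append([]string(nil), c.addresses...)
}

// gateway is likewise a hypothetical stand-in. In this sketch, a gateway
// running in dpu-host mode is modeled as having no address manager.
type gateway struct {
	nodeIPManager *addressManager // assumed field name, for illustration only
}

// Reconcile skips the address-based work when no address manager is set,
// instead of calling a method through a nil pointer.
func (g *gateway) Reconcile() error {
	if g.nodeIPManager == nil {
		// Assumption: there is nothing to reconcile in this mode.
		return nil
	}
	addrs := g.nodeIPManager.ListAddresses()
	fmt.Printf("reconciling gateway with %d address(es)\n", len(addrs))
	return nil
}

func main() {
	// Gateway without an address manager: Reconcile returns without panicking.
	dpuHost := &gateway{}
	if err := dpuHost.Reconcile(); err != nil {
		fmt.Println("reconcile failed:", err)
	}

	// Gateway with an address manager: Reconcile lists the addresses.
	full := &gateway{nodeIPManager: &addressManager{addresses: []string{"192.0.2.10"}}}
	if err := full.Reconcile(); err != nil {
		fmt.Println("reconcile failed:", err)
	}
}
```

An alternative with the same effect would be for the caller to skip `Reconcile` entirely when running in dpu-host mode; which of the two is appropriate depends on code that is not shown in this report.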

              Igal Tsoiref (itsoiref@redhat.com)
              Anurag Saxena