After connecting to a freshly created local database, ovn-controller crashes.
# cat reproducer.sh
#!/bin/bash

set -x

DIR=/tmp/test-dir

cleanup () {
    if test -f ${DIR}/ovsdb-server.pid; then
        kill $(cat ${DIR}/ovsdb-server.pid) || true
    fi
}

trap cleanup 0 1 2 3 13 14 15

rm -rf ${DIR}
mkdir ${DIR}

export OVS_RUNDIR=${DIR}
export OVS_LOGDIR=${DIR}
export OVN_RUNDIR=${DIR}
export OVN_LOGDIR=${DIR}

ovsdb-tool create ${DIR}/conf.db /usr/share/openvswitch/vswitch.ovsschema

ovsdb-server --detach --no-chdir --pidfile --log-file \
    -vconsole:off -vsyslog:off \
    --remote=punix:${DIR}/db.sock ${DIR}/conf.db

#ovs-vsctl --db=unix:${DIR}/db.sock --no-wait init

gdb --args ovn-controller --no-chdir --log-file \
    -vsyslog:off -vconsole:info \
    unix:${DIR}/db.sock <<< "
run
backtrace
frame function main
echo print cfg\n
print cfg
quit
y
"
Result:
(gdb) Starting program: /usr/bin/ovn-controller --no-chdir --log-file -vsyslog:off -vconsole:info unix:/tmp/test-dir/db.sock
2024-07-23T19:59:38Z|00001|vlog|INFO|opened log file /tmp/test-dir/ovn-controller.log
[New Thread 0x7ffff6f676c0 (LWP 2947)]
[New Thread 0x7ffff5f656c0 (LWP 2949)]
[New Thread 0x7ffff67666c0 (LWP 2948)]
2024-07-23T19:59:38Z|00002|reconnect|INFO|unix:/tmp/test-dir/db.sock: connecting...
2024-07-23T19:59:38Z|00003|reconnect|INFO|unix:/tmp/test-dir/db.sock: connected
[New Thread 0x7ffff57236c0 (LWP 2950)]
2024-07-23T19:59:38Z|00004|main|INFO|OVN internal version is : [24.03.90-20.34.0-73.6]
2024-07-23T19:59:38Z|00005|main|INFO|OVS IDL reconnected, force recompute.
2024-07-23T19:59:38Z|00006|main|INFO|OVNSB IDL reconnected, force recompute.
2024-07-23T19:59:38Z|00007|chassis|WARN|'system-id' in Open_vSwitch database is missing.

Thread 1 "ovn-controller" received signal SIGSEGV, Segmentation fault.
shash_find__ (sh=0x180, name=0x5555556903cb "vlan-limit", name_len=10, hash=777389702) at ovs-bf1b16364b3f01b0ff5f2f6e76842e666226a17b/lib/shash.c:225
225         HMAP_FOR_EACH_WITH_HASH (node, node, hash, &sh->map) {
(gdb) #0  shash_find__ (sh=0x180, name=0x5555556903cb "vlan-limit", name_len=10, hash=777389702) at ovs-bf1b16364b3f01b0ff5f2f6e76842e666226a17b/lib/shash.c:225
#1  0x00005555556328a1 in smap_get_node (smap=0x180, key=0x5555556903cb "vlan-limit") at ovs-bf1b16364b3f01b0ff5f2f6e76842e666226a17b/lib/smap.c:217
#2  smap_get_def (smap=0x180, key=0x5555556903cb "vlan-limit", def=0x0) at ovs-bf1b16364b3f01b0ff5f2f6e76842e666226a17b/lib/smap.c:208
#3  smap_get (smap=0x180, key=0x5555556903cb "vlan-limit") at ovs-bf1b16364b3f01b0ff5f2f6e76842e666226a17b/lib/smap.c:200
#4  smap_get_int (smap=0x180, key=0x5555556903cb "vlan-limit", def=-1) at ovs-bf1b16364b3f01b0ff5f2f6e76842e666226a17b/lib/smap.c:240
#5  0x0000555555560003 in main (argc=<optimized out>, argv=<optimized out>) at controller/ovn-controller.c:5430
(gdb) #5  0x0000555555560003 in main (argc=<optimized out>, argv=<optimized out>) at controller/ovn-controller.c:5430
5430        int vlan_limit = smap_get_int(
(gdb) print cfg
(gdb) $1 = (const struct ovsrec_open_vswitch *) 0x0
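The gdb session shows that cfg is NULL at controller/ovn-controller.c:5430, so smap_get_int() receives the address of a member of a NULL struct (smap=0x180) and dereferences it.
Uncommenting the ovs-vsctl init line in the reproducer script avoids the crash.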
This should be rare, since local databases are usually not empty, but it can happen, and crashing is never correct behavior.
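
One possible shape for a fix is to check cfg before reading from it and fall back to the default when the row is missing. The sketch below is not the actual ovn-controller code: the toy_* types and functions and get_vlan_limit() are hypothetical stand-ins for the OVS structures, and the other_config field name is an assumption. It only illustrates the NULL guard.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct smap: a single key/value pair is
 * enough to illustrate the guard. */
struct toy_smap {
    const char *key;
    const char *value;
};

/* Hypothetical stand-in for struct ovsrec_open_vswitch.  The field name
 * 'other_config' is an assumption made for this sketch. */
struct toy_open_vswitch {
    struct toy_smap other_config;
};

/* Returns the integer stored under 'key', or 'def' if 'key' is absent. */
static int
toy_smap_get_int(const struct toy_smap *smap, const char *key, int def)
{
    if (smap->key && smap->value && !strcmp(smap->key, key)) {
        return atoi(smap->value);
    }
    return def;
}

/* Guarded lookup: returns 'def' when the database has no row yet
 * (cfg == NULL) instead of dereferencing a NULL pointer. */
static int
get_vlan_limit(const struct toy_open_vswitch *cfg, int def)
{
    if (!cfg) {
        return def;
    }
    return toy_smap_get_int(&cfg->other_config, "vlan-limit", def);
}
```

With this guard, get_vlan_limit(NULL, -1) returns the default of -1, the same value the smap_get_int() call in the backtrace passes as def, so an empty database would be handled like a database without the vlan-limit key.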