Task
Resolution: Unresolved
rhel-9
OVN FDP Sprint 13
This ticket is tracking the QE verification effort for the solution to the problem described below.
Problem Description: Clearly explain the issue.
With ovn25.09-25.09.1-11.el9fdp, ovn-northd hits an assertion failure:
2025-12-01T12:22:35.784Z|00109|backtrace|ERR|lib/vlog.c:1309 backtrace: ovn-northd(+0xd3d37) [0x55bfabb7cd37] ovn-northd(+0xc07c3) [0x55bfabb697c3] ovn-northd(+0xb487b) [0x55bfabb5d87b] ovn-northd(+0x705af) [0x55bfabb195af] ovn-northd(+0x70827) [0x55bfabb19827] ovn-northd(+0x709cb) [0x55bfabb199cb] ovn-northd(+0x8792d) [0x55bfabb3092d] ovn-northd(+0x2012f) [0x55bfabac912f] /lib64/libc.so.6(+0x29590) [0x7f070f07f590] /lib64/libc.so.6(__libc_start_main+0x80) [0x7f070f07f640] ovn-northd(+0x21035) [0x55bfabaca035]
Most likely here:
static void
handle_od_lbgrp_changes(struct nbrec_load_balancer_group **nbrec_lbgrps,
                        size_t n_nbrec_lbgrps, struct od_lb_data *od_lb_data,
                        struct ed_type_lb_data *lb_data,
                        struct crupdated_od_lb_data *codlb)
{
    struct tracked_lb_data *trk_lb_data = &lb_data->tracked_lb_data;
    struct uuidset *pre_lbgrp_uuids = od_lb_data->lbgrps;

    od_lb_data->lbgrps = xzalloc(sizeof *od_lb_data->lbgrps);
    uuidset_init(od_lb_data->lbgrps);

    for (size_t i = 0; i < n_nbrec_lbgrps; i++) {
        const struct uuid *lbgrp_uuid = &nbrec_lbgrps[i]->header_.uuid;
        uuidset_insert(od_lb_data->lbgrps, lbgrp_uuid);

        if (!uuidset_find_and_delete(pre_lbgrp_uuids, lbgrp_uuid)) {
            /* Add this lb group to the tracked data. */
            uuidset_insert(&codlb->assoc_lbgrps, lbgrp_uuid);

            if (!trk_lb_data->has_routable_lb) {
                struct ovn_lb_group *lbgrp =
                    ovn_lb_group_find(&lb_data->lbgrps, lbgrp_uuid);
                ovs_assert(lbgrp);   <<<<<<<<<<<<<<<<< assertion that fails
                trk_lb_data->has_routable_lb |= lbgrp->has_routable_lb;
            }
        }
    }

    if (!uuidset_is_empty(pre_lbgrp_uuids)) {
        trk_lb_data->has_dissassoc_lbgrps_from_od = true;
    }

    uuidset_destroy(pre_lbgrp_uuids);
    free(pre_lbgrp_uuids);
}
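For illustration only, the following is a minimal, self-contained C sketch of the suspected failure mode, not the OVN code itself: a datapath-change handler asserts that every load balancer group it references can be found in a global registry, and the assertion aborts when the registry update was missed or processed out of order. All names (register_group, find_group, handle_datapath_group_refs) are hypothetical stand-ins for lb_data->lbgrps, ovn_lb_group_find(), and the loop in handle_od_lbgrp_changes() above.

/* Hedged sketch of the failure pattern; assumes the crash is caused by a
 * group reference whose creation never reached the tracked lbgrps map. */
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_GROUPS 16

/* Hypothetical stand-in for the tracked lb_data->lbgrps map. */
static const char *registered_groups[MAX_GROUPS];
static size_t n_registered;

static void
register_group(const char *uuid)
{
    assert(n_registered < MAX_GROUPS);
    registered_groups[n_registered++] = uuid;
}

/* Stand-in for ovn_lb_group_find(): returns NULL when the group is unknown. */
static const char *
find_group(const char *uuid)
{
    for (size_t i = 0; i < n_registered; i++) {
        if (!strcmp(registered_groups[i], uuid)) {
            return registered_groups[i];
        }
    }
    return NULL;
}

/* Stand-in for the loop in handle_od_lbgrp_changes(): every group newly
 * associated with a datapath is expected to already be in the registry. */
static void
handle_datapath_group_refs(const char **refs, size_t n_refs)
{
    for (size_t i = 0; i < n_refs; i++) {
        const char *grp = find_group(refs[i]);
        assert(grp);            /* analogous to ovs_assert(lbgrp) aborting */
        printf("datapath references known group %s\n", grp);
    }
}

int
main(void)
{
    register_group("lbgrp-a");

    /* Consistent state: the referenced group is known. */
    const char *ok_refs[] = { "lbgrp-a" };
    handle_datapath_group_refs(ok_refs, 1);

    /* Inconsistent state: the datapath references a group whose creation
     * was never folded into the registry -> assertion failure / abort. */
    const char *bad_refs[] = { "lbgrp-b" };
    handle_datapath_group_refs(bad_refs, 1);

    return 0;
}

If this reading is right, a reproducer would need to get a logical switch/router to reference a load balancer group in the same incremental-processing run in which that group is not (yet) present in lb_data->lbgrps; confirming that ordering is presumably what the reproducer work below is chasing.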
Impact Assessment: Describe the severity and impact (e.g., network down, availability of a workaround, etc.).
ovn-northd crashes with an assertion failure.
Software Versions: Specify the exact versions in use (e.g., openvswitch3.1-3.1.0-147.el8fdp).
ovn25.09-25.09.1-11.el9fdp
Issue Type: Indicate whether this is a new issue or a regression (if a regression, state the last known working version).
Unknown
Reproducibility: Confirm if the issue can be reproduced consistently. If not, describe how often it occurs.
Not consistently; it occurs intermittently in OCP CI, e.g.: https://github.com/openshift/ovn-kubernetes/pull/2881
Reproduction Steps: Provide detailed steps or scripts to replicate the issue.
Unknown; working on a reproducer.
Expected Behavior: Describe what should happen under normal circumstances.
ovn-northd should not crash.
Observed Behavior: Explain what actually happens.
ovn-northd aborts with the assertion failure and backtrace shown above.
Troubleshooting Actions: Outline the steps taken to diagnose or resolve the issue so far.
Logs: If you collected logs please provide them (e.g., sos report, /var/log/openvswitch/*, testpmd console)
Must-gather (the NB/SB DBs were unfortunately compacted, so they are not very useful):
https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/pr-logs/pull/openshift_ovn-kubernetes/2881/pull-ci-openshift-ovn-kubernetes-master-e2e-metal-ipi-ovn-dualstack-bgp-local-gw/1995423294387392512/artifacts/e2e-metal-ipi-ovn-dualstack-bgp-local-gw/gather-must-gather/artifacts/must-gather.tar