$ ./reproducer-nm-iperf.sh

For 5-6 seconds packets are not flowing, and we get this state:

45: veth0@if44: mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff link-netns ns1 promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state BACKUP mii_status UP link_failure_count 0 perm_hwaddr ba:67:53:34:45:37 queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 13 ad_actor_oper_port_state_str ad_partner_oper_port_state 61 ad_partner_oper_port_state_str numtxqueues 2 numrxqueues 2 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
47: veth1@if46: mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff link-netns ns1 promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state BACKUP mii_status UP link_failure_count 0 perm_hwaddr 36:99:ba:b2:0f:ba queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 13 ad_actor_oper_port_state_str ad_partner_oper_port_state 61 ad_partner_oper_port_state_str numtxqueues 2 numrxqueues 2 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
48: bond0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    bond mode 802.3ad miimon 100 updelay 0 downdelay 0 peer_notify_delay 0 use_carrier 1 arp_interval 0 arp_missed_max 2 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_active on lacp_rate slow ad_select stable ad_aggregator 1 ad_num_ports 2 ad_actor_key 15 ad_partner_key 15 ad_partner_mac ca:3c:b5:18:84:a4 ad_actor_sys_prio 65535 ad_user_port_key 0 ad_actor_system 00:00:00:00:00:00 tlb_dynamic_lb 1 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536

Then, packets start flowing and we get this state:

45: veth0@if44: mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff link-netns ns1 promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr ba:67:53:34:45:37 queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 61 ad_actor_oper_port_state_str ad_partner_oper_port_state 61 ad_partner_oper_port_state_str numtxqueues 2 numrxqueues 2 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
47: veth1@if46: mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff link-netns ns1 promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 36:99:ba:b2:0f:ba queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 61 ad_actor_oper_port_state_str ad_partner_oper_port_state 61 ad_partner_oper_port_state_str numtxqueues 2 numrxqueues 2 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
48: bond0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    bond mode 802.3ad miimon 100 updelay 0 downdelay 0 peer_notify_delay 0 use_carrier 1 arp_interval 0 arp_missed_max 2 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_active on lacp_rate slow ad_select stable ad_aggregator 1 ad_num_ports 2 ad_actor_key 15 ad_partner_key 15 ad_partner_mac ca:3c:b5:18:84:a4 ad_actor_sys_prio 65535 ad_user_port_key 0 ad_actor_system 00:00:00:00:00:00 tlb_dynamic_lb 1 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536

Diff between them:

  bond_slave state [-BACKUP-] {+ACTIVE+}
  ...
  ad_actor_oper_port_state_str [--] {++}
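The numeric ad_actor_oper_port_state / ad_partner_oper_port_state values are bitmasks with the standard LACP port state layout (the LACP_STATE_* bits used by the bonding driver). A small throwaway helper to decode them; this is only an illustrative sketch, not part of the reproducer scripts:

  # Decode the LACP port state octet reported by `ip -d link` as
  # ad_actor_oper_port_state / ad_partner_oper_port_state.
  decode_lacp_state() {
      local state=$1
      local flags=()
      (( state & 0x01 )) && flags+=(lacp_activity)
      (( state & 0x02 )) && flags+=(lacp_timeout)
      (( state & 0x04 )) && flags+=(aggregation)
      (( state & 0x08 )) && flags+=(synchronization)   # the "in_sync" bit
      (( state & 0x10 )) && flags+=(collecting)
      (( state & 0x20 )) && flags+=(distributing)
      (( state & 0x40 )) && flags+=(defaulted)
      (( state & 0x80 )) && flags+=(expired)
      printf '%3d = %s\n' "$state" "${flags[*]:-none}"
  }

  decode_lacp_state 13   # lacp_activity aggregation synchronization
  decode_lacp_state 61   # ... plus collecting distributing

So during the 5-6 second window the ports are already in_sync (13) but not yet collecting/distributing; once the state reaches 61 the ports start collecting and distributing and traffic flows, which matches the BACKUP -> ACTIVE transition in the diff.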
Kernel logs:

---------- Initial release / cleanup ---------------
Feb 05 03:01:42 f40server kernel: bond0: (slave veth0): Releasing backup interface
Feb 05 03:01:42 f40server kernel: bond0: (slave veth0): the permanent HWaddr of slave - ba:67:53:34:45:37 - is still in use by bond - set the HWaddr of slave to a different address to avoid conflicts
Feb 05 03:01:42 f40server kernel: bond0: (slave veth1): Removing an active aggregator
Feb 05 03:01:42 f40server kernel: bond0: (slave veth1): Releasing backup interface
Feb 05 03:01:42 f40server kernel: bond0 (unregistering): Released all slaves
Feb 05 03:01:42 f40server kernel: bond1 (unregistering): (slave veth0_p): Releasing backup interface
Feb 05 03:01:42 f40server kernel: bond1 (unregistering): (slave veth1_p): Removing an active aggregator
Feb 05 03:01:42 f40server kernel: bond1 (unregistering): (slave veth1_p): Releasing backup interface
Feb 05 03:01:42 f40server kernel: bond1 (unregistering): Released all slaves

----------- Start ----------------
Feb 05 03:01:43 f40server kernel: bond1: (slave veth0_p): Enslaving as a backup interface with a down link
Feb 05 03:01:43 f40server kernel: bond1: (slave veth1_p): Enslaving as a backup interface with a down link
Feb 05 03:01:43 f40server kernel: bond1: Warning: No 802.3ad response from the link partner for any adapters in the bond
Feb 05 03:01:43 f40server kernel: bond1: (slave veth0_p): link status definitely up, 10000 Mbps full duplex
Feb 05 03:01:43 f40server kernel: bond1: active interface up!
Feb 05 03:01:43 f40server kernel: bond0: (slave veth0): Enslaving as a backup interface with an up link
Feb 05 03:01:43 f40server kernel: bond0: (slave veth1): Enslaving as a backup interface with an up link
Feb 05 03:01:43 f40server kernel: bond1: (slave veth1_p): link status definitely up, 10000 Mbps full duplex
Feb 05 03:01:43 f40server kernel: bond0: (slave veth0): Removing an active aggregator
Feb 05 03:01:43 f40server kernel: bond0: (slave veth0): Releasing backup interface
Feb 05 03:01:43 f40server kernel: bond0: (slave veth0): the permanent HWaddr of slave - ba:67:53:34:45:37 - is still in use by bond - set the HWaddr of slave to a different address to avoid conflicts
Feb 05 03:01:43 f40server kernel: bond0: (slave veth1): Releasing backup interface
Feb 05 03:01:43 f40server kernel: bond0: (slave veth0): Enslaving as a backup interface with an up link
Feb 05 03:01:43 f40server kernel: bond0: (slave veth1): Enslaving as a backup interface with an up link

^^^ Why are veth0 and veth1 removed ("Removing an active aggregator") and enslaved again?
Answer: because `nmcli c up bond0` is run last and counts as a new activation; the connection was already active due to autoconnect. When it is run first instead, the problem still happens.

$ ./reproducer-iproute-iperf.sh

It loses a small number of packets during the first second only.

First, for some seconds (3-4) we see this state (one ACTIVE and one BACKUP):

89: bond0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    bond mode 802.3ad miimon 100 updelay 0 downdelay 0 peer_notify_delay 0 use_carrier 1 arp_interval 0 arp_missed_max 2 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_active on lacp_rate slow ad_select stable ad_aggregator 1 ad_num_ports 2 ad_actor_key 15 ad_partner_key 15 ad_partner_mac ca:3c:b5:18:84:a4 ad_actor_sys_prio 65535 ad_user_port_key 0 ad_actor_system 00:00:00:00:00:00 tlb_dynamic_lb 1 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
91: veth0@if90: mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff link-netns ns1 promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr ba:67:53:34:45:37 queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 5 ad_actor_oper_port_state_str ad_partner_oper_port_state 69 ad_partner_oper_port_state_str numtxqueues 2 numrxqueues 2 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
93: veth1@if92: mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff link-netns ns1 promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state BACKUP mii_status UP link_failure_count 0 perm_hwaddr 36:99:ba:b2:0f:ba queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 5 ad_actor_oper_port_state_str ad_partner_oper_port_state 69 ad_partner_oper_port_state_str numtxqueues 2 numrxqueues 2 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
Then, for ~1s we see this state (both as BACKUP):

89: bond0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    bond mode 802.3ad miimon 100 updelay 0 downdelay 0 peer_notify_delay 0 use_carrier 1 arp_interval 0 arp_missed_max 2 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_active on lacp_rate slow ad_select stable ad_aggregator 1 ad_num_ports 2 ad_actor_key 15 ad_partner_key 15 ad_partner_mac ca:3c:b5:18:84:a4 ad_actor_sys_prio 65535 ad_user_port_key 0 ad_actor_system 00:00:00:00:00:00 tlb_dynamic_lb 1 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
91: veth0@if90: mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff link-netns ns1 promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state BACKUP mii_status UP link_failure_count 0 perm_hwaddr ba:67:53:34:45:37 queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 13 ad_actor_oper_port_state_str ad_partner_oper_port_state 197 ad_partner_oper_port_state_str numtxqueues 2 numrxqueues 2 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
93: veth1@if92: mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff link-netns ns1 promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state BACKUP mii_status UP link_failure_count 0 perm_hwaddr 36:99:ba:b2:0f:ba queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 13 ad_actor_oper_port_state_str ad_partner_oper_port_state 197 ad_partner_oper_port_state_str numtxqueues 2 numrxqueues 2 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
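For reference, running the port state values from this run through the decode helper sketched earlier (assuming it is loaded in the shell) makes the LACP progression explicit; the 61 values belong to the final state shown below:

  decode_lacp_state 5     # actor, first state:    lacp_activity aggregation
  decode_lacp_state 69    # partner, first state:  lacp_activity aggregation defaulted
  decode_lacp_state 13    # actor, second state:   lacp_activity aggregation synchronization (in_sync)
  decode_lacp_state 197   # partner, second state: lacp_activity aggregation defaulted expired
  decode_lacp_state 61    # both, final state:     lacp_activity aggregation synchronization collecting distributing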
Then, finally we get this state (the same state that we get with NM):

89: bond0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    bond mode 802.3ad miimon 100 updelay 0 downdelay 0 peer_notify_delay 0 use_carrier 1 arp_interval 0 arp_missed_max 2 arp_validate none arp_all_targets any primary_reselect always fail_over_mac none xmit_hash_policy layer2 resend_igmp 1 num_grat_arp 1 all_slaves_active 0 min_links 0 lp_interval 1 packets_per_slave 1 lacp_active on lacp_rate slow ad_select stable ad_aggregator 1 ad_num_ports 2 ad_actor_key 15 ad_partner_key 15 ad_partner_mac ca:3c:b5:18:84:a4 ad_actor_sys_prio 65535 ad_user_port_key 0 ad_actor_system 00:00:00:00:00:00 tlb_dynamic_lb 1 numtxqueues 16 numrxqueues 16 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
91: veth0@if90: mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff link-netns ns1 promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr ba:67:53:34:45:37 queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 61 ad_actor_oper_port_state_str ad_partner_oper_port_state 61 ad_partner_oper_port_state_str numtxqueues 2 numrxqueues 2 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
93: veth1@if92: mtu 1500 qdisc noqueue master bond0 state UP mode DEFAULT group default qlen 1000
    link/ether ba:67:53:34:45:37 brd ff:ff:ff:ff:ff:ff link-netns ns1 promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535
    veth
    bond_slave state ACTIVE mii_status UP link_failure_count 0 perm_hwaddr 36:99:ba:b2:0f:ba queue_id 0 prio 0 ad_aggregator_id 1 ad_actor_oper_port_state 61 ad_actor_oper_port_state_str ad_partner_oper_port_state 61 ad_partner_oper_port_state_str numtxqueues 2 numrxqueues 2 gso_max_size 65536 gso_max_segs 65535 tso_max_size 524280 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536

^^ The main difference from the "nm" tests is that here the bond slaves start without the `in_sync` flag; with NM, they have it from the beginning.

Kernel logs:

------------ Initial cleanup ----------------
Feb 05 03:05:39 f40server kernel: bond0: active interface up!
Feb 05 03:05:39 f40server kernel: bond0: option arp_missed_max: mode dependency failed, not supported in mode 802.3ad(4)
Feb 05 03:05:39 f40server kernel: bond0: option lacp_active: unable to set because the bond device is up
Feb 05 03:05:39 f40server kernel: bond0: (slave veth0): Releasing backup interface
Feb 05 03:05:39 f40server kernel: bond0: (slave veth0): the permanent HWaddr of slave - ba:67:53:34:45:37 - is still in use by bond - set the HWaddr of slave to a different address to avoid conflicts
Feb 05 03:05:39 f40server kernel: bond0: active interface up!
Feb 05 03:05:39 f40server kernel: bond0: (slave veth1): Removing an active aggregator
Feb 05 03:05:39 f40server kernel: bond0: (slave veth1): Releasing backup interface
Feb 05 03:05:40 f40server kernel: bond1 (unregistering): (slave veth0_p): Releasing backup interface
Feb 05 03:05:40 f40server kernel: bond1 (unregistering): (slave veth1_p): Removing an active aggregator
Feb 05 03:05:40 f40server kernel: bond1 (unregistering): (slave veth1_p): Releasing backup interface
Feb 05 03:05:40 f40server kernel: bond1 (unregistering): Released all slaves
Feb 05 03:05:40 f40server kernel: bond0 (unregistering): Released all slaves

------------ Test ----------------------
Feb 05 03:05:41 f40server kernel: bond1: (slave veth0_p): Enslaving as a backup interface with a down link
Feb 05 03:05:41 f40server kernel: bond1: (slave veth1_p): Enslaving as a backup interface with a down link
Feb 05 03:05:41 f40server kernel: bond0: (slave veth0): Enslaving as a backup interface with an up link
Feb 05 03:05:41 f40server kernel: bond0: (slave veth1): Enslaving as a backup interface with an up link
Feb 05 03:05:41 f40server kernel: bond1: Warning: No 802.3ad response from the link partner for any adapters in the bond
Feb 05 03:05:41 f40server kernel: bond1: (slave veth0_p): link status definitely up, 10000 Mbps full duplex
Feb 05 03:05:41 f40server kernel: bond1: (slave veth1_p): link status definitely up, 10000 Mbps full duplex
Feb 05 03:05:41 f40server kernel: bond1: active interface up!
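For reference, the topology implied by the dumps and kernel logs above is bond0 on the host enslaving veth0/veth1, whose peers veth0_p/veth1_p live in netns ns1 and are enslaved to bond1 there. A minimal iproute2-only sketch of that setup follows; it is not the actual reproducer-*.sh content, and the IP addresses and iperf3 invocation are assumptions:

  # Sketch of the implied topology (not the actual reproducer scripts):
  #   bond0 (host) <- veth0, veth1  <==>  veth0_p, veth1_p -> bond1 (netns ns1)
  # Interface and bond names match the logs; addresses and the iperf3 call
  # are illustrative assumptions.
  set -e

  ip netns add ns1

  # veth pairs with one end moved into ns1
  ip link add veth0 type veth peer name veth0_p netns ns1
  ip link add veth1 type veth peer name veth1_p netns ns1

  # 802.3ad bonds on both sides, matching the options visible in `ip -d link`
  ip link add bond0 type bond mode 802.3ad miimon 100 lacp_rate slow
  ip -n ns1 link add bond1 type bond mode 802.3ad miimon 100 lacp_rate slow

  # bonding requires the slaves to be down while they are enslaved
  ip link set veth0 down
  ip link set veth1 down
  ip link set veth0 master bond0
  ip link set veth1 master bond0
  ip -n ns1 link set veth0_p down
  ip -n ns1 link set veth1_p down
  ip -n ns1 link set veth0_p master bond1
  ip -n ns1 link set veth1_p master bond1

  ip addr add 192.0.2.1/24 dev bond0           # assumed address
  ip -n ns1 addr add 192.0.2.2/24 dev bond1    # assumed address

  ip link set bond0 up
  ip link set veth0 up
  ip link set veth1 up
  ip -n ns1 link set bond1 up
  ip -n ns1 link set veth0_p up
  ip -n ns1 link set veth1_p up

  # iperf3 server inside the namespace, client on the host (assumed options)
  ip netns exec ns1 iperf3 -s -D
  iperf3 -c 192.0.2.2 -t 10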