- Bug
- Resolution: Unresolved
- Undefined
- None
- 4.18
- None
- Moderate
- No
- False
- Release Note Not Required
- In Progress
This is a clone of issue OCPBUGS-46072. The following is the description of the original issue:
—
Description of problem:
With balance-slb and nmstate, a node got stuck on reboot.
[root@master-1 core]# systemctl list-jobs
JOB UNIT                                 TYPE  STATE
307 wait-for-br-ex-up.service            start running
341 afterburn-checkin.service            start waiting
187 multi-user.target                    start waiting
186 graphical.target                     start waiting
319 crio.service                         start waiting
292 kubelet.service                      start waiting
332 afterburn-firstboot-checkin.service  start waiting
306 node-valid-hostname.service          start waiting
293 kubelet-dependencies.target          start waiting
321 systemd-update-utmp-runlevel.service start waiting

systemctl status wait-for-br-ex-up.service
Dec 10 20:11:39 master-1.ostest.test.metalkube.org systemd[1]: Starting Wait for br-ex up event from NetworkManager...
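The job list above can be summarized programmatically to show which unit is holding up the rest of the boot. The following is only a minimal sketch: the sample text is copied from this report, while the helper names `parse_jobs` and `running_units` are hypothetical and the whitespace-based parsing is an assumption about the `systemctl list-jobs` layout shown here.

```python
# Sample text copied from the report; the parsing approach is an assumption.
LIST_JOBS = """\
JOB UNIT TYPE STATE
307 wait-for-br-ex-up.service start running
341 afterburn-checkin.service start waiting
187 multi-user.target start waiting
186 graphical.target start waiting
319 crio.service start waiting
292 kubelet.service start waiting
332 afterburn-firstboot-checkin.service start waiting
306 node-valid-hostname.service start waiting
293 kubelet-dependencies.target start waiting
321 systemd-update-utmp-runlevel.service start waiting
"""


def parse_jobs(text):
    """Turn whitespace-separated job rows into dicts, skipping the header."""
    jobs = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has exactly four fields and starts with a numeric job ID.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        job_id, unit, job_type, state = fields
        jobs.append({"id": int(job_id), "unit": unit, "type": job_type, "state": state})
    return jobs


def running_units(jobs):
    """Return the units whose job is currently running (not merely queued)."""
    return [job["unit"] for job in jobs if job["state"] == "running"]


jobs = parse_jobs(LIST_JOBS)
waiting_count = sum(1 for job in jobs if job["state"] == "waiting")
print("running:", running_units(jobs))
print("waiting jobs:", waiting_count)
```

In this capture the only running job is wait-for-br-ex-up.service; every other job is queued behind it.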
Version-Release number of selected component (if applicable):
4.18.0-0.nightly-2024-12-04-113014
How reproducible:
Sometimes
Steps to Reproduce:
1. Create the following nmstate config:
interfaces:
- name: bond0
  type: bond
  state: up
  copy-mac-from: eno2
  ipv4:
    enabled: false
  link-aggregation:
    mode: balance-xor
    options:
      xmit_hash_policy: vlan+srcmac
      balance-slb: 1
    port:
    - eno2
    - eno3
- name: br-ex
  type: ovs-bridge
  state: up
  ipv4:
    enabled: false
    dhcp: false
  ipv6:
    enabled: false
    dhcp: false
  bridge:
    port:
    - name: bond0
    - name: br-ex
- name: br-ex
  type: ovs-interface
  state: up
  copy-mac-from: eno2
  ipv4:
    enabled: true
    address:
    - ip: "192.168.111.111"
      prefix-length: 24
  ipv6:
    enabled: false
    dhcp: false
- name: eno1
  type: interface
  state: up
  ipv4:
    enabled: false
  ipv6:
    enabled: false
dns-resolver:
  config:
    server:
    - 192.168.111.1
routes:
  config:
  - destination: 0.0.0.0/0
    next-hop-address: 192.168.111.1
    next-hop-interface: br-ex
2. Reboot the node.
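To make the layering in the nmstate config from step 1 easier to follow, the sketch below lists which devices each interface entry refers to. It is illustrative only: the dictionary is a hand-transcribed subset of the config (only the keys that name other devices), and `lower_devices` is a hypothetical helper, not part of nmstate.

```python
# Hand-transcribed subset of the nmstate config from step 1 (assumption:
# only the keys that reference other devices are kept).
CONFIG = {
    "interfaces": [
        {
            "name": "bond0",
            "type": "bond",
            "copy-mac-from": "eno2",
            "link-aggregation": {
                "mode": "balance-xor",
                "options": {"xmit_hash_policy": "vlan+srcmac", "balance-slb": 1},
                "port": ["eno2", "eno3"],
            },
        },
        {
            "name": "br-ex",
            "type": "ovs-bridge",
            "bridge": {"port": [{"name": "bond0"}, {"name": "br-ex"}]},
        },
        {"name": "br-ex", "type": "ovs-interface", "copy-mac-from": "eno2"},
        {"name": "eno1", "type": "interface"},
    ]
}


def lower_devices(iface):
    """Return the other device names this interface entry refers to."""
    refs = []
    refs += iface.get("link-aggregation", {}).get("port", [])
    refs += [port["name"] for port in iface.get("bridge", {}).get("port", [])]
    if "copy-mac-from" in iface:
        refs.append(iface["copy-mac-from"])
    # Drop self-references (the OVS bridge lists its own internal port)
    # and duplicates, keeping first-seen order.
    unique = []
    for ref in refs:
        if ref != iface["name"] and ref not in unique:
            unique.append(ref)
    return unique


deps = {(i["name"], i["type"]): lower_devices(i) for i in CONFIG["interfaces"]}
for (name, kind), refs in deps.items():
    print(f"{name} ({kind}) -> {refs}")
```

Read this way, the br-ex OVS bridge sits on bond0, and bond0 in turn depends on eno2 and eno3, so the state of those two ports is worth checking when br-ex does not come up.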
Actual results:
systemctl status wait-for-br-ex-up.service
Dec 10 20:11:39 master-1.ostest.test.metalkube.org systemd[1]: Starting Wait for br-ex up event from NetworkManager...
bond0 fails to come up, and the network is left in an inconsistent state:
[root@master-1 core]# ip -c a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens2f0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 90:e2:ba:ca:9f:28 brd ff:ff:ff:ff:ff:ff
altname enp181s0f0
inet6 fe80::92e2:baff:feca:9f28/64 scope link noprefixroute
valid_lft forever preferred_lft forever
3: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 30:d0:42:56:66:bb brd ff:ff:ff:ff:ff:ff
altname enp23s0f0
4: ens2f1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 90:e2:ba:ca:9f:29 brd ff:ff:ff:ff:ff:ff
altname enp181s0f1
inet6 fe80::92e2:baff:feca:9f29/64 scope link noprefixroute
valid_lft forever preferred_lft forever
5: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 30:d0:42:56:66:bc brd ff:ff:ff:ff:ff:ff
altname enp23s0f1
6: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 30:d0:42:56:66:bd brd ff:ff:ff:ff:ff:ff
altname enp23s0f2
inet 192.168.111.34/24 brd 192.168.111.255 scope global dynamic noprefixroute eno3
valid_lft 3576sec preferred_lft 3576sec
inet6 fe80::32d0:42ff:fe56:66bd/64 scope link noprefixroute
valid_lft forever preferred_lft forever
7: eno4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 30:d0:42:56:66:be brd ff:ff:ff:ff:ff:ff
altname enp23s0f3
8: ovs-system: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
link/ether 56:92:14:97:ed:10 brd ff:ff:ff:ff:ff:ff
9: ovn-k8s-mp0: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
link/ether ae:b9:9e:dc:17:d1 brd ff:ff:ff:ff:ff:ff
10: genev_sys_6081: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 65000 qdisc noqueue master ovs-system state UNKNOWN group default qlen 1000
link/ether e6:68:4d:df:e0:bd brd ff:ff:ff:ff:ff:ff
inet6 fe80::e468:4dff:fedf:e0bd/64 scope link
valid_lft forever preferred_lft forever
11: br-int: <BROADCAST,MULTICAST> mtu 1400 qdisc noop state DOWN group default qlen 1000
link/ether 32:5b:1f:35:ce:f5 brd ff:ff:ff:ff:ff:ff
12: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue master ovs-system state DOWN group default qlen 1000
link/ether aa:c8:8c:e3:71:aa brd ff:ff:ff:ff:ff:ff
13: bond0.104@bond0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master ovs-system state LOWERLAYERDOWN group default qlen 1000
link/ether aa:c8:8c:e3:71:aa brd ff:ff:ff:ff:ff:ff
14: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 30:d0:42:56:66:bd brd ff:ff:ff:ff:ff:ff
inet 192.168.111.111/24 brd 192.168.111.255 scope global noprefixroute br-ex
valid_lft forever preferred_lft forever
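The relevant parts of the `ip a` output can be pulled out with a small parser. This is a hedged sketch: the sample is a subset of the lines above, the regular expressions are an assumption about this particular output layout, and `parse_ip_addr` is a hypothetical helper name.

```python
import re

# Subset of the `ip -c a` output from this report (address lines omitted).
IP_ADDR = """\
5: eno2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000
link/ether 30:d0:42:56:66:bc brd ff:ff:ff:ff:ff:ff
6: eno3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 30:d0:42:56:66:bd brd ff:ff:ff:ff:ff:ff
12: bond0: <NO-CARRIER,BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue master ovs-system state DOWN group default qlen 1000
link/ether aa:c8:8c:e3:71:aa brd ff:ff:ff:ff:ff:ff
13: bond0.104@bond0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue master ovs-system state LOWERLAYERDOWN group default qlen 1000
link/ether aa:c8:8c:e3:71:aa brd ff:ff:ff:ff:ff:ff
14: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether 30:d0:42:56:66:bd brd ff:ff:ff:ff:ff:ff
"""

# "<index>: <name>[@parent]: <FLAGS> ... state <STATE>"
HEADER = re.compile(r"^(\d+): ([^:@\s]+)(?:@[^:\s]+)?: <([^>]*)>.*\bstate (\S+)")
MAC = re.compile(r"^\s*link/ether ([0-9a-f:]{17})")


def parse_ip_addr(text):
    """Map each link name to its flags, operational state, and MAC address."""
    links = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(2)
            links[current] = {
                "flags": header.group(3).split(","),
                "state": header.group(4),
                "mac": None,
            }
            continue
        mac = MAC.match(line)
        if mac and current:
            links[current]["mac"] = mac.group(1)
    return links


links = parse_ip_addr(IP_ADDR)
no_carrier = [name for name, link in links.items() if "NO-CARRIER" in link["flags"]]
same_mac_as_br_ex = [
    name for name, link in links.items()
    if name != "br-ex" and link["mac"] == links["br-ex"]["mac"]
]
print("no carrier:", no_carrier)
print("same MAC as br-ex:", same_mac_as_br_ex)
```

On this capture the links without carrier are eno2, bond0, and bond0.104, and the br-ex MAC matches eno3 rather than eno2, even though the config uses copy-mac-from: eno2. Whether that mismatch is related to the hang is not established by this report.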
Expected results:
System reboots correctly.
Additional info:
Bringing br-ex down and back up regenerates the event:
[root@master-1 core]# nmcli device down br-ex ; nmcli device up br-ex
- clones: OCPBUGS-46072 nmstate: after reboot wait-for-br-ex-up.service stuck (Verified)
- is blocked by: OCPBUGS-46072 nmstate: after reboot wait-for-br-ex-up.service stuck (Verified)