-
Bug
-
Resolution: Can't Do
-
Critical
-
None
-
4.13.0
-
No
-
Sprint 233 - Update&Remoting
-
1
-
Approved
-
False
-
Since the 9.2 merge, we've seen some jobs fail bringing up control plane nodes; the root cause seems to be some sort of intermittent race.
This is a pretty critical bug because it will manifest in flakes all over CI, and of course for customer installs.
In this aggregated job, take a look at this console log :
OK ] Reached target System Initialization. [ 8.062341] systemd[1]: Reached target Basic System. [ OK ] Reached target Basic System. ... [ 8.312057] ignition[789]: Ignition 2.14.0 [ 8.312869] ignition[789]: Stage: fetch-offline [ 8.314074] ignition[789]: reading system config file "/usr/lib/ignition/base.d/00-core.ign" [ 8.315679] ignition[789]: no config dir at "/usr/lib/ignition/base.platform.d/gcp" [[ 8.317072] systemd[1]: Started Ignition (fetch-offline). OK 8.318130] ignition[789]: no config URL provided m] [ 8.319497] ignition[789]: reading system config file "/usr/lib/ignition/user.ign" Started Ignition[ 8.320980] ignition[789]: no config at "/usr/lib/ignition/user.ign" (fetch-offline)[ 8.322690] ignition[789]: failed to fetch config: resource requires networking .[ 8.324281] ignition[789]: Ignition finished successfully ... Starting CoreOS [ 8.330258] systemd[1]: Starting CoreOS Enable Network... Enable Network... [ 8.560359] systemd[1]: Started CoreOS Enable Network. [ OK ] Started CoreOS Enable Network. [ 8.566643] systemd[1]: Starting Copy CoreOS Firstboot Networking Config... Starting Copy CoreOS Firstboot Networking Config... Starting Ignitio[ 8.573609] systemd[1]: Starting Ignition (fetch)... n (fetch)... [ 8.593377] EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null) [ 8.600613] coreos-copy-firstboot-network[813]: info: no files to copy from /mnt/boot_partition/coreos-firstboot-network; skipping [ OK 8.607229] systemd[1]: Started Copy CoreOS Firstboot Networking Config. 0m] Started Copy CoreOS Firstboot Networking Config. Starting dracut initque[ 8.614162] ignition[816]: Ignition 2.14.0 ue hook... [ 8.615065] systemd[1]: Starting dracut initqueue hook... [ 8.616017] ignition[816]: Stage: fetch [ 8.616879] ignition[816]: reading system config file "/usr/lib/ignition/base.d/00-core.ign" [ 8.619374] ignition[816]: no config dir at "/usr/lib/ignition/base.platform.d/gcp" [ 8.621187] ignition[816]: no config URL provided [ 8.622293] ignition[816]: reading system config file "/usr/lib/ignition/user.ign" [ 8.624825] ignition[816]: no config at "/usr/lib/ignition/user.ign" [ 8.626288] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #1 [ 8.628861] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 8.819648] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #2 [ 8.821659] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 9.021300] NetworkManager[842]: <info> [1678822681.0846] NetworkManager (version 1.36.0-12.el8_6) is starting... (for the first time) [ 9.024892] NetworkManager[842]: <info> [1678822681.0847] Read config: /etc/NetworkManager/NetworkManager.conf [ 9.026661] NetworkManager[842]: <info> [1678822681.0869] auth[0x562982c3a2a0]: create auth-manager: D-Bus connection not available. Polkit is disabled and only root will be authorized. [ 9.029495] NetworkManager[842]: <info> [1678822681.0886] manager[0x562982c6f020]: monitoring kernel firmware directory '/lib/firmware'. [ 9.030969] IPv6: ADDRCONF(NETDEV_UP): ens4: link is not ready [ 9.032095] NetworkManager[842]: <info> [1678822681.0888] hostname: hostname: hostnamed not used as proxy creation failed with: Could not connect: No such file or directory [ 9.035715] NetworkManager[842]: <info> [1678822681.0890] dns-mgr[0x562982c66120]: init: dns=default,systemd-resolved rc-manager=symlink [ 9.037935] NetworkManager[842]: <info> [1678822681.0890] policy: set-hostname: set hostname to 'localhost.localdomain' (no hostname found) [ 9.040488] NetworkManager[842]: <info> [1678822681.0921] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.36.0-12.el8_6/libnm-device-plugin-team.so) [ 9.044079] NetworkManager[842]: <info> [1678822681.0922] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file [ 9.046290] NetworkManager[842]: <info> [1678822681.0922] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file [ 9.048802] NetworkManager[842]: <info> [1678822681.0922] manager: Networking is enabled by state file [ 9.050602] NetworkManager[842]: <warn> [1678822681.0926] ifcfg-rh: dbus: don't use D-Bus for com.redhat.ifcfgrh1 service [ 9.053141] NetworkManager[842]: <info> [1678822681.0927] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.36.0-12.el8_6/libnm-settings-plugin-ifcfg-rh.so") [ 9.056089] NetworkManager[842]: <info> [1678822681.0927] settings: Loaded settings plugin: keyfile (internal) [ 9.057864] NetworkManager[842]: <info> [1678822681.0940] dhcp-init: Using DHCP client 'internal' [ 9.059631] NetworkManager[842]: <info> [1678822681.0940] device (lo): carrier: link connected [ 9.061566] NetworkManager[842]: <info> [1678822681.0941] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1) [ 9.063680] NetworkManager[842]: <info> [1678822681.0945] manager: (ens4): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2) [ 9.065781] NetworkManager[842]: <info> [1678822681.0945] device (ens4): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external') [ 9.068830] NetworkManager[842]: <info> [1678822681.0986] device (ens4): carrier: link connected [ 9.070542] NetworkManager[842]: <warn> [1678822681.0992] sleep-monitor-sd: failed to acquire D-Bus proxy: Could not connect: No such file or directory [ 9.073473] NetworkManager[842]: <info> [1678822681.0993] device (ens4): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed') [ 9.076415] NetworkManager[842]: <info> [1678822681.0995] policy: auto-activating connection 'Wired Connection' (5077b9d2-9d45-4c92-adfc-81e60947e5be) [ 9.079122] NetworkManager[842]: <info> [1678822681.0998] device (ens4): Activation: starting connection 'Wired Connection' (5077b9d2-9d45-4c92-adfc-81e60947e5be) [ 9.081635] NetworkManager[842]: <info> [1678822681.0999] device (ens4): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed') [ 9.084716] NetworkManager[842]: <info> [1678822681.0999] manager: NetworkManager state is now CONNECTING [ 9.086696] NetworkManager[842]: <info> [1678822681.0999] device (ens4): state change: prepare -> config (reason 'none', sys-iface-state: 'managed') [ 9.090627] NetworkManager[842]: <info> [1678822681.1002] device (ens4): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed') [ 9.094902] NetworkManager[842]: <info> [1678822681.1004] dhcp4 (ens4): activation: beginning transaction (timeout in 90 seconds) [ 9.220439] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #3 [ 9.222722] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 10.021092] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #4 [ 10.024811] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 11.621883] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #5 [ 11.624764] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 14.824475] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #6 [ 14.826467] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 19.825129] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #7 [ 19.827691] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 24.830414] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #8 [ 24.841491] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 29.831270] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #9 [ 29.842424] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 34.835971] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #10 [ 34.846923] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 39.836597] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #11 [ 39.847754] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 44.841306] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #12 [ 44.852139] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 49.842697] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #13 [ 49.853954] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 54.847479] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #14 [ 54.858231] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 59.848798] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #15 [ 59.859589] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 64.853158] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #16 [ 64.863946] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 69.854547] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #17 [ 69.865597] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 74.859178] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #18 [ 74.870231] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 79.859997] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #19 [ 79.870775] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 84.864265] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #20 [ 84.875176] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 89.865856] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #21 [ 89.877574] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 94.871311] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #22 [ 94.882341] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ 99.015824] NetworkManager[842]: <info> [1678822771.0791] device (ens4): state change: ip-config -> failed (reason 'ip-config-unavailable', sys-iface-state: 'managed') [ 99.031548] NetworkManager[842]: <info> [1678822771.0793] manager: NetworkManager state is now DISCONNECTED [ 99.041891] NetworkManager[842]: <warn> [1678822771.0793] device (ens4): Activation: failed for connection 'Wired Connection' [ 99.053770] NetworkManager[842]: <info> [1678822771.0795] device (ens4): state change: failed -> disconnected (reason 'none', sys-iface-state: 'managed') [ 99.070233] NetworkManager[842]: <info> [1678822771.1339] dhcp4 (ens4): canceled DHCP transaction [ 99.079627] NetworkManager[842]: <info> [1678822771.1346] manager: startup complete [ OK 99.087692] NetworkManager[842]: <info> [1678822771.1346] quitting now that startup is complete 0m] Started drac[ 99.098946] NetworkManager[842]: <info> [1678822771.1351] exiting (success) ut initqueue hoo[ 99.108172] systemd[1]: Started dracut initqueue hook. k. Startin[ 99.118871] systemd[1]: Starting dracut pre-mount hook... g dracut pre-mount hook... [ 99.130366] systemd[1]: Reached target Remote File Systems (Pre). [ OK ] Reached target Remote File Systems (Pre). [ OK 99.144425] systemd[1]: Reached target Remote File Systems. 0m] Reached target Remote File Systems. [ 99.158552] systemd[1]: Started dracut pre-mount hook. [ OK ] Started dracut pre-mount hook. [ 99.872095] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #23 [ 99.882859] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [** ] A start job is running for Ignition (fetch) (1min 40s / no limit)[ 104.876391] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #24 [ 104.887310] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ *** ] A start job is running for Ignition (fetch) (1min 45s / no limit)[ 109.877939] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #25 [ 109.888802] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ *] A start job is running for Ignition (fetch) (1min 50s / no limit)[ 114.882383] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #26 [ 114.893308] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ *** ] A start job is running for Ignition (fetch) (1min 55s / no limit)[ 119.883198] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #27 [ 119.894623] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [** ] A start job is running for Ignition (fetch) (2min / no limit)[ 124.888231] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #28 [ 124.899715] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable [ **] A start job is running for Ignition (fetch) (2min 4s / no limit)[ 128.619392] ignition[816]: failed to fetch config: unable to fetch resource in time [ 128.627747] ignition[816]: failed to acquire config: unable to fetch resource in time [FAILED[ 128.637094] systemd[1]: ignition-fetch.service: Main process exited, code=exited, status=1/FAILURE ] Failed to [ 128.647508] ignition[816]: Ignition failed: unable to fetch resource in time start Ignition ([ 128.656327] systemd[1]: ignition-fetch.service: Failed with result 'exit-code'. fetch).
I thought that in RHEL9 we'd got NM to retry indefinitely in the initramfs?
- relates to
-
COS-1926 Move RHCOS to RHEL 9.2 in OCP 4.13
- Closed