-
Bug
-
Resolution: Can't Do
-
Critical
-
None
-
4.13.0
-
Quality / Stability / Reliability
-
False
-
-
None
-
None
-
No
-
None
-
Approved
-
Sprint 233 - Update&Remoting
-
1
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Since the 9.2 merge, we've seen some jobs fail bringing up control plane nodes; the root cause seems to be some sort of intermittent race.
This is a pretty critical bug because it will manifest in flakes all over CI, and of course for customer installs.
In this aggregated job, take a look at this console log :
OK ] Reached target System Initialization.
[ 8.062341] systemd[1]: Reached target Basic System.
[ OK ] Reached target Basic System.
...
[ 8.312057] ignition[789]: Ignition 2.14.0
[ 8.312869] ignition[789]: Stage: fetch-offline
[ 8.314074] ignition[789]: reading system config file "/usr/lib/ignition/base.d/00-core.ign"
[ 8.315679] ignition[789]: no config dir at "/usr/lib/ignition/base.platform.d/gcp"
[[ 8.317072] systemd[1]: Started Ignition (fetch-offline).
OK 8.318130] ignition[789]: no config URL provided
m] [ 8.319497] ignition[789]: reading system config file "/usr/lib/ignition/user.ign"
Started Ignition[ 8.320980] ignition[789]: no config at "/usr/lib/ignition/user.ign"
(fetch-offline)[ 8.322690] ignition[789]: failed to fetch config: resource requires networking
.[ 8.324281] ignition[789]: Ignition finished successfully
...
Starting CoreOS [ 8.330258] systemd[1]: Starting CoreOS Enable Network...
Enable Network...
[ 8.560359] systemd[1]: Started CoreOS Enable Network.
[ OK ] Started CoreOS Enable Network.
[ 8.566643] systemd[1]: Starting Copy CoreOS Firstboot Networking Config...
Starting Copy CoreOS Firstboot Networking Config...
Starting Ignitio[ 8.573609] systemd[1]: Starting Ignition (fetch)...
n (fetch)...
[ 8.593377] EXT4-fs (sda3): mounted filesystem with ordered data mode. Opts: (null)
[ 8.600613] coreos-copy-firstboot-network[813]: info: no files to copy from /mnt/boot_partition/coreos-firstboot-network; skipping
[ OK 8.607229] systemd[1]: Started Copy CoreOS Firstboot Networking Config.
0m] Started Copy CoreOS Firstboot Networking Config.
Starting dracut initque[ 8.614162] ignition[816]: Ignition 2.14.0
ue hook...
[ 8.615065] systemd[1]: Starting dracut initqueue hook...
[ 8.616017] ignition[816]: Stage: fetch
[ 8.616879] ignition[816]: reading system config file "/usr/lib/ignition/base.d/00-core.ign"
[ 8.619374] ignition[816]: no config dir at "/usr/lib/ignition/base.platform.d/gcp"
[ 8.621187] ignition[816]: no config URL provided
[ 8.622293] ignition[816]: reading system config file "/usr/lib/ignition/user.ign"
[ 8.624825] ignition[816]: no config at "/usr/lib/ignition/user.ign"
[ 8.626288] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #1
[ 8.628861] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 8.819648] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #2
[ 8.821659] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 9.021300] NetworkManager[842]: <info> [1678822681.0846] NetworkManager (version 1.36.0-12.el8_6) is starting... (for the first time)
[ 9.024892] NetworkManager[842]: <info> [1678822681.0847] Read config: /etc/NetworkManager/NetworkManager.conf
[ 9.026661] NetworkManager[842]: <info> [1678822681.0869] auth[0x562982c3a2a0]: create auth-manager: D-Bus connection not available. Polkit is disabled and only root will be authorized.
[ 9.029495] NetworkManager[842]: <info> [1678822681.0886] manager[0x562982c6f020]: monitoring kernel firmware directory '/lib/firmware'.
[ 9.030969] IPv6: ADDRCONF(NETDEV_UP): ens4: link is not ready
[ 9.032095] NetworkManager[842]: <info> [1678822681.0888] hostname: hostname: hostnamed not used as proxy creation failed with: Could not connect: No such file or directory
[ 9.035715] NetworkManager[842]: <info> [1678822681.0890] dns-mgr[0x562982c66120]: init: dns=default,systemd-resolved rc-manager=symlink
[ 9.037935] NetworkManager[842]: <info> [1678822681.0890] policy: set-hostname: set hostname to 'localhost.localdomain' (no hostname found)
[ 9.040488] NetworkManager[842]: <info> [1678822681.0921] Loaded device plugin: NMTeamFactory (/usr/lib64/NetworkManager/1.36.0-12.el8_6/libnm-device-plugin-team.so)
[ 9.044079] NetworkManager[842]: <info> [1678822681.0922] manager: rfkill: Wi-Fi enabled by radio killswitch; enabled by state file
[ 9.046290] NetworkManager[842]: <info> [1678822681.0922] manager: rfkill: WWAN enabled by radio killswitch; enabled by state file
[ 9.048802] NetworkManager[842]: <info> [1678822681.0922] manager: Networking is enabled by state file
[ 9.050602] NetworkManager[842]: <warn> [1678822681.0926] ifcfg-rh: dbus: don't use D-Bus for com.redhat.ifcfgrh1 service
[ 9.053141] NetworkManager[842]: <info> [1678822681.0927] settings: Loaded settings plugin: ifcfg-rh ("/usr/lib64/NetworkManager/1.36.0-12.el8_6/libnm-settings-plugin-ifcfg-rh.so")
[ 9.056089] NetworkManager[842]: <info> [1678822681.0927] settings: Loaded settings plugin: keyfile (internal)
[ 9.057864] NetworkManager[842]: <info> [1678822681.0940] dhcp-init: Using DHCP client 'internal'
[ 9.059631] NetworkManager[842]: <info> [1678822681.0940] device (lo): carrier: link connected
[ 9.061566] NetworkManager[842]: <info> [1678822681.0941] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
[ 9.063680] NetworkManager[842]: <info> [1678822681.0945] manager: (ens4): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
[ 9.065781] NetworkManager[842]: <info> [1678822681.0945] device (ens4): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
[ 9.068830] NetworkManager[842]: <info> [1678822681.0986] device (ens4): carrier: link connected
[ 9.070542] NetworkManager[842]: <warn> [1678822681.0992] sleep-monitor-sd: failed to acquire D-Bus proxy: Could not connect: No such file or directory
[ 9.073473] NetworkManager[842]: <info> [1678822681.0993] device (ens4): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
[ 9.076415] NetworkManager[842]: <info> [1678822681.0995] policy: auto-activating connection 'Wired Connection' (5077b9d2-9d45-4c92-adfc-81e60947e5be)
[ 9.079122] NetworkManager[842]: <info> [1678822681.0998] device (ens4): Activation: starting connection 'Wired Connection' (5077b9d2-9d45-4c92-adfc-81e60947e5be)
[ 9.081635] NetworkManager[842]: <info> [1678822681.0999] device (ens4): state change: disconnected -> prepare (reason 'none', sys-iface-state: 'managed')
[ 9.084716] NetworkManager[842]: <info> [1678822681.0999] manager: NetworkManager state is now CONNECTING
[ 9.086696] NetworkManager[842]: <info> [1678822681.0999] device (ens4): state change: prepare -> config (reason 'none', sys-iface-state: 'managed')
[ 9.090627] NetworkManager[842]: <info> [1678822681.1002] device (ens4): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
[ 9.094902] NetworkManager[842]: <info> [1678822681.1004] dhcp4 (ens4): activation: beginning transaction (timeout in 90 seconds)
[ 9.220439] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #3
[ 9.222722] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 10.021092] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #4
[ 10.024811] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 11.621883] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #5
[ 11.624764] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 14.824475] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #6
[ 14.826467] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 19.825129] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #7
[ 19.827691] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 24.830414] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #8
[ 24.841491] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 29.831270] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #9
[ 29.842424] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 34.835971] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #10
[ 34.846923] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 39.836597] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #11
[ 39.847754] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 44.841306] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #12
[ 44.852139] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 49.842697] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #13
[ 49.853954] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 54.847479] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #14
[ 54.858231] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 59.848798] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #15
[ 59.859589] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 64.853158] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #16
[ 64.863946] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 69.854547] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #17
[ 69.865597] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 74.859178] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #18
[ 74.870231] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 79.859997] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #19
[ 79.870775] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 84.864265] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #20
[ 84.875176] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 89.865856] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #21
[ 89.877574] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 94.871311] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #22
[ 94.882341] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ 99.015824] NetworkManager[842]: <info> [1678822771.0791] device (ens4): state change: ip-config -> failed (reason 'ip-config-unavailable', sys-iface-state: 'managed')
[ 99.031548] NetworkManager[842]: <info> [1678822771.0793] manager: NetworkManager state is now DISCONNECTED
[ 99.041891] NetworkManager[842]: <warn> [1678822771.0793] device (ens4): Activation: failed for connection 'Wired Connection'
[ 99.053770] NetworkManager[842]: <info> [1678822771.0795] device (ens4): state change: failed -> disconnected (reason 'none', sys-iface-state: 'managed')
[ 99.070233] NetworkManager[842]: <info> [1678822771.1339] dhcp4 (ens4): canceled DHCP transaction
[ 99.079627] NetworkManager[842]: <info> [1678822771.1346] manager: startup complete
[ OK 99.087692] NetworkManager[842]: <info> [1678822771.1346] quitting now that startup is complete
0m] Started drac[ 99.098946] NetworkManager[842]: <info> [1678822771.1351] exiting (success)
ut initqueue hoo[ 99.108172] systemd[1]: Started dracut initqueue hook.
k.
Startin[ 99.118871] systemd[1]: Starting dracut pre-mount hook...
g dracut pre-mount hook...
[ 99.130366] systemd[1]: Reached target Remote File Systems (Pre).
[ OK ] Reached target Remote File Systems (Pre).
[ OK 99.144425] systemd[1]: Reached target Remote File Systems.
0m] Reached target Remote File Systems.
[ 99.158552] systemd[1]: Started dracut pre-mount hook.
[ OK ] Started dracut pre-mount hook.
[ 99.872095] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #23
[ 99.882859] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[** ] A start job is running for Ignition (fetch) (1min 40s / no limit)[ 104.876391] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #24
[ 104.887310] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ *** ] A start job is running for Ignition (fetch) (1min 45s / no limit)[ 109.877939] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #25
[ 109.888802] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ *] A start job is running for Ignition (fetch) (1min 50s / no limit)[ 114.882383] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #26
[ 114.893308] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ *** ] A start job is running for Ignition (fetch) (1min 55s / no limit)[ 119.883198] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #27
[ 119.894623] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[** ] A start job is running for Ignition (fetch) (2min / no limit)[ 124.888231] ignition[816]: GET http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data: attempt #28
[ 124.899715] ignition[816]: GET error: Get "http://169.254.169.254/computeMetadata/v1/instance/attributes/user-data": dial tcp 169.254.169.254:80: connect: network is unreachable
[ **] A start job is running for Ignition (fetch) (2min 4s / no limit)[ 128.619392] ignition[816]: failed to fetch config: unable to fetch resource in time
[ 128.627747] ignition[816]: failed to acquire config: unable to fetch resource in time
[FAILED[ 128.637094] systemd[1]: ignition-fetch.service: Main process exited, code=exited, status=1/FAILURE
] Failed to [ 128.647508] ignition[816]: Ignition failed: unable to fetch resource in time
start Ignition ([ 128.656327] systemd[1]: ignition-fetch.service: Failed with result 'exit-code'.
fetch).
I thought that in RHEL9 we'd got NM to retry indefinitely in the initramfs?
- relates to
-
COS-1926 Move RHCOS to RHEL 9.2 in OCP 4.13
-
- Closed
-