Details
-
Bug
-
Resolution: Done
-
Critical
-
None
-
4.13, 4.14
-
Important
-
No
-
8
-
Metal Platform 234
-
1
-
Rejected
-
False
-
Description
Description of problem:
We've been investigating persistent errors on our baremetal lab. After the image is written to disk by the ironic python agent (using coreos-installer), hosts fail to reboot from the hard drive. During boot they attempt to boot from the hard drive but quickly move on to the next entry in the boot order, with no error displayed. Our first assumption was that some servers in our BM environment had a hardware failure, but after running some tests on one of our baremetal environments we noticed that this seems to be a regression, possibly in coreos-installer.
Version-Release number of selected component (if applicable):
How reproducible:
4.14 nightlies: consistently show the error on at least one baremetal node in the cluster across multiple retries (not usually all of them; 1 or 2 masters usually boot fine).
4.13: tried once, which reproduced the problem.
4.12: 1 attempt, ran fine.
4.11 nightlies: fine; I provisioned 6 runs on the same environment without a single failure.
Steps to Reproduce:
Provision a cluster with baremetal IPI on baremetal hardware.
Actual results:
Once the masters go active, some fail to boot and get stuck in a POST reboot loop.
Expected results:
All master nodes should boot from the hard drive.
Additional info:
I'm using 3 x Dell PowerEdge R340, each with 3 x SSDSC2KG240G8R.
BIOS Version 2.12.2 (also reproduced on older version)
iDRAC Firmware Version 6.10.30.00 (also reproduced on older version)

[root@host3 core]# /usr/bin/coreos-installer -V
coreos-installer 0.16.1
[root@host3 core]# cat /etc/redhat-release
CentOS Stream CoreOS release 4.13

I'll attach the full IPA ramdisk log, but coreos-installer seems to be completing without failure:

2023-03-28 09:23:17.313 1 INFO ironic_coreos_install [-] Executing CoreOS installer: ['chroot', '/mnt/coreos', 'coreos-installer', 'install', '--preserve-on-error', '--ignition-file', '/tmp/ironic.ign', '--offline', '--append-karg', 'ip=dhcp', '/dev/sda']
2023-03-28 09:23:17.328 1 DEBUG ironic_coreos_install [-] coreos-installer: Installing CentOS Stream CoreOS 413.92.202303011445-0 (Plow) x86_64 (512-byte sectors) _run_install /usr/lib/python3.9/site-packages/ironic_coreos_install.py:193
...
2023-03-28 09:23:45.121 1 DEBUG ironic_coreos_install [-] coreos-installer: Read disk 3.5 GiB/3.5 GiB (100%) _run_install /usr/lib/python3.9/site-packages/ironic_coreos_install.py:193
2023-03-28 09:23:45.121 1 DEBUG ironic_coreos_install [-] coreos-installer: Read disk 3.5 GiB/3.5 GiB (100%) _run_install /usr/lib/python3.9/site-packages/ironic_coreos_install.py:193
2023-03-28 09:23:46.389 1 DEBUG ironic_coreos_install [-] coreos-installer: Writing Ignition config _run_install /usr/lib/python3.9/site-packages/ironic_coreos_install.py:193
2023-03-28 09:23:46.389 1 DEBUG ironic_coreos_install [-] coreos-installer: Modifying kernel arguments _run_install /usr/lib/python3.9/site-packages/ironic_coreos_install.py:193
2023-03-28 09:23:46.822 1 DEBUG ironic_coreos_install [-] coreos-installer: Install complete. _run_install /usr/lib/python3.9/site-packages/ironic_coreos_install.py:193
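For anyone comparing ramdisk logs from good and bad nodes, the check above (did coreos-installer report completion, and with which arguments) can be sketched as a small script. This is only an illustration that assumes the log lines have the exact layout quoted in this report; the function name summarize_install and the returned dictionary keys are hypothetical and not part of ironic or coreos-installer.

```python
import ast
import re

# Assumed line layout, taken from the excerpt quoted in this report:
# "<timestamp> <pid> <LEVEL> ironic_coreos_install [-] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \d+ "
    r"(?P<level>[A-Z]+) ironic_coreos_install \[-\] (?P<msg>.*)$"
)
# DEBUG lines in the excerpt end with "_run_install <file>:<line>"; strip that.
SUFFIX_RE = re.compile(r"\s+_run_install \S+:\d+$")

EXEC_PREFIX = "Executing CoreOS installer: "
INSTALLER_PREFIX = "coreos-installer: "


def summarize_install(log_lines):
    """Hypothetical helper: collect coreos-installer output from IPA log lines."""
    command = None
    messages = []
    for line in log_lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not an ironic_coreos_install line
        msg = SUFFIX_RE.sub("", match.group("msg"))
        if msg.startswith(EXEC_PREFIX):
            # The argv is logged as a Python list literal.
            command = ast.literal_eval(msg[len(EXEC_PREFIX):])
        elif msg.startswith(INSTALLER_PREFIX):
            messages.append(msg[len(INSTALLER_PREFIX):])
    return {
        "command": command,
        "messages": messages,
        "completed": "Install complete." in messages,
    }
```

Calling summarize_install on the lines of the attached IPA ramdisk log should give completed == True for the excerpt above. A node that reports completion here and still drops out of the hard drive boot entry would support the observation that the installer itself is not reporting a failure.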