-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
4.20
-
Quality / Stability / Reliability
-
False
-
-
2
-
Critical
-
No
-
x86_64
-
None
-
None
-
Rejected
-
None
-
Proposed
-
Known Issue
-
-
None
-
None
-
None
-
None
Description of problem:
This is a follow-on from OCPBUGS-62009.
I deleted the BMH and restarted the metal3 pods to recover.
I then reprovisioned the BMH and waited for inspection to finish.
I then updated the hostfirmwarecomponents to update just the BMC.
The firmware on the server was actually updated, but metal3 failed to recover the server afterwards.
apiVersion: metal3.io/v1alpha1
kind: HostFirmwareComponents
metadata:
  creationTimestamp: "2025-09-21T23:12:18Z"
  generation: 2
  name: r740xdg1
  namespace: r740-pool
  ownerReferences:
  - apiVersion: metal3.io/v1alpha1
    kind: BareMetalHost
    name: r740xdg1
    uid: 9494c650-ad12-4a75-a304-3e9a91b90a1c
  resourceVersion: "18510819"
  uid: d1077d36-c7cd-4647-a4d1-8c7eab631e5e
spec:
  updates:
  - component: bmc
    url: http://hv14.telco5gran.eng.rdu2.redhat.com:8888/firmware/r740/iDRAC-with-Lifecycle-Controller_Firmware_XTFXJ_WN64_7.00.00.173_A00.EXE
status:
  components:
  - component: bios
    currentVersion: 2.22.2
    initialVersion: 2.22.2
  - component: bmc
    currentVersion: 7.00.00.181
    initialVersion: 7.00.00.181
  - component: nic:NIC.Integrated.1
    currentVersion: 14.32.20.04
    initialVersion: 14.32.20.04
  conditions:
  - lastTransitionTime: "2025-09-21T23:12:19Z"
    message: ""
    observedGeneration: 2
    reason: OK
    status: "True"
    type: Valid
  - lastTransitionTime: "2025-09-21T23:26:16Z"
    message: ""
    observedGeneration: 2
    reason: OK
    status: "False"
    type: ChangeDetected
  lastUpdated: "2025-09-21T23:26:16Z"
  updates:
  - component: bmc
    url: http://hv14.telco5gran.eng.rdu2.redhat.com:8888/firmware/r740/iDRAC-with-Lifecycle-Controller_Firmware_XTFXJ_WN64_7.00.00.173_A00.EXE
The server is stuck powered off, and it is powered off again even if I manually power cycle it.
The following was seen in the logs:
{"level":"info","ts":1758498122.6741993,"logger":"provisioner.ironic","msg":"current provision state","host":"r740-pool~r740xdg1","lastError":"Node ea1c7224-e159-4201-84b0-c28cad19046b failed step {'args': {'settings': [{'component': 'bmc', 'url': 'http://hv14.telco5gran.eng.rdu2.redhat.com:8888/firmware/r740/iDRAC-with-Lifecycle-Controller_Firmware_XTFXJ_WN64_7.00.00.173_A00.EXE'}]}, 'interface': 'firmware', 'step': 'update', 'abortable': False, 'priority': 0}: Redfish exception occurred. Error: HTTP POST https://10.6.36.10/redfish/v1/SessionService/Sessions returned code 401. Base.1.12.GeneralError: Unable to complete the operation because an invalid username and/or password is entered, and therefore authentication failed. Extended information: [{'Message': 'Unable to complete the operation because an invalid username and/or password is entered, and therefore authentication failed.', 'MessageArgs': [], 'MessageArgs@odata.count': 0, 'MessageId': 'IDRAC.2.8.SYS415', 'RelatedProperties': [], 'RelatedProperties@odata.count': 0, 'Resolution': 'Enter valid user name and password and retry the operation.', 'Severity': 'Warning'}]","current":"manageable","target":""}
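For triage, the relevant fields can be pulled out of a log line like the one above with a few lines of Python. This is only a sketch: the function name `summarize_ironic_error` is made up for illustration, the regular expression is based solely on the wording seen in this one log entry, and the sample below abbreviates the `lastError` text.

```python
import json
import re


def summarize_ironic_error(log_line: str) -> dict:
    """Extract host, provision state, and the failing Redfish request from one log line."""
    record = json.loads(log_line)
    last_error = record.get("lastError", "")
    # Assumed pattern, taken from the wording "HTTP POST <url> returned code 401".
    match = re.search(r"HTTP (\w+) (\S+) returned code (\d{3})", last_error)
    return {
        "host": record.get("host"),
        "current": record.get("current"),
        "http_method": match.group(1) if match else None,
        "url": match.group(2) if match else None,
        "http_status": int(match.group(3)) if match else None,
    }


# Abbreviated copy of the log entry above; only lastError is shortened.
sample = json.dumps({
    "level": "info",
    "logger": "provisioner.ironic",
    "msg": "current provision state",
    "host": "r740-pool~r740xdg1",
    "lastError": (
        "Redfish exception occurred. Error: HTTP POST "
        "https://10.6.36.10/redfish/v1/SessionService/Sessions returned code 401."
    ),
    "current": "manageable",
    "target": "",
})

summary = summarize_ironic_error(sample)
print(summary)
```

The summary makes it easier to see that the failure is a 401 on Redfish session creation while the node sits in `manageable`, rather than a failure of the firmware update step itself.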
The credentials are evidently valid, as we would not have gotten this far with invalid ones.
BMC logs:
2025-09-22 00:06:51 LOG007 The previous log entry was repeated 1 times.
2025-09-21 23:41:18 RAC0720 Unable to locate the ISO or IMG image file or folder in the network share location because the file or folder path or the user credentials entered are incorrect.
2025-09-21 23:41:17 USR0030 Successfully logged in using telemetry, from 127.0.0.1 and REDFISH.
2025-09-21 23:41:15 RAC0717 Remote share unmounted successfully.
2025-09-21 23:40:56 USR0031 Unable to log in for NULL from 10.22.88.128 using eHTML5 Virtual Console.
2025-09-21 23:40:56 LOG007 The previous log entry was repeated 1 times.
2025-09-21 23:40:49 USR0030 Successfully logged in using root, from 10.8.53.44 and REDFISH.
2025-09-21 23:40:37 DIS002 Auto Discovery feature disabled.
2025-09-21 23:40:37 RAC0182 The iDRAC firmware was rebooted with the following reason: user initiated.
2025-09-21 23:40:27 IPA0100 The iDRAC IP Address changed from :: to 2620:52:9:1624:f602:70ff:fee4:f7f4.
2025-09-21 23:40:27 IPA0100 The iDRAC IP Address changed from 0.0.0.0 to 10.6.36.10.
2025-09-21 23:40:07 PR36 Version change detected for Lifecycle Controller firmware. Previous version:7.00.00.181, Current version:7.00.00.173
2025-09-21 23:40:02 THRM0008 The UNC Warning threshold limit of the server board inlet temperature sensor is changed to 38.
2025-09-21 23:39:58 PSU0800 Power Supply 2: Status = 0x1, IOUT = 0x0, VOUT= 0x0, TEMP= 0x0, FAN = 0x0, INPUT= 0x0.
2025-09-21 23:39:58 PSU0800 Power Supply 1: Status = 0x1, IOUT = 0x0, VOUT= 0x0, TEMP= 0x0, FAN = 0x0, INPUT= 0x0.
2025-09-21 23:37:38 USR0032 The session for root from 10.8.53.44 using REDFISH is logged off.
2025-09-21 23:37:36 SYS1001 System is turning off.
2025-09-21 23:37:36 SYS1003 System CPU Resetting.
2025-09-21 23:37:29 JCP037 The (installation or configuration) job JID_585155450820 is successfully completed.
2025-09-21 23:37:29 RED063 The iDRAC firmware updated successfully. Previous version: 7.00.00.181, Current version: 7.00.00.173
2025-09-21 23:37:29 RAC0704 Requested system powerdown.
2025-09-21 23:37:26 SUP1906 Firmware update successful.
2025-09-21 23:36:34 SUP1905 Firmware update programming flash.
2025-09-21 23:36:18 SUP1903 Firmware update verify image headers.
2025-09-21 23:36:18 SUP1904 Firmware update checksumming image.
2025-09-21 23:36:17 SUP1911 Firmware update initialization complete.
2025-09-21 23:36:17 SUP1901 Firmware update initializing.
2025-09-21 23:34:34 USR0032 The session for root from 10.8.53.44 using REDFISH is logged off.
2025-09-21 23:33:23 RED002 Package successfully downloaded.
2025-09-21 23:32:46 RED111 Successfully downloaded the update package details 228.097 MB in 16.9493 secs at 13.4576 MBps (107.661 Mbps) [iDRAC-with-Lifecycle-Controller_Firmware_XTFXJ_WN64_7.00.00.173_A00.EXE].
2025-09-21 23:32:29 RED110 Downloading the iDRAC-with-Lifecycle-Controller_Firmware_XTFXJ_WN64_7.00.00.173_A00.EXE update package.
2025-09-21 23:32:25 JCP027 The (installation or configuration) job JID_585155450820 is successfully created on iDRAC.
2025-09-21 23:29:01 CTL129 The boot media of the Controller RAID Controller in Slot 6 is Disk.Virtual.0:RAID.Slot.6-1.
2025-09-21 23:26:06 SYS1003 System CPU Resetting.
2025-09-21 23:25:43 SYS1000 System is turning on.
2025-09-21 23:25:42 RAC0701 Requested system powerup.
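The BMC log reports the iDRAC at 7.00.00.173 after the update, while the HostFirmwareComponents status shown earlier still lists `currentVersion: 7.00.00.181` for the `bmc` component (its `lastUpdated` timestamp predates the update). A small sketch of how the two could be compared from collected logs follows. The helper name `bmc_version_after_update` is hypothetical, and the regular expression assumes the RED063 message wording seen in this log.

```python
import re

# Assumed message format, copied from the RED063 entry in the BMC log above.
RED063 = re.compile(
    r"RED063 The iDRAC firmware updated successfully\. "
    r"Previous version: (?P<previous>[\d.]+), Current version: (?P<current>[\d.]+)"
)


def bmc_version_after_update(bmc_log: str):
    """Return the version reported by the first RED063 entry found, or None."""
    for line in bmc_log.splitlines():
        match = RED063.search(line)
        if match:
            return match.group("current")
    return None


# Two entries copied from the BMC log above.
bmc_log = (
    "2025-09-21 23:37:29 JCP037 The (installation or configuration) job "
    "JID_585155450820 is successfully completed.\n"
    "2025-09-21 23:37:29 RED063 The iDRAC firmware updated successfully. "
    "Previous version: 7.00.00.181, Current version: 7.00.00.173\n"
)

status_current_version = "7.00.00.181"  # bmc currentVersion from the HFC status
reported = bmc_version_after_update(bmc_log)
status_is_stale = reported is not None and reported != status_current_version
print(reported, status_is_stale)
```

A mismatch here only indicates that the status was not refreshed after the flash; it does not by itself explain the 401 from the Redfish session endpoint.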
Must Gather: https://drive.google.com/file/d/1TlEg50A2NqIU8gYCBtwxg_MT1P6Jb8iu/view?usp=drive_link
If this has the same root cause as OCPBUGS-62009, feel free to mark it as a duplicate.
Version-Release number of selected component (if applicable):
4.20-rc.2
How reproducible:
I have not yet had a BMC upgrade complete on rc2.
Steps to Reproduce:
- ...