Bug
Resolution: Unresolved
Major
4.18.z
Quality / Stability / Reliability
Critical
Metal Platform 274, Metal Platform 275, Metal Platform 276
In Progress
Release Note Not Required
Description of problem:
I have the following BMH definition:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
  name: dell-server
  namespace: hardware-inventory
  annotations:
spec:
  automatedCleaningMode: disabled
  bmc:
    disableCertificateVerification: True
    address: idrac-virtualmedia://10.6.75.116/redfish/v1/Systems/System.Embedded.1
    credentialsName: dell-bmc-credentials
  bootMACAddress: B4:83:51:00:B4:88
  online: true
And I apply the following manifest to run the upgrade:
---
apiVersion: metal3.io/v1alpha1
kind: HostUpdatePolicy
metadata:
  name: dell-server
  namespace: hardware-inventory
spec:
  firmwareSettings: onReboot
  firmwareUpdates: onReboot
---
apiVersion: metal3.io/v1alpha1
kind: HostFirmwareComponents
metadata:
  name: dell-server
  namespace: hardware-inventory
spec:
  updates:
    - component: bmc
      url: http://10.6.116.4:9999/iDRAC-with-Lifecycle-Controller_Firmware_R8V2F_LN64_7.20.30.00_A00.BIN
    - component: bios
      url: http://10.6.116.4:9999/BIOS_0HY8N_LN64_1.17.2.BIN
Then I apply the reboot.metal3.io="" annotation to the BMH to trigger the firmware and BIOS upgrade. The upgrade fails, and this is what I see in the BMO logs:
{"level":"info","ts":1750857141.5618482,"logger":"controllers.BareMetalHost","msg":"using PreprovisioningImage","baremetalhost":{"name":"dell-server","namespace":"hardware-inventory"},"provisioningState":"available","Image":{"ImageURL":"http://metal3-image-customization-service.openshift-machine-api.svc.cluster.local/d235e339-76f8-4709-ad13-3517db75f539","KernelURL":"","ExtraKernelParams":"","Format":"iso"}}
{"level":"info","ts":1750857141.5847816,"logger":"provisioner.ironic","msg":"current provision state","host":"hardware-inventory~dell-server","lastError":"Firmware update failed for node c0471e4f-e9c8-4678-9893-77975de4ded2, firmware http://10.6.116.4:9999/iDRAC-with-Lifecycle-Controller_Firmware_R8V2F_LN64_7.20.30.00_A00.BIN. Error: Lifecycle Controller in use. This job will start when Lifecycle Controller is available.","current":"manageable","target":""}
Now, that failure is okay in itself. The issue is that the error is then cleared from the BMH. This is the BMH during the upgrade:
NAME          STATE       CONSUMER   ONLINE   ERROR               AGE
dell-server   preparing              true                         46h
dell-server   preparing              true     preparation error   46h
dell-server   preparing              true     preparation error   46h
dell-server   available              true                         46h
Version-Release number of selected component (if applicable):
OCP 4.18
How reproducible:
Always
Steps to Reproduce:
Described above.
Actual results:
Firmware upgrade fails. Error gets cleared out from the BMH. User cannot see why it failed unless watching the object during the procedure.
Expected results:
Firmware upgrade fails. Error remains in the BMH until the user fixes whatever needs to be fixed.
Additional info:
Slack thread: https://redhat-internal.slack.com/archives/CFP6ST0A3/p1750857275820059
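Possible workaround until the error persists on the BMH: the failure reason can still be recovered after the fact from the BMO logs, since the "current provision state" entries carry a lastError field. The following is only a minimal sketch, assuming the logs are JSON lines shaped like the ones quoted in the description. The function name extract_last_errors is hypothetical and not part of BMO or Ironic.

```python
import json


def extract_last_errors(log_lines):
    """Collect (ts, host, lastError) from JSON log lines with a non-empty lastError.

    Assumes one JSON object per line, as in the BMO log excerpt in this report.
    Lines that are not valid JSON objects are skipped.
    """
    errors = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # not a JSON log line, ignore it
        if not isinstance(entry, dict):
            continue
        last_error = entry.get("lastError")
        if last_error:  # skips entries where lastError is missing or ""
            errors.append((entry.get("ts"), entry.get("host"), last_error))
    return errors
```

Feeding it the saved BMO container logs line by line returns one tuple per logged failure, so the "Lifecycle Controller in use" message above stays retrievable even after the ERROR column on the BMH has been cleared.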