-
Bug
-
Resolution: Duplicate
-
Undefined
-
None
-
4.18
-
Quality / Stability / Reliability
-
False
-
-
None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
-
None
Description of problem:
Attempts to run day2 update on an HPE ProLiant DL110 Gen11, provisioned with assisted-service, fails in the final stages with the following error in the BMH:
errorMessage: 'Node 845a8264-7970-414e-a2b2-5b292bedee0d failed step {''args'':
{''settings'': [{''component'': ''bios'', ''url'': ''http://hv14.telco5gran.eng.rdu2.redhat.com:8888/firmware/hpe/BIOS-U62_1.48_10_05_2023.fwpkg''}]},
''interface'': ''firmware'', ''step'': ''update'', ''abortable'': False, ''priority'':
0}: Failed to set node power state to power on.'
errorType: servicing error
The BIOS firmware upgrade itself is successful, but the final reboot does not happen, reporting this error.
2025-05-12 16:50:50.743 1 DEBUG sushy.connector [-] HTTP request: GET https://10.6.36.56/redfish/v1/Systems/1; headers: {'OData-Version': '4.0', 'Accept-Encoding': 'identity'}; body: None; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.9/site-packages/sushy/connector.py:157
2025-05-12 16:50:51.054 1 DEBUG sushy.connector [-] HTTP response for GET https://10.6.36.56/redfish/v1/Systems/1: status code: 200 _op /usr/lib/python3.9/site-packages/sushy/connector.py:306 2025-05-12 16:50:51.055 1 DEBUG sushy.resources.base [-] Received representation of System /redfish/v1/Systems/1: {'_actions': {'reset': {'allowed_values': ['On', 'ForceOff', 'GracefulShutdown', 'ForceRestart', 'Nmi', 'PushPowerButton', 'GracefulRestart'], 'operation_apply_time_support': None, 'target_uri': '/redfish/v1/Systems/1/Actions/ComputerSystem.Reset'}}, '_oem_vendors': ['Hpe'], '_settings': None, 'asset_tag': '', 'bios_version': 'U62 v1.48 (10/05/2023)', 'boot': {'allowed_values': ['None', 'Cd', 'Hdd', 'Usb', 'SDCard', 'Utilities', 'Diags', 'BiosSetup', 'Pxe', 'UefiShell', 'UefiHttp', 'UefiTarget'], 'enabled': <BootSourceOverrideEnabled.ONCE: 'Once'>, 'http_boot_uri': None, 'mode': <BootSourceOverrideMode.UEFI: 'UEFI'>, 'target': <BootSource.CD: 'Cd'>}, 'boot_progress': {'last_boot_seconds_count': 0, 'last_state': <BootProgressStates.PRIMARY_PROCESSOR: 'PrimaryProcessorInitializationStarted'>, 'last_state_updated_at': None, 'oem_last_state': None}, 'description': None, 'hostname': 'cnfdg43', 'identity': '1', 'indicator_led': None, 'links': {'oem_vendors': None}, 'maintenance_window': None, 'manufacturer': 'HPE', 'memory_summary': {'health': 'OK', 'size_gib': 128}, 'name': 'Computer System', 'part_number': None, 'power_state': None, 'serial_number': '2M2D0V03V2', 'sku': 'P54277-B21', 'status': {'health': <Health.OK: 'OK'>, 'health_rollup': <Health.OK: 'OK'>, 'state': <State.STARTING: 'Starting'>}, 'system_type': <SystemType.PHYSICAL: 'Physical'>, 'uuid': '32343550-3737-4D32-3244-305630335632'} refresh /usr/lib/python3.9/site-packages/sushy/resources/base.py:697 2025-05-12 16:50:51.056 1 ERROR oslo.service.loopingcall [-] Dynamic backoff interval looping call 'ironic.conductor.utils.node_wait_for_power_state.<locals>._wait' failed: oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after 161.11 seconds
2025-05-12 16:50:51.056 1 ERROR oslo.service.loopingcall Traceback (most recent call last):
2025-05-12 16:50:51.056 1 ERROR oslo.service.loopingcall File "/usr/lib/python3.9/site-packages/oslo_service/loopingcall.py", line 154, in _run_loop
2025-05-12 16:50:51.056 1 ERROR oslo.service.loopingcall idle = idle_for_func(result, self._elapsed(watch))
2025-05-12 16:50:51.056 1 ERROR oslo.service.loopingcall File "/usr/lib/python3.9/site-packages/oslo_service/loopingcall.py", line 349, in _idle_for
2025-05-12 16:50:51.056 1 ERROR oslo.service.loopingcall raise LoopingCallTimeOut(
2025-05-12 16:50:51.056 1 ERROR oslo.service.loopingcall oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after 161.11 seconds
2025-05-12 16:50:51.056 1 ERROR oslo.service.loopingcall 2025-05-12 16:50:51.058 1 ERROR ironic.conductor.utils [None req-19544a4a-fbd0-46c2-bacf-53a516ca768e - - - - - -] Timed out after 120 secs waiting for power on on node 845a8264-7970-414e-a2b2-5b292bedee0d.: oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after 161.11 seconds 2025-05-12 16:50:51.072 1 ERROR ironic.drivers.utils [None req-19544a4a-fbd0-46c2-bacf-53a516ca768e - - - - - -] Failed to invoke collect_system_logs agent command for node 845a8264-7970-414e-a2b2-5b292bedee0d. Error: Agent driver requires agent_url in driver_internal_info: ironic.common.exception.AgentConnectionFailed: Agent driver requires agent_url in driver_internal_info
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils [None req-19544a4a-fbd0-46c2-bacf-53a516ca768e - - - - - -] Node 845a8264-7970-414e-a2b2-5b292bedee0d failed step {'args': {'settings': [{'component': 'bios', 'url': 'http://hv14.telco5gran.eng.rdu2.redhat.com:8888/firmware/hpe/BIOS-U62_1.48_10_05_2023.fwpkg'}]}, 'interface': 'firmware', 'step': 'update', 'abortable': False, 'priority': 0}: Failed to set node power state to power on.: ironic.common.exception.PowerStateFailure: Failed to set node power state to power on.
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils Traceback (most recent call last):
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/conductor/utils.py", line 184, in node_wait_for_power_state
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils return timer.start(initial_delay=1, timeout=retry_timeout).wait()
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/eventlet/event.py", line 125, in wait
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils result = hub.switch()
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/eventlet/hubs/hub.py", line 313, in switch
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils return self.greenlet.switch()
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/oslo_service/loopingcall.py", line 154, in _run_loop
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils idle = idle_for_func(result, self._elapsed(watch))
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/oslo_service/loopingcall.py", line 349, in _idle_for
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils raise LoopingCallTimeOut(
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after 161.11 seconds
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils During handling of the above exception, another exception occurred:
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils Traceback (most recent call last):
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/conductor/servicing.py", line 151, in do_next_service_step
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils result = interface.execute_service_step(task, step)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/base.py", line 456, in execute_service_step
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils return self._execute_step(task, step)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/base.py", line 319, in _execute_step
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils return getattr(self, step['step'])(task, **args)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic_lib/metrics.py", line 60, in wrapped
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils result = f(*args, **kwargs)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/base.py", line 1887, in wrapped
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils result = func(self, task, *args, **kwargs)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/redfish/firmware.py", line 184, in update
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils return deploy_utils.reboot_to_finish_step(task, timeout=wait_interval)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/deploy_utils.py", line 1661, in reboot_to_finish_step
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils manager_utils.node_power_action(task, states.REBOOT, timeout)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/conductor/task_manager.py", line 153, in wrapper
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils return f(*args, **kwargs)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/conductor/utils.py", line 356, in node_power_action
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils notify_utils.emit_power_set_notification(
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils self.force_reraise()
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils raise self.value
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/conductor/utils.py", line 344, in node_power_action
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils task.driver.power.reboot(task, timeout=timeout)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/conductor/task_manager.py", line 153, in wrapper
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils return f(*args, **kwargs)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/redfish/power.py", line 163, in reboot
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils _set_power_state(task, system, next_state, timeout=timeout)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/redfish/power.py", line 71, in _set_power_state
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils cond_utils.node_wait_for_power_state(task, target_state,
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils File "/usr/lib/python3.9/site-packages/ironic/conductor/utils.py", line 190, in node_wait_for_power_state
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils raise exception.PowerStateFailure(pstate=new_state)
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils ironic.common.exception.PowerStateFailure: Failed to set node power state to power on.
2025-05-12 16:50:51.080 1 ERROR ironic.conductor.utils
2025-05-12 16:50:51.081 1 DEBUG ironic.conductor.utils [None req-19544a4a-fbd0-46c2-bacf-53a516ca768e - - - - - -] Node 845a8264-7970-414e-a2b2-5b292bedee0d is not fast-track-able because it has an error: Failed to change power state to 'power on' by 'rebooting': Failed to set node power state to power on. is_fast_track /usr/lib/python3.9/site-packages/ironic/conductor/utils.py:1221 2025-05-12 16:50:51.081 1 DEBUG ironic.drivers.modules.redfish.boot [None req-19544a4a-fbd0-46c2-bacf-53a516ca768e - - - - - -] Cleaning up deploy boot for 845a8264-7970-414e-a2b2-5b292bedee0d clean_up_ramdisk /usr/lib/python3.9/site-packages/ironic/drivers/modules/redfish/boot.py:813
Version-Release number of selected component (if applicable):
OCP: 4.18.10
MCE: 2.8.1
HPE ILO: iLO 6 v1.68 / Apr 24 2025
How reproducible:
The HPE BIOS firmware upgrade seems to succeed prior to provisioning, but once provisioned and the preprovisioningImage is provided by assisted-service, this failure occurs almost every time.
Steps to Reproduce:
1. Custom build of assisted-service is required with support for day2-config annotation, in order to drop the detached/paused annotations
2. Provision cluster with o-cloud manager / siteconfig / assisted-service
3. Run day2 firmware update procedure
Actual results:
Expected results:
Additional info: