-
Bug
-
Resolution: Unresolved
-
Critical
-
None
-
4.20.z
Description
Attempting to scale in a worker node using the ACM 2.15 documented procedure results in the BareMetalHost (BMH) entering the deleting state and failing to progress. Ironic reports a provisioning/cleaning error:
Failed to prepare node <UUID> for cleaning: Validation of image href file:///templates/uefi_esp.img failed, reason: Specified image file not found.
The node is successfully deleted from the spoke cluster (verified via oc get nodes), but the corresponding BMH on the hub remains in deleting with provisioning error for an extended period before eventually removing only after several failed retries.
This indicates the Ironic cleaning operation expects an EFI template file (uefi_esp.img) that is missing from the container or environment.
Environment / Version
- ACM: 2.15.0
- Platform: Bare metal (Dell servers — hostnames redacted)
Steps to Reproduce
Follow the ACM documentation for worker node scale-in:
https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.15/html/multicluster_engine_operator_with_red_hat_advanced_cluster_management/ibio-intro#scale-in-worker-nodes
Mark a worker node for scale-in.
Observe that the node is removed from the spoke cluster.
Check the BareMetalHost object on the hub cluster — BMH transitions to deleting but remains stuck.
Inspect BMO / Ironic logs and see repeated cleaning failures referencing file:///templates/uefi_esp.img not found.
Actual Results
BMH remains stuck in deleting with provisioning error.
Ironic repeatedly fails cleaning due to missing uefi_esp.img.
Multiple attempts to set maintenance mode and retry, but deletion does not proceed normally.
Example Ironic logs:
Failed to prepare node <UUID> for cleaning:
Validation of image href file:///templates/uefi_esp.img failed,
reason: Specified image file not found.
Expected Results
Worker node should scale in cleanly.
Ironic cleaning should complete without missing template errors.
BMH should move from deleting → removed without manual intervention or long delays.
Business Impact
Delaying customer’s 4.20 rollout due to inability to reliably perform node lifecycle operations.
Additional Information
Node successfully disappears from the spoke cluster’s node list, so the failure is specific to the Ironic/BMH cleanup path.
The missing EFI template suggests a packaging issue, regression in cleaning workflows, or an incorrect assumption about file layout inside the relevant container image.
Full logs available upon request (hostnames redacted for security).
- is cloned by
-
OCPBUGS-67294 ACM 2.15.0: Worker node scale-in fails: BareMetalHost stuck in deleting due to Ironic cleaning error: missing uefi_esp.img
-
- New
-
- is depended on by
-
OCPBUGS-67294 ACM 2.15.0: Worker node scale-in fails: BareMetalHost stuck in deleting due to Ironic cleaning error: missing uefi_esp.img
-
- New
-
- links to