-
Bug
-
Resolution: Done-Errata
-
Normal
-
rhos-18.0 FR 2 (Mar 2025)
-
None
-
3
-
False
-
-
False
-
?
-
openstack-ironic-21.4.5-18.0.20250519144814.9213ccd.el9ost
-
Impediment
-
rhos-ops-day1day2-hardprov
-
None
-
-
Bug Fix
-
Done
-
-
-
-
HardProv Sprint 4, HardProv Sprint 6, HardProv Sprint 7, HardProv Sprint 8
-
4
-
Important
While provisioning BareMetalHosts, we occasionally hit situations where, after the "OpenStackDataPlaneNodeSet" resource is created, the BMH boots into RHCOS and successfully sends health checks to the OpenStack control plane.
Nevertheless, the metal3 operator keeps waiting indefinitely for the BMH to finish provisioning.
NAME                             STATE          CONSUMER            ONLINE   ERROR   AGE
baremetalhost.metal3.io/srv12d   provisioning   dataplane-nodeset   true             3h16m

NAME                                                              STATUS   MESSAGE
openstackbaremetalset.baremetal.openstack.org/dataplane-nodeset   False    OpenStackBaremetalSet BMH provisioning in progress

NAME                                                                  STATUS   MESSAGE
openstackdataplanenodeset.dataplane.openstack.org/dataplane-nodeset   False    Setup started
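For triage, a check along these lines could flag hosts that look stuck in this state. It is only a sketch under stated assumptions, not part of any fix: the rows are (name, state, age) tuples copied by hand from listings like the ones above, the one-hour threshold is an arbitrary illustration, and AGE is the age of the resource, so it is only a rough proxy for time spent in "provisioning". The helper names parse_age and stuck_hosts are hypothetical.

```python
import re
from datetime import timedelta

# Seconds per unit for the compact age strings seen in the listing (e.g. "3h16m").
# Only d/h/m/s are handled here; anything else is rejected.
_UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_age(age: str) -> timedelta:
    """Convert an age string such as "3h16m" into a timedelta."""
    parts = re.findall(r"(\d+)([dhms])", age)
    # Reject strings that contain anything besides number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognized age string: {age!r}")
    return timedelta(seconds=sum(int(n) * _UNIT_SECONDS[u] for n, u in parts))


def stuck_hosts(rows, threshold=timedelta(hours=1)):
    """Return names of hosts still in "provisioning" whose age exceeds the threshold.

    rows: iterable of (name, state, age) tuples.
    threshold: hypothetical cutoff; tune to the expected provisioning time.
    """
    return [
        name
        for name, state, age in rows
        if state == "provisioning" and parse_age(age) > threshold
    ]


if __name__ == "__main__":
    rows = [("srv12d", "provisioning", "3h16m")]
    print(stuck_hosts(rows))
```

A host flagged this way still needs to be cross-checked against the control plane: in the incidents described here the node was already sending health checks while the BMH remained in "provisioning".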
jkreger@redhat.com has done some initial debugging and found the following:
It appears that we're hitting a weird edge case that will require us to revisit the logic deep inside that interaction. At a high level, what appears to be happening is that we get derailed at the worst possible place because something breaks connectivity. Why, I have no clue, but I suspect it could be a race condition or competing networking on the ramdisk.
More context: https://redhat-internal.slack.com/archives/C04HGQ5N51N/p1743084026742799
Two must-gather archives (of separate incidents) are attached to this ticket.
Links to: RHBA-2025:152056 Release of components for RHOSO 18.0