-
Bug
-
Resolution: Cannot Reproduce
-
Normal
-
None
-
None
-
None
-
False
-
-
False
-
?
-
None
-
-
-
Moderate
Deployment on the unprovisioned data-plane gets stalled with error:
2025-01-13T17:45:09Z ERROR Reconciler error {"controller": "openstackbaremetalset", "controllerGroup": "baremetal.openstack.org", "controllerKind": "OpenStackBaremetalSet", "OpenStackBaremetalSet":
{"name":"openstack-data-plane","namespace":"openstack"}, "namespace": "openstack", "name": "openstack-data-plane", "reconcileID": "cf39a251-158c-46df-aa90-3ae1ec113058", "error": "unable to find 3 requested BaremetalHosts with labels [app:openstack] in namespace openshift-machine-api for scale-up (3 in use, 0 available)"}
Even though all 3 nodes are in the process of being deployed. It seems that openstack-baremetal-operator has some sort of check that prevents the nodes to continue provisioning
The issue might have been introduced sometime between 18.0.1 and 18.0.5, since the same nodes had successfully deploy on 18.0.1.
Also this seems to be RHOSO baremetal operator related issue since I am able to deploy the same nodes with just metal3 and BMH CRD:
apiVersion: metal3.io/v1alpha1
kind: BareMetalHost
metadata:
name: sb-blade-c3-10
namespace: openshift-machine-api
labels:
app: openstack
spec:
automatedCleaningMode: metadata
bmc:
address: ipmi://172.20.254.44
credentialsName: lenovo-blades-bmc-secret
disableCertificateVerification: true
bootMACAddress: '00:0E:1E:AA:DE:30'
bootMode: legacy
online: false
image:
checksum: 'http://172.20.129.19/edpm-hardened-uefi.qcow2.sha256'
checksumType: sha256
url: 'http://172.20.129.19/edpm-hardened-uefi.qcow2'
rootDeviceHints:
# serialNumber: "2224E63C00A7"
wwn: "0x500a0751e63c00a7"
preprovisioningNetworkDataName: lenovo-blades-nmstate-bond-ospcompute2-secret
The logs are being uploaded in here -> https://drive.google.com/drive/folders/1r6mcfTE67un6gt27oECcUPDgtOHtdD9H?usp=drive_link