-
Bug
-
Resolution: Cannot Reproduce
-
Normal
-
None
-
None
-
None
-
False
-
-
False
-
?
-
None
-
-
-
Moderate
Deployment on the unprovisioned data-plane gets stalled with error:
2025-01-13T17:45:09Z ERROR Reconciler error {"controller": "openstackbaremetalset", "controllerGroup": "baremetal.openstack.org", "controllerKind": "OpenStackBaremetalSet", "OpenStackBaremetalSet":
{"name":"openstack-data-plane","namespace":"openstack"}, "namespace": "openstack", "name": "openstack-data-plane", "reconcileID": "cf39a251-158c-46df-aa90-3ae1ec113058", "error": "unable to find 3 requested BaremetalHosts with labels [app:openstack] in namespace openshift-machine-api for scale-up (3 in use, 0 available)"}
Even though all 3 nodes are in the process of being deployed. It seems that openstack-baremetal-operator has some sort of check that prevents the nodes to continue provisioning
The issue might have been introduced sometime between 18.0.1 and 18.0.5, since the same nodes had successfully deploy on 18.0.1.
Also this seems to be RHOSO baremetal operator related issue since I am able to deploy the same nodes with just metal3 and BMH CRD:
apiVersion: metal3.io/v1alpha1 kind: BareMetalHost metadata: name: sb-blade-c3-10 namespace: openshift-machine-api labels: app: openstack spec: automatedCleaningMode: metadata bmc: address: ipmi://172.20.254.44 credentialsName: lenovo-blades-bmc-secret disableCertificateVerification: true bootMACAddress: '00:0E:1E:AA:DE:30' bootMode: legacy online: false image: checksum: 'http://172.20.129.19/edpm-hardened-uefi.qcow2.sha256' checksumType: sha256 url: 'http://172.20.129.19/edpm-hardened-uefi.qcow2' rootDeviceHints: # serialNumber: "2224E63C00A7" wwn: "0x500a0751e63c00a7" preprovisioningNetworkDataName: lenovo-blades-nmstate-bond-ospcompute2-secret
The logs are being uploaded in here -> https://drive.google.com/drive/folders/1r6mcfTE67un6gt27oECcUPDgtOHtdD9H?usp=drive_link