Red Hat OpenStack Services on OpenShift / OSPRH-8832

Incorrect provisioning endpoint in BMH when Redfish is used


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Severity: Important

      When the Redfish protocol (e.g. redfish://<host>/redfish/v1/Systems/System.Embedded.1) is used to provision the BMH, the provisioning endpoint created by default uses the control plane IP instead of the provisioning IP. This causes provisioning to fail with a timeout, and the host stays in the provisioning state indefinitely:

      [zuul@controller-0 ~]$ oc -n openshift-machine-api describe bmh compute-0
      ...
      Events:
        Type    Reason               Age   From                         Message
        ----    ------               ----  ----                         -------
        Normal  ProvisioningStarted  22m   metal3-baremetal-controller  Image provisioning started for http://192.168.201.11:6190/edpm-hardened-uefi.qcow2
        Normal  ProvisioningError    117s  metal3-baremetal-controller  Image provisioning failed: Deploy step deploy.write_image failed on node 28c0630a-97a4-4cfb-9ca6-0b18196d571d. HTTPConnectionPool(host='192.168.201.11', port=6190): Max retries exceeded with url: /edpm-hardened-uefi.qcow2.sha256 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7efd31824760>, 'Connection to 192.168.201.11 timed out. (connect timeout=60)'))
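The timeout above is a plain TCP connect failure against the image endpoint. A minimal sketch of a reachability check follows, assuming it is run from a machine attached to the same network as the node being provisioned. The helper name `can_connect` and the 5-second default timeout are illustrative and not part of metal3 or Ironic.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration only. It checks reachability of the
    image server; it does not download or validate the image.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable network, and connect timeouts.
        return False


# Example usage (addresses and port taken from the events and logs in this report):
#   can_connect("192.168.201.11", 6190)   # control plane IP reported in the error
#   can_connect("172.23.0.3", 6190)       # IP that appears in the working image URL
```

If the first call fails and the second succeeds from the node's network, that is consistent with the endpoint having been built from the wrong IP.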

      As a workaround, delete the metal3 deployments, including metal3-baremetal-operator (oc -n openshift-machine-api delete deployment.apps/metal3 deployment.apps/metal3-baremetal-operator deployment.apps/metal3-image-customization). After the pods are recreated, the operator logs show the image URLs pointing to 172.23.0.3:

      [zuul@controller-0 ~]$ oc -n openshift-machine-api logs pod/metal3-baremetal-operator-56f9b66cd4-zltfp |grep qcow2
      {"level":"info","ts":1721634342.795684,"logger":"provisioner.ironic","msg":"adding option data","host":"openshift-machine-api~compute-1","option":"image_source","section":"instance_info","value":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2"}
      {"level":"info","ts":1721634342.7957306,"logger":"provisioner.ironic","msg":"adding option data","host":"openshift-machine-api~compute-1","option":"image_os_hash_value","section":"instance_info","value":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2.sha256"}
      {"level":"info","ts":1721634343.0262623,"logger":"provisioner.ironic","msg":"adding option data","host":"openshift-machine-api~compute-0","option":"image_os_hash_value","section":"instance_info","value":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2.sha256"}
      {"level":"info","ts":1721634343.0268526,"logger":"provisioner.ironic","msg":"adding option data","host":"openshift-machine-api~compute-0","option":"image_source","section":"instance_info","value":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2"}
      {"level":"info","ts":1721634343.1307952,"logger":"provisioner.ironic","msg":"checking image settings","host":"openshift-machine-api~compute-1","source":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2","checksumType":"sha256","checksum":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2.sha256","same":true,"provisionState":"manageable","iinfo":{"capabilities":{},"image_os_hash_algo":"sha256","image_os_hash_value":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2.sha256","image_source":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2"}}
      {"level":"info","ts":1721634343.2327936,"logger":"provisioner.ironic","msg":"checking image settings","host":"openshift-machine-api~compute-0","source":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2","checksumType":"sha256","checksum":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2.sha256","same":true,"provisionState":"manageable","iinfo":{"capabilities":{},"image_os_hash_algo":"sha256","image_os_hash_value":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2.sha256","image_source":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2"}}
      {"level":"info","ts":1721634353.344979,"logger":"provisioner.ironic","msg":"checking image settings","host":"openshift-machine-api~compute-1","source":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2","checksumType":"sha256","checksum":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2.sha256","same":true,"provisionState":"cleaning","iinfo":{"capabilities":{},"image_os_hash_algo":"sha256","image_os_hash_value":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2.sha256","image_source":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2"}}
      {"level":"info","ts":1721634353.5102594,"logger":"provisioner.ironic","msg":"checking image settings","host":"openshift-machine-api~compute-0","source":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2","checksumType":"sha256","checksum":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2.sha256","same":true,"provisionState":"cleaning","iinfo":{"capabilities":{},"image_os_hash_algo":"sha256","image_os_hash_value":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2.sha256","image_source":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2"}}
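The log lines above are JSON, so the image endpoint per host can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the log format shown in this report. The function names `image_sources` and `wrong_endpoints` are hypothetical, and the sample lines are abbreviated copies of the entries above.

```python
import json
from urllib.parse import urlsplit

# Two entries copied from the operator log in this report (the "ts" field is omitted).
SAMPLE_LINES = [
    '{"level":"info","logger":"provisioner.ironic","msg":"adding option data",'
    '"host":"openshift-machine-api~compute-1","option":"image_source",'
    '"section":"instance_info","value":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2"}',
    '{"level":"info","logger":"provisioner.ironic","msg":"adding option data",'
    '"host":"openshift-machine-api~compute-0","option":"image_source",'
    '"section":"instance_info","value":"http://172.23.0.3:6190/edpm-hardened-uefi.qcow2"}',
]


def image_sources(log_lines):
    """Map each BMH host to the image_source URL found in 'adding option data' entries."""
    sources = {}
    for line in log_lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip non-JSON lines such as shell prompts
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed entries
        if entry.get("msg") == "adding option data" and entry.get("option") == "image_source":
            sources[entry["host"]] = entry["value"]
    return sources


def wrong_endpoints(sources, expected_ip):
    """Return the hosts whose image URL does not use the expected provisioning IP."""
    return {
        host: url
        for host, url in sources.items()
        if urlsplit(url).hostname != expected_ip
    }


if __name__ == "__main__":
    found = image_sources(SAMPLE_LINES)
    print(found)
    print(wrong_endpoints(found, "172.23.0.3"))
```

To use it against a live cluster, the output of the `oc ... logs` command above could be piped in and read from `sys.stdin` in place of `SAMPLE_LINES`.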

      The bare metal hosts are then provisioned successfully:

      [zuul@controller-0 ~]$ oc -n openshift-machine-api describe bmh compute-0
      ...
      Events:
        Type    Reason                Age   From                         Message
        ----    ------                ----  ----                         -------
        Normal  ProvisioningStarted   52m   metal3-baremetal-controller  Image provisioning started for http://192.168.201.11:6190/edpm-hardened-uefi.qcow2
        Normal  ProvisioningError     31m   metal3-baremetal-controller  Image provisioning failed: Deploy step deploy.write_image failed on node 28c0630a-97a4-4cfb-9ca6-0b18196d571d. HTTPConnectionPool(host='192.168.201.11', port=6190): Max retries exceeded with url: /edpm-hardened-uefi.qcow2.sha256 (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x7efd31824760>, 'Connection to 192.168.201.11 timed out. (connect timeout=60)'))
        Normal  Registered            29m   metal3-baremetal-controller  Registered new host
        Normal  ProvisioningStarted   18m   metal3-baremetal-controller  Image provisioning started for http://172.23.0.3:6190/edpm-hardened-uefi.qcow2
        Normal  ProvisioningComplete  16m   metal3-baremetal-controller  Image provisioning completed for http://172.23.0.3:6190/edpm-hardened-uefi.qcow2
      
      [zuul@controller-0 ~]$ oc get bmh -A
      NAMESPACE               NAME                 STATE         CONSUMER             ONLINE   ERROR                AGE
      openshift-machine-api   compute-0            provisioned   openstack-edpm       true                          23h
      openshift-machine-api   compute-1            provisioned   openstack-edpm       true                          23h
      openshift-machine-api   openshift-master-0   provisioned   ocp-vxx76-master-0   true     registration error   2d
      openshift-machine-api   openshift-master-1   provisioned   ocp-vxx76-master-1   true     registration error   2d
      openshift-machine-api   openshift-master-2   provisioned   ocp-vxx76-master-2   true     registration error   2d

       

            Unassigned Unassigned
            rdiazcam@redhat.com Ricardo Diaz Campos
            rhos-dfg-df