OpenShift Bugs / OCPBUGS-70316

[IBMCloud] Bootstrap failed to complete with some install configuration


      Description of problem:

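      Bootstrap fails to complete during an IPI install on IBM Cloud with some install configurations (default VM type; affected regions and a sample install-config are listed under Additional info).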

      Version-Release number of selected component (if applicable):

       4.21.0-0.nightly-2026-01-02-092922   

      How reproducible:

       Always (with the same configuration)   

      Steps to Reproduce:

          1. Run an IPI install with the default VM type in region us-east.
          

      Actual results:

      The install fails. From .openshift_install.log:
      time="2026-01-05T06:48:21Z" level=debug msg="Still waiting for the Kubernetes API: Get \"https://api.maxu-49512.ibmcloud.qe.devcluster.openshift.com:6443/version\": dial tcp: lookup api.maxu-49512.ibmcloud.qe.devcluster.openshift.com on 172.31.0.10:53: no such host"
      time="2026-01-05T06:48:36Z" level=debug msg="I0105 06:48:36.302769    5272 cluster_accessor.go:252] \"Connecting\" controller=\"clustercache\" controllerGroup=\"cluster.x-k8s.io\" controllerKind=\"Cluster\" Cluster=\"openshift-cluster-api-guests/maxu-49512-2w4l5\" namespace=\"openshift-cluster-api-guests\" name=\"maxu-49512-2w4l5\" reconcileID=\"3669809d-37fd-4ad0-80c6-4a28254ef533\""
      time="2026-01-05T06:48:36Z" level=debug msg="E0105 06:48:36.309125    5272 cluster_accessor.go:262] \"Connect failed\" err=\"error creating HTTP client and mapper: cluster is not reachable: Get \\\"https://api.maxu-49512.ibmcloud.qe.devcluster.openshift.com:6443/?timeout=5s\\\": dial tcp: lookup api.maxu-49512.ibmcloud.qe.devcluster.openshift.com on 172.31.0.10:53: no such host\" controller=\"clustercache\" controllerGroup=\"cluster.x-k8s.io\" controllerKind=\"Cluster\" Cluster=\"openshift-cluster-api-guests/maxu-49512-2w4l5\" namespace=\"openshift-cluster-api-guests\" name=\"maxu-49512-2w4l5\" reconcileID=\"3669809d-37fd-4ad0-80c6-4a28254ef533\""
      time="2026-01-05T06:48:51Z" level=error msg="Attempted to gather ClusterOperator status after installation failure: listing ClusterOperator objects: Get \"https://api.maxu-49512.ibmcloud.qe.devcluster.openshift.com:6443/apis/config.openshift.io/v1/clusteroperators\": dial tcp: lookup api.maxu-49512.ibmcloud.qe.devcluster.openshift.com on 172.31.0.10:53: no such host"
      time="2026-01-05T06:48:51Z" level=error msg="Bootstrap failed to complete: Get \"https://api.maxu-49512.ibmcloud.qe.devcluster.openshift.com:6443/version\": dial tcp: lookup api.maxu-49512.ibmcloud.qe.devcluster.openshift.com on 172.31.0.10:53: no such host"
      time="2026-01-05T06:48:51Z" level=error msg="Failed waiting for Kubernetes API. This error usually happens when there is a problem on the bootstrap host that prevents creating a temporary control plane."
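
For triage, the repeated "no such host" errors above can be pulled out of the installer log programmatically. The following is a minimal sketch, assuming the log keeps the `time="..." level=... msg="..."` layout shown in this report; the function names `parse_line` and `dns_failures` are hypothetical helpers, not part of the installer.

```python
import re

# Line layout as seen in .openshift_install.log in this report:
#   time="..." level=... msg="..."
# The msg group tolerates backslash-escaped quotes inside the message.
LINE_RE = re.compile(
    r'time="(?P<time>[^"]+)"\s+level=(?P<level>\w+)\s+'
    r'msg="(?P<msg>(?:[^"\\]|\\.)*)"'
)


def parse_line(line):
    """Return (time, level, msg) for a log line, or None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    # Undo one level of backslash escaping (\" -> ", \\ -> \).
    msg = re.sub(r'\\(.)', r'\1', m.group("msg"))
    return m.group("time"), m.group("level"), msg


def dns_failures(lines):
    """Collect hostnames whose lookup failed with "no such host"."""
    hosts = set()
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue
        hosts.update(
            re.findall(r'lookup (\S+) on \S+: no such host', parsed[2])
        )
    return hosts


if __name__ == "__main__":
    with open(".openshift_install.log") as fh:
        for host in sorted(dns_failures(fh)):
            print("unresolved:", host)
```

Run against the excerpt above, this should report only the API hostname, which points at a missing or not-yet-propagated DNS record rather than at a refused connection.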
      
      Checked on the bootstrap VM:
      $ ssh -i ~/openshift-qe.pem core@150.239.115.94
      This is the bootstrap node; it will be destroyed when the master is fully up.
      
      The primary services are release-image.service followed by bootkube.service and node-image-pull.service. To watch their status, run e.g.
      
        journalctl -b -f -u release-image.service -u bootkube.service -u node-image-pull.service
      Last login: Mon Jan  5 08:48:33 2026 from 66.187.232.129
      [systemd]
      Failed Units: 1
        node-image-finish.service
      
      # systemctl status node-image-finish.service
      × node-image-finish.service - Node Image Finish
           Loaded: loaded (/etc/systemd/system/node-image-finish.service; static)
           Active: failed (Result: signal) since Mon 2026-01-05 08:02:04 UTC; 1h 19min ago
          Process: 3069 ExecStart=/usr/bin/echo Node image overlay complete; switching back to multi-user.target (code=exited, status=0/SUCCESS)
          Process: 3070 ExecStart=/usr/bin/systemctl --no-block isolate multi-user.target (code=killed, signal=TERM)
         Main PID: 3070 (code=killed, signal=TERM)
              CPU: 11ms
      
      Jan 05 08:02:03 maxu-cxf-mxf-mon-2-s8td5-bootstrap systemd[1]: Starting Node Image Finish...
      Jan 05 08:02:03 maxu-cxf-mxf-mon-2-s8td5-bootstrap echo[3069]: Node image overlay complete; switching back to multi-user.target
      Jan 05 08:02:04 maxu-cxf-mxf-mon-2-s8td5-bootstrap systemd[1]: node-image-finish.service: Main process exited, code=killed, status=15/TERM
      Jan 05 08:02:04 maxu-cxf-mxf-mon-2-s8td5-bootstrap systemd[1]: node-image-finish.service: Failed with result 'signal'.
      Jan 05 08:02:04 maxu-cxf-mxf-mon-2-s8td5-bootstrap systemd[1]: Stopped Node Image Finish.
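
When comparing several failed bootstrap hosts, it can help to reduce the `systemctl status` output above to the unit state and the signal that killed the process. This is a rough sketch that assumes the text layout shown in this report; `unit_failure` and `killed_processes` are hypothetical helper names.

```python
import re


def unit_failure(status_text):
    """Return (active_state, result) from `systemctl status` text, or None.

    For the output in this report that is ("failed", "signal").
    """
    m = re.search(
        r'^\s*Active:\s+(\w+)\s+\(Result:\s+([\w-]+)\)',
        status_text,
        re.MULTILINE,
    )
    return (m.group(1), m.group(2)) if m else None


def killed_processes(status_text):
    """Return (pid, signal) pairs for "Process:" lines reported as killed."""
    pairs = re.findall(
        r'Process:\s+(\d+)\s.*\(code=killed, signal=(\w+)\)',
        status_text,
    )
    return [(int(pid), sig) for pid, sig in pairs]
```

For the status shown above, the only killed process is PID 3070, the `systemctl --no-block isolate multi-user.target` invocation, which received TERM.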

      Expected results:

          The install succeeds and the bootstrap VM is destroyed.

      Additional info:

      1. The cluster install succeeds; "oc get co" passes.
      2. Regions us-east, us-south, ca-tor, ca-mon, and eu-de all have the same issue.
      
      A test on ca-mon shows the same issue, with this install-config:
      apiVersion: v1
      controlPlane:
        architecture: amd64
        hyperthreading: Enabled
        name: master
        platform:
          ibmcloud:
            type: bxf-8x32
        replicas: 3
      compute:
      - architecture: amd64
        hyperthreading: Enabled
        name: worker
        platform:
          ibmcloud:
            type: bxf-4x16
        replicas: 3
      metadata:
        name: maxu-ca-mon
      platform:
        ibmcloud:
          region: ca-mon
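
To reproduce across the affected regions, the install-config excerpt above can be rendered once per region. This is only a sketch: the template mirrors the excerpt in this report and is not a complete install-config (fields not shown in the report are not filled in), and naming clusters `maxu-<region>` is an assumption based on the `maxu-ca-mon` example.

```python
# Regions the report lists as affected.
REGIONS = ["us-east", "us-south", "ca-tor", "ca-mon", "eu-de"]

# Mirrors the install-config excerpt from the report; only the
# cluster name and region vary.
TEMPLATE = """\
apiVersion: v1
controlPlane:
  architecture: amd64
  hyperthreading: Enabled
  name: master
  platform:
    ibmcloud:
      type: bxf-8x32
  replicas: 3
compute:
- architecture: amd64
  hyperthreading: Enabled
  name: worker
  platform:
    ibmcloud:
      type: bxf-4x16
  replicas: 3
metadata:
  name: {name}
platform:
  ibmcloud:
    region: {region}
"""


def render(region, prefix="maxu"):
    """Render the excerpt for one region (prefix is a hypothetical naming choice)."""
    return TEMPLATE.format(name=f"{prefix}-{region}", region=region)


if __name__ == "__main__":
    for region in REGIONS:
        print(f"--- {region} ---")
        print(render(region))
```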
      

              zszepesi Zoltan Szepesi
              maxu@redhat.com May Xu
              Angela Fogarolli, Dennis Gilmore, Zoltan Szepesi
              None
              May Xu May Xu
              None