Bug
Resolution: Unresolved
Major
4.20.z
Moderate
The customer has been deploying OCP on Power using UPI and network boot for a long time and is now trying to deploy 4.20.8 specifically. Until recently 4.20 worked fine, but since 4.20.8 the deployment fails every time.
We can see that 4.20.6 still deployed fine. We do not see any deployment of 4.20.7.
The customer tried about 8 times.
~~~
cmd:
- /root/install/openshift-install
- wait-for
- bootstrap-complete
- --dir
- /root/install
- --log-level
- debug
delta: '0:20:11.243963'
end: '2025-12-16 18:38:35.347608'
msg: non-zero return code
rc: 5
start: '2025-12-16 18:18:24.103645'
stderr: |-
level=debug msg=OpenShift Installer 4.20.8
level=debug msg=Built from commit cc82f30cd640577297f66b5df80f0e08c55fd3fa
level=info msg=Waiting up to 20m0s (until 6:38PM EST) for the Kubernetes API at https://api.p1313.cecc.ihost.com:6443...
level=debug msg=Loading Agent Config...
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=error msg=Attempted to gather ClusterOperator status after wait failure: listing ClusterOperator objects: Get "https://api.p1313.cecc.ihost.com:6443/apis/config.openshift.io/v1/clusteroperators": EOF
level=info msg=Use the following commands to gather logs from the cluster
level=info msg=openshift-install gather bootstrap --help
level=error msg=Bootstrap failed to complete: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=error msg=Failed waiting for Kubernetes API. This error usually happens when there is a problem on the bootstrap host that prevents creating a temporary control plane.
stderr_lines:
stdout: ''
~~~
On the OCP node we can see the following:
~~~
[root@p1390-master ~]# journalctl -b -f -u node-image-pull.service
Dec 18 15:33:31 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[1999]: Failed to fetch release image; retrying...
Dec 18 15:33:42 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[4307]: layers already present: 45; layers needed: 8 (373.6 MB)
Dec 18 15:33:42 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[4307]: error: Importing: Unencapsulating base: Layer sha256:a1a95042c79ebdea459dd626fe0cd7b99e309c81e483be09e307a4714a08cd1e: Importing objects: Importing object 0b/c9515b64a0c9f95a12f3c1ee2fe41eb72e78e55e1261fc344ef03e9f32e80e.file: Processing content object 0bc9515b64a0c9f95a12f3c1ee2fe41eb72e78e55e1261fc344ef03e9f32e80e: Importing regfile small: Writing content object: min-free-space-percent '3%' would be exceeded, at least 65.5 kB requested
Dec 18 15:33:42 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[1999]: Failed to fetch release image; retrying...
Dec 18 15:33:53 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[4354]: layers already present: 45; layers needed: 8 (373.6 MB)
Dec 18 15:33:53 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[4354]: error: Importing: Unencapsulating base: Layer sha256:a1a95042c79ebdea459dd626fe0cd7b99e309c81e483be09e307a4714a08cd1e: Importing objects: Importing object 0b/c9515b64a0c9f95a12f3c1ee2fe41eb72e78e55e1261fc344ef03e9f32e80e.file: Processing content object 0bc9515b64a0c9f95a12f3c1ee2fe41eb72e78e55e1261fc344ef03e9f32e80e: Importing regfile small: Writing content object: min-free-space-percent '3%' would be exceeded, at least 65.5 kB requested
~~~
We further noticed that the partitions created by the installer are too small to handle the image pull:
~~~
[root@p1390-master ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
devtmpfs        4.0M     0  4.0M   0% /dev
tmpfs            64G  128K   64G   1% /dev/shm
tmpfs            26G   52M   26G   1% /run
tmpfs            64G  697M   64G   2% /run/ephemeral_base
/dev/loop0       64G  1.1G   63G   2% /run/ephemeral
/dev/loop1      1.1G  1.1G     0 100% /sysroot                <<<<<<<<<<<<<<<<<<<<<
tmpfs            64G     0   64G   0% /tmp
tmpfs           4.0G  3.9G  123M  98% /var/ostree-container   <<<<<<<<<<<<<<<
tmpfs            13G     0   13G   0% /run/user/1000
~~~
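To spot the nearly full mounts quickly on other nodes, a small filter over `df` output can help. This is only a sketch: the function name `flag_full_filesystems` and the 95% default threshold are assumptions for illustration, not part of the installer or of any OpenShift tooling. It reads `df`-style text on stdin, so it also works on saved output such as the listing above.

```shell
# Hypothetical helper: print mount point and Use% for every filesystem
# at or above a usage threshold (default 95, an arbitrary choice).
# Expects df-style columns: Filesystem Size Used Avail Use% Mounted-on.
flag_full_filesystems() {
    threshold="${1:-95}"
    awk -v t="$threshold" 'NR > 1 {
        use = $5
        sub(/%$/, "", use)              # strip the trailing percent sign
        if (use + 0 >= t + 0) print $6, $5
    }'
}

# On a node (assumes a POSIX df):
#   df -P | flag_full_filesystems 95
```

Run against the listing above, this would report the two mounts the reporter marked, `/sysroot` and `/var/ostree-container`.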
They then tried resizing the filesystem on the bootstrap node:
[root@p1382-master ~]# mount -o remount,size=8G /var/ostree-container
Right after that, the image download succeeded:
~~~
Dec 19 17:47:53 p1382-master.p1382.cecc.ihost.com node-image-pull.sh[7863]: Wrote: ostree-unverified-registry:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ea3cee12018835ed840252172334dd28e01f43b8630831e45c05893d688815a0 => f7f7813f8b2306ff62cfba2eb3de7d95652b55cb842454827e9d90163703ddc4
Dec 19 17:47:54 p1382-master.p1382.cecc.ihost.com node-image-pull.sh[1540]: Checking out node image content
Dec 19 17:47:55 p1382-master.p1382.cecc.ihost.com systemd[1]: Finished Node Image Pull.
Dec 19 17:47:55 p1382-master.p1382.cecc.ihost.com systemd[1]: node-image-pull.service: Deactivated successfully.
Dec 19 17:47:55 p1382-master.p1382.cecc.ihost.com systemd[1]: Stopped Node Image Pull.
Dec 19 17:47:55 p1382-master.p1382.cecc.ihost.com systemd[1]: node-image-pull.service: Consumed 1min 55.366s CPU time.
~~~
and the bootstrap process appears to have started.
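The numbers in the logs are consistent with the 4.0G tmpfs simply being too small: 3% of 4.0G is roughly 123M, which matches the free space `df` reported, while the pull still needed 373.6 MB. A rough sketch of that arithmetic follows. It assumes that ostree keeps `min-free-space-percent` of the total filesystem size free and treats MB and MiB as interchangeable for estimation; the function name `fits_in_tmpfs` is hypothetical.

```shell
# Hypothetical helper: fits_in_tmpfs SIZE_MIB AVAIL_MIB NEEDED_MIB [MIN_FREE_PERCENT]
# Exits 0 if NEEDED_MIB can be written while MIN_FREE_PERCENT of SIZE_MIB
# stays free (assumed reservation rule), and exits 1 otherwise.
fits_in_tmpfs() {
    awk -v size="$1" -v avail="$2" -v need="$3" -v pct="${4:-3}" 'BEGIN {
        reserve = size * pct / 100      # space that must remain free
        usable  = avail - reserve       # space the pull may actually use
        printf "reserve=%.1f usable=%.1f needed=%.1f\n", reserve, usable, need
        exit ((usable >= need) ? 0 : 1)
    }'
}

# Values from the df and journal output in this ticket:
#   fits_in_tmpfs 4096 123 373.6
```

With the original 4.0G size the check fails because almost all remaining space falls inside the reserve. After the remount to 8G the available space grows by about 4G, which leaves ample room for the remaining layers.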
- is caused by: OCPBUGS-66231 ABI vSphere Installation Failing Due to “No Space Left on Device” (Verified)
- relates to: OCPBUGS-62790 ABI vSphere Installation Failing Due to “No Space Left on Device” (Verified)