-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
4.19.z, 4.20.z, 4.21.z
-
None
-
False
-
-
None
-
Moderate
-
Yes
-
ppc64le
-
In Progress
-
Bug Fix
-
Previously, the tmpfs volume used to store the ostree image on the bootstrap node was too small to hold the whole image on the ppc64le architecture. The tmpfs size has been increased to support the architecture.
-
None
-
None
-
None
-
None
This is a clone of issue OCPBUGS-70168. The following is the description of the original issue:
—
The customer has been deploying OCP on Power using UPI and network boot for a long time. Until recently 4.20 worked fine, but since 4.20.8 specifically the deployment fails every time.
4.20.6 still deployed fine. We have no record of a 4.20.7 deployment.
The customer tried almost 8 times.
~~~
cmd:
- /root/install/openshift-install
- wait-for
- bootstrap-complete
- --dir
- /root/install
- --log-level
- debug
delta: '0:20:11.243963'
end: '2025-12-16 18:38:35.347608'
msg: non-zero return code
rc: 5
start: '2025-12-16 18:18:24.103645'
stderr: |-
level=debug msg=OpenShift Installer 4.20.8
level=debug msg=Built from commit cc82f30cd640577297f66b5df80f0e08c55fd3fa
level=info msg=Waiting up to 20m0s (until 6:38PM EST) for the Kubernetes API at https://api.p1313.cecc.ihost.com:6443...
level=debug msg=Loading Agent Config...
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=debug msg=Still waiting for the Kubernetes API: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=error msg=Attempted to gather ClusterOperator status after wait failure: listing ClusterOperator objects: Get "https://api.p1313.cecc.ihost.com:6443/apis/config.openshift.io/v1/clusteroperators": EOF
level=info msg=Use the following commands to gather logs from the cluster
level=info msg=openshift-install gather bootstrap --help
level=error msg=Bootstrap failed to complete: Get "https://api.p1313.cecc.ihost.com:6443/version": EOF
level=error msg=Failed waiting for Kubernetes API. This error usually happens when there is a problem on the bootstrap host that prevents creating a temporary control plane.
stderr_lines:
stdout: ''
~~~
The following is seen on the OCP node:
~~~
[root@p1390-master ~]# journalctl -b -f -u node-image-pull.service
Dec 18 15:33:31 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[1999]: Failed to fetch release image; retrying...
Dec 18 15:33:42 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[4307]: layers already present: 45; layers needed: 8 (373.6 MB)
Dec 18 15:33:42 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[4307]: error: Importing: Unencapsulating base: Layer sha256:a1a95042c79ebdea459dd626fe0cd7b99e309c81e483be09e307a4714a08cd1e: Importing objects: Importing object 0b/c9515b64a0c9f95a12f3c1ee2fe41eb72e78e55e1261fc344ef03e9f32e80e.file: Processing content object 0bc9515b64a0c9f95a12f3c1ee2fe41eb72e78e55e1261fc344ef03e9f32e80e: Importing regfile small: Writing content object: min-free-space-percent '3%' would be exceeded, at least 65.5 kB requested
Dec 18 15:33:42 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[1999]: Failed to fetch release image; retrying...
Dec 18 15:33:53 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[4354]: layers already present: 45; layers needed: 8 (373.6 MB)
Dec 18 15:33:53 p1390-master.p1390.cecc.ihost.com node-image-pull.sh[4354]: error: Importing: Unencapsulating base: Layer sha256:a1a95042c79ebdea459dd626fe0cd7b99e309c81e483be09e307a4714a08cd1e: Importing objects: Importing object 0b/c9515b64a0c9f95a12f3c1ee2fe41eb72e78e55e1261fc344ef03e9f32e80e.file: Processing content object 0bc9515b64a0c9f95a12f3c1ee2fe41eb72e78e55e1261fc344ef03e9f32e80e: Importing regfile small: Writing content object: min-free-space-percent '3%' would be exceeded, at least 65.5 kB requested
~~~
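The repeating failure above can be spotted mechanically. The following is a minimal sketch, assuming the journal text is piped in on stdin. The function name `count_ostree_space_errors` is hypothetical; the pattern it matches is the `min-free-space-percent` message from the log above.

```shell
#!/bin/sh
# Hypothetical helper (not part of the installer): count journal lines in
# which the ostree import was rejected by the min-free-space-percent check.
# Reads log text on stdin and prints the number of matching lines.
count_ostree_space_errors() {
    # grep -c prints 0 and exits non-zero when nothing matches;
    # "|| true" keeps the function's exit status at 0 in that case.
    grep -c "min-free-space-percent" || true
}
```

It could be used as `journalctl -b -u node-image-pull.service --no-pager | count_ostree_space_errors`. A non-zero count suggests the pull is failing for lack of space rather than because of a registry or network problem.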
We further noticed that the filesystems created by the installer are too small to hold the image:
~~~
[root@p1390-master ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 4.0M 0 4.0M 0% /dev
tmpfs 64G 128K 64G 1% /dev/shm
tmpfs 26G 52M 26G 1% /run
tmpfs 64G 697M 64G 2% /run/ephemeral_base
/dev/loop0 64G 1.1G 63G 2% /run/ephemeral
/dev/loop1 1.1G 1.1G 0 100% /sysroot <<<<<<<<<<<<<<<<<<<<<
tmpfs 64G 0 64G 0% /tmp
tmpfs 4.0G 3.9G 123M 98% /var/ostree-container <<<<<<<<<<<<<<<
tmpfs 13G 0 13G 0% /run/user/1000
~~~
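The nearly full mounts marked above can also be picked out with a short filter. This is a sketch only, assuming `df`-style output on stdin with the use percentage in column 5 and the mount point in column 6, and no spaces in mount points. The function name `full_mounts` and the default threshold of 95 are hypothetical choices.

```shell
#!/bin/sh
# Hypothetical helper: read df-style output on stdin and print
# "<mount point> <use%>" for every filesystem at or above the given
# threshold (default: 95). The header line is skipped.
full_mounts() {
    awk -v limit="${1:-95}" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)              # strip the trailing percent sign
        if (pct + 0 >= limit + 0) print $6, $5
    }'
}
```

For example, `df -P | full_mounts 95` on the node shown above would be expected to list `/var/ostree-container` and `/sysroot`.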
They then tried resizing the filesystem on the bootstrap node:
[root@p1382-master ~]# mount -o remount,size=8G /var/ostree-container
Right after that, the image download succeeded:
~~~
Dec 19 17:47:53 p1382-master.p1382.cecc.ihost.com node-image-pull.sh[7863]: Wrote: ostree-unverified-registry:quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:ea3cee12018835ed840252172334dd28e01f43b8630831e45c05893d688815a0 => f7f7813f8b2306ff62cfba2eb3de7d95652b55cb842454827e9d90163703ddc4
Dec 19 17:47:54 p1382-master.p1382.cecc.ihost.com node-image-pull.sh[1540]: Checking out node image content
Dec 19 17:47:55 p1382-master.p1382.cecc.ihost.com systemd[1]: Finished Node Image Pull.
Dec 19 17:47:55 p1382-master.p1382.cecc.ihost.com systemd[1]: node-image-pull.service: Deactivated successfully.
Dec 19 17:47:55 p1382-master.p1382.cecc.ihost.com systemd[1]: Stopped Node Image Pull.
Dec 19 17:47:55 p1382-master.p1382.cecc.ihost.com systemd[1]: node-image-pull.service: Consumed 1min 55.366s CPU time.
~~~
The bootstrap process then appears to have started.
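As a rough cross-check of the numbers reported above, the free-space reserve can be computed by hand. This is a sketch under two assumptions that the report does not confirm: that the 4.0G shown by `df -h` means 4096 MiB, and that the `min-free-space-percent` reserve is a plain percentage of the filesystem size. The function name `min_free_reserve_mib` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: MiB that must stay free on a filesystem of
# <size_mib> MiB under a min-free-space-percent setting of <pct>.
min_free_reserve_mib() {
    awk -v size="$1" -v pct="$2" 'BEGIN { printf "%.2f\n", size * pct / 100 }'
}

min_free_reserve_mib 4096 3   # reserve on the original 4G tmpfs
min_free_reserve_mib 8192 3   # reserve after the remount to 8G
```

Under those assumptions the 4G tmpfs reserves about 123 MiB, which is in line with the 123M shown as available in the `df` output when the import started failing, with 373.6 MB of layers still to fetch.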
- clones
-
OCPBUGS-70168 node-image-pull.service is failing to fetch release image, resulting installation failure
-
- Verified
-
- is blocked by
-
OCPBUGS-70168 node-image-pull.service is failing to fetch release image, resulting installation failure
-
- Verified