-
Bug
-
Resolution: Unresolved
-
Major
-
None
-
4.19.0
-
None
-
Quality / Stability / Reliability
-
False
Proposed
Description of problem:
Deployment of an OCP 4.19 cluster using the Assisted Installer fails with the error "bootkube.sh: line 86: oc: command not found". On closer inspection of the node, the node-image-overlay and node-image-pull services are in the inactive state.
Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux CoreOS 9.6.20250121-0
How reproducible:
Always
Steps to Reproduce:
1. Start an Assisted cluster deployment using this repo (https://github.com/cs-zhang/ocp4-ai-powervm.git) and the following vars file:
# cat vars.yaml
---
# disk: /dev/sda
helper:
  name: "helper"
  ipaddr: "9.114.96.246"
  #networkifacename: "env34"
dns:
  domain: "ai.qa"
  clusterid: "rdr-suraj-ai-dry4"
  forwarder1: "9.9.9.9"
  forwarder2: "8.8.4.4"
dhcp:
  router: "9.114.96.1"
  netmask: "255.255.252.0"
  subnet: "9.114.96.0/22"
masters:
  - name: "master-1"
    ipaddr: "9.114.97.31"
    macaddr: "fa:3b:2d:34:88:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-dced8481-00015c34
    disk: /dev/sda
  - name: "master-2"
    ipaddr: "9.114.97.25"
    macaddr: "fa:0f:b4:68:25:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-457dbb21-00015c37
    disk: /dev/sda
  - name: "master-3"
    ipaddr: "9.114.96.249"
    macaddr: "fa:0e:52:d3:f2:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-6c66908c-00015c3a
    disk: /dev/sda
workers:
  - name: "worker-1"
    ipaddr: "9.114.97.224"
    macaddr: "fa:db:f4:ec:b1:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-e965cc23-00015c3d
    disk: /dev/sda
  - name: "worker-2"
    ipaddr: "9.114.97.229"
    macaddr: "fa:dd:d0:b5:b9:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-552e3495-00015c40
    disk: /dev/sda
  - name: "worker-3"
    ipaddr: "9.114.97.214"
    macaddr: "fa:f0:63:c7:9c:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-2c35b112-00015c43
    disk: /dev/sda
########################
force_ocp_download: true
######################
# URL path to RHCOS download site
rhcos_arch: "ppc64le"
rhcos_base_url: "https://mirror.openshift.com/pub/openshift-v4/{{ rhcos_arch }}/dependencies/rhcos"
rhcos_rhcos_base: "4.18"
rhcos_rhcos_tag: "4.18.1"
rhcos_iso: "{{ rhcos_base_url}}/{{ rhcos_rhcos_base }}/{{ rhcos_rhcos_tag }}/rhcos-live.{{ rhcos_arch }}.iso"
rhcos_rootfs: "{{ rhcos_base_url}}/{{ rhcos_rhcos_base }}/{{ rhcos_rhcos_tag }}/rhcos-live-rootfs.{{ rhcos_arch }}.img"
rhcos_initramfs: "{{ rhcos_base_url}}/{{ rhcos_rhcos_base }}/{{ rhcos_rhcos_tag }}/rhcos-live-initramfs.{{ rhcos_arch }}.img"
rhcos_kernel: "{{ rhcos_base_url}}/{{ rhcos_rhcos_base }}/{{ rhcos_rhcos_tag }}/rhcos-live-kernel-{{ rhcos_arch }}"
ocp_client_arch: "ppc64le"
ocp_base_url: "https://mirror.openshift.com/pub/openshift-v4/multi/clients"
ocp_client_base: "ocp-dev-preview"
ocp_client_tag: "4.19.0-ec.3"
ocp_client: "{{ ocp_base_url}}/{{ ocp_client_base }}/{{ ocp_client_tag }}/{{ ocp_client_arch }}/openshift-client-linux.tar.gz"
ocp_installer: "{{ ocp_base_url}}/{{ ocp_client_base }}/{{ ocp_client_tag }}/{{ ocp_client_arch }}/openshift-install-linux.tar.gz"
pvm_hmc: hscroot@9.114.195.140
install_type: assisted
assisted_url: "https://api.openshift.com/api/assisted-install/v2"
assisted_token: ""
assisted_ocp_version: "4.19.0-ec.3-multi"
assisted_rhcos_version: "4.19.0-ec.3"
pull_secret: '{{ lookup("file", "~/.openshift/pull-secret") | from_json | to_json }}'
public_ssh_key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"
# need to use absolute path for workdir
workdir: "/root/ocp4-{{ install_type }}"
log_level: info
2. Observe the deployment and monitor the AI events and logs, as well as journalctl on the target hardware.
Note: with rhcos_rhcos_tag set to 4.18.0-rc.2, the deployment succeeds.
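For anyone reproducing this, it may help to see which live ISO URL each rhcos_rhcos_tag value resolves to. The sketch below expands the rhcos_iso template from the vars file by hand. The function name rhcos_iso_url is hypothetical, and it assumes that rhcos_rhcos_base stayed at "4.18" in the successful run, which the report does not state.

```shell
#!/bin/sh
# Values copied from the vars file in this report.
rhcos_arch="ppc64le"
rhcos_base_url="https://mirror.openshift.com/pub/openshift-v4/${rhcos_arch}/dependencies/rhcos"
rhcos_rhcos_base="4.18"   # assumption: unchanged between the two runs

# Build the live ISO URL for a given tag, mirroring the rhcos_iso template.
# (rhcos_iso_url is a hypothetical helper, not part of the playbook.)
rhcos_iso_url() {
  printf '%s/%s/%s/rhcos-live.%s.iso\n' \
    "$rhcos_base_url" "$rhcos_rhcos_base" "$1" "$rhcos_arch"
}

rhcos_iso_url "4.18.1"        # tag used in the failing run
rhcos_iso_url "4.18.0-rc.2"   # tag reported to work
```

This only prints the URLs; it does not check that they exist on the mirror.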
Actual results:
Deployment fails: the bootkube service exits with the "oc: command not found" error.
Expected results:
Deployment should succeed.
Additional info:
[core@master-2 ~]$ journalctl -b -f -u release-image.service -u bootkube.service
Mar 19 06:45:40 master-2 podman[46748]: 2025-03-19 06:45:40.283028622 +0000 UTC m=+0.065964633 container start f4d198fff7d75b665adfb7054fb3499d70ed21bcc5f05564c756ef4024d47e9c (image=quay.io/openshift-release-dev/ocp-release@sha256:fb8754ad482b4932229d96530c848c84b14a7bdba6f47e739121151f977a5ae8, name=youthful_colden, io.openshift.release.base-image-digest=sha256:2d4dcf5920f9e55cf6276d07b2898d4c2c6c5cc7071b37abcb4376e09777a21d, io.openshift.release=4.19.0-ec.3)
Mar 19 06:45:40 master-2 podman[46748]: 2025-03-19 06:45:40.283785915 +0000 UTC m=+0.066721964 container attach f4d198fff7d75b665adfb7054fb3499d70ed21bcc5f05564c756ef4024d47e9c (image=quay.io/openshift-release-dev/ocp-release@sha256:fb8754ad482b4932229d96530c848c84b14a7bdba6f47e739121151f977a5ae8, name=youthful_colden, io.openshift.release.base-image-digest=sha256:2d4dcf5920f9e55cf6276d07b2898d4c2c6c5cc7071b37abcb4376e09777a21d, io.openshift.release=4.19.0-ec.3)
Mar 19 06:45:40 master-2 youthful_colden[46777]: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23fbfe9e7d08ef8d2d3af6b4725702e649d07066e54e99438a9253ab85e49c2a
Mar 19 06:45:40 master-2 podman[46748]: 2025-03-19 06:45:40.325459394 +0000 UTC m=+0.108395446 container died f4d198fff7d75b665adfb7054fb3499d70ed21bcc5f05564c756ef4024d47e9c (image=quay.io/openshift-release-dev/ocp-release@sha256:fb8754ad482b4932229d96530c848c84b14a7bdba6f47e739121151f977a5ae8, name=youthful_colden, io.openshift.release=4.19.0-ec.3, io.openshift.release.base-image-digest=sha256:2d4dcf5920f9e55cf6276d07b2898d4c2c6c5cc7071b37abcb4376e09777a21d)
Mar 19 06:45:40 master-2 podman[46748]: 2025-03-19 06:45:40.241401821 +0000 UTC m=+0.024337833 image pull b299fe51fb2e157ab7d8152b94b4a303ae422fc8e90cb9a81e67c06f5035a56d quay.io/openshift-release-dev/ocp-release@sha256:fb8754ad482b4932229d96530c848c84b14a7bdba6f47e739121151f977a5ae8
Mar 19 06:45:40 master-2 podman[46748]: 2025-03-19 06:45:40.338256371 +0000 UTC m=+0.121192397 container remove f4d198fff7d75b665adfb7054fb3499d70ed21bcc5f05564c756ef4024d47e9c (image=quay.io/openshift-release-dev/ocp-release@sha256:fb8754ad482b4932229d96530c848c84b14a7bdba6f47e739121151f977a5ae8, name=youthful_colden, io.openshift.release=4.19.0-ec.3, io.openshift.release.base-image-digest=sha256:2d4dcf5920f9e55cf6276d07b2898d4c2c6c5cc7071b37abcb4376e09777a21d)
Mar 19 06:45:40 master-2 bootkube.sh[46815]: /usr/local/bin/bootkube.sh: line 86: oc: command not found
Mar 19 06:45:40 master-2 systemd[1]: bootkube.service: Main process exited, code=exited, status=127/n/a
Mar 19 06:45:40 master-2 systemd[1]: bootkube.service: Failed with result 'exit-code'.
Mar 19 06:45:40 master-2 systemd[1]: bootkube.service: Consumed 2.830s CPU time.
Mar 19 06:45:45 master-2 systemd[1]: bootkube.service: Scheduled restart job, restart counter is at 33.
Mar 19 06:45:45 master-2 systemd[1]: Stopped Bootstrap a Kubernetes cluster.
Mar 19 06:45:45 master-2 systemd[1]: bootkube.service: Consumed 2.830s CPU time.
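The status=127 in the journal above is the shell's conventional exit code for a command that could not be found, which matches the "oc: command not found" message. A minimal sketch for checking this on the node follows. The helper have_cmd and the dummy command name are hypothetical, and the check uses the PATH of the current shell, which may differ from the PATH that bootkube.service runs with.

```shell
#!/bin/sh
# Return 0 if the named command can be resolved on PATH, non-zero otherwise.
# (have_cmd is a hypothetical helper, not part of bootkube.sh.)
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd oc; then
  echo "oc found at: $(command -v oc)"
else
  echo "oc is not on PATH ($PATH)"
fi

# Running a command that does not exist yields exit status 127,
# the same status systemd reports for bootkube.service.
sh -c 'definitely-not-a-real-command-xyz' 2>/dev/null || echo "exit status: $?"
```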
[core@master-2 ~]$ sudo systemctl status node-image-overlay
○ node-image-overlay.service - Node Image Overlay
Loaded: loaded (/etc/systemd/system/node-image-overlay.service; static)
Active: inactive (dead)
[core@master-2 ~]$ sudo systemctl status node-image-pull
○ node-image-pull.service - Node Image Pull
Loaded: loaded (/etc/systemd/system/node-image-pull.service; static)
Active: inactive (dead)
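The same unit states can be collected in one pass, which may be convenient when comparing nodes. The sketch below loops over the units mentioned in this report and prints the result of systemctl is-active for each. The helper name unit_state is hypothetical, and the script prints "unknown" when systemctl is missing or returns nothing.

```shell
#!/bin/sh
# Units named in this report.
units="node-image-pull.service node-image-overlay.service release-image.service bootkube.service"

# Print "<unit>: <state>" using `systemctl is-active`, or "unknown" when
# systemctl is unavailable or prints nothing (for example inside a container).
# (unit_state is a hypothetical helper.)
unit_state() {
  state=""
  if command -v systemctl >/dev/null 2>&1; then
    state=$(systemctl is-active "$1" 2>/dev/null) || true
  fi
  printf '%s: %s\n' "$1" "${state:-unknown}"
}

for u in $units; do
  unit_state "$u"
done
```

For a unit that reports inactive, `journalctl -b -u <unit>` shows whether it was ever started during the current boot.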
[core@master-2 ~]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux CoreOS"
VERSION="9.6.20250121-0 (Plow)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="9.6"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 9.6.20250121-0 (Plow)"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9"
BUG_REPORT_URL="https://issues.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_BUGZILLA_PRODUCT_VERSION=9.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.6 Beta"
OSTREE_VERSION='9.6.20250121-0'
VARIANT=CoreOS
VARIANT_ID=coreos