- Bug
- Resolution: Unresolved
- Major
- 4.19.0
- Quality / Stability / Reliability
- Proposed
Description of problem:
Deployment of an OCP 4.19 cluster using the Assisted Installer fails with the error "bootkube.sh: line 86: oc: command not found". On closer inspection of the node, the services node-image-overlay and node-image-pull are in the inactive state.
Version-Release number of selected component (if applicable):
Red Hat Enterprise Linux CoreOS 9.6.20250121-0
How reproducible:
Always
Steps to Reproduce:
1. Start an Assisted cluster deployment using this repo (https://github.com/cs-zhang/ocp4-ai-powervm.git) and the following var file:

# cat vars.yaml
---
# disk: /dev/sda
helper:
  name: "helper"
  ipaddr: "9.114.96.246"
  #networkifacename: "env34"
dns:
  domain: "ai.qa"
  clusterid: "rdr-suraj-ai-dry4"
  forwarder1: "9.9.9.9"
  forwarder2: "8.8.4.4"
dhcp:
  router: "9.114.96.1"
  netmask: "255.255.252.0"
  subnet: "9.114.96.0/22"
masters:
  - name: "master-1"
    ipaddr: "9.114.97.31"
    macaddr: "fa:3b:2d:34:88:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-dced8481-00015c34
    disk: /dev/sda
  - name: "master-2"
    ipaddr: "9.114.97.25"
    macaddr: "fa:0f:b4:68:25:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-457dbb21-00015c37
    disk: /dev/sda
  - name: "master-3"
    ipaddr: "9.114.96.249"
    macaddr: "fa:0e:52:d3:f2:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-6c66908c-00015c3a
    disk: /dev/sda
workers:
  - name: "worker-1"
    ipaddr: "9.114.97.224"
    macaddr: "fa:db:f4:ec:b1:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-e965cc23-00015c3d
    disk: /dev/sda
  - name: "worker-2"
    ipaddr: "9.114.97.229"
    macaddr: "fa:dd:d0:b5:b9:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-552e3495-00015c40
    disk: /dev/sda
  - name: "worker-3"
    ipaddr: "9.114.97.214"
    macaddr: "fa:f0:63:c7:9c:20"
    pvmcec: C340F2U01-ZZ
    pvmlpar: rdr-suraj-abi-2c35b112-00015c43
    disk: /dev/sda
########################
force_ocp_download: true
######################
# URL path to RHCOS download site
rhcos_arch: "ppc64le"
rhcos_base_url: "https://mirror.openshift.com/pub/openshift-v4/{{ rhcos_arch }}/dependencies/rhcos"
rhcos_rhcos_base: "4.18"
rhcos_rhcos_tag: "4.18.1"
rhcos_iso: "{{ rhcos_base_url}}/{{ rhcos_rhcos_base }}/{{ rhcos_rhcos_tag }}/rhcos-live.{{ rhcos_arch }}.iso"
rhcos_rootfs: "{{ rhcos_base_url}}/{{ rhcos_rhcos_base }}/{{ rhcos_rhcos_tag }}/rhcos-live-rootfs.{{ rhcos_arch }}.img"
rhcos_initramfs: "{{ rhcos_base_url}}/{{ rhcos_rhcos_base }}/{{ rhcos_rhcos_tag }}/rhcos-live-initramfs.{{ rhcos_arch }}.img"
rhcos_kernel: "{{ rhcos_base_url}}/{{ rhcos_rhcos_base }}/{{ rhcos_rhcos_tag }}/rhcos-live-kernel-{{ rhcos_arch }}"
ocp_client_arch: "ppc64le"
ocp_base_url: "https://mirror.openshift.com/pub/openshift-v4/multi/clients"
ocp_client_base: "ocp-dev-preview"
ocp_client_tag: "4.19.0-ec.3"
ocp_client: "{{ ocp_base_url}}/{{ ocp_client_base }}/{{ ocp_client_tag }}/{{ ocp_client_arch }}/openshift-client-linux.tar.gz"
ocp_installer: "{{ ocp_base_url}}/{{ ocp_client_base }}/{{ ocp_client_tag }}/{{ ocp_client_arch }}/openshift-install-linux.tar.gz"
pvm_hmc: hscroot@9.114.195.140
install_type: assisted
assisted_url: "https://api.openshift.com/api/assisted-install/v2"
assisted_token: ""
assisted_ocp_version: "4.19.0-ec.3-multi"
assisted_rhcos_version: "4.19.0-ec.3"
pull_secret: '{{ lookup("file", "~/.openshift/pull-secret") | from_json | to_json }}'
public_ssh_key: "{{ lookup('file', '~/.ssh/id_rsa.pub') }}"
# need to use absolute path for workdir
workdir: "/root/ocp4-{{ install_type }}"
log_level: info

2. Observe the deployment and monitor the AI events and logs, as well as journalctl on the target hardware.

Note: when rhcos_rhcos_tag is set to 4.18.0-rc.2, the deployment succeeds.
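To make it easier to see which RHCOS live image the var file resolves to, here is a minimal Python sketch that expands the rhcos_iso template by hand and compares the minor version of the RHCOS tag with that of the requested OCP release. The helper names (rhcos_iso_url, minor_version) are hypothetical and not part of the playbook; the values are copied from the var file. Whether the version difference it prints is related to the failure is not established by this report.

```python
# Values copied from vars.yaml in the steps to reproduce.
RHCOS_ARCH = "ppc64le"
RHCOS_BASE_URL = (
    f"https://mirror.openshift.com/pub/openshift-v4/{RHCOS_ARCH}/dependencies/rhcos"
)
RHCOS_BASE = "4.18"
RHCOS_TAG = "4.18.1"
ASSISTED_OCP_VERSION = "4.19.0-ec.3-multi"


def rhcos_iso_url(base_url: str, base: str, tag: str, arch: str) -> str:
    """Hand-expanded equivalent of the rhcos_iso template (hypothetical helper)."""
    return f"{base_url}/{base}/{tag}/rhcos-live.{arch}.iso"


def minor_version(version: str) -> str:
    """Return the 'X.Y' prefix of a version string such as '4.19.0-ec.3-multi'."""
    return ".".join(version.split(".")[:2])


if __name__ == "__main__":
    print("RHCOS ISO:", rhcos_iso_url(RHCOS_BASE_URL, RHCOS_BASE, RHCOS_TAG, RHCOS_ARCH))
    print("RHCOS minor:", minor_version(RHCOS_TAG))
    print("OCP minor:  ", minor_version(ASSISTED_OCP_VERSION))
```

With the values above, the live image comes from the 4.18 stream while the requested release is 4.19.0-ec.3, which may be worth recording alongside the logs.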
Actual results:
Deployment fails: the bootkube service exits with the error "oc: command not found".
Expected results:
Deployment should succeed.
Additional info:
[core@master-2 ~]$ journalctl -b -f -u release-image.service -u bootkube.service
Mar 19 06:45:40 master-2 podman[46748]: 2025-03-19 06:45:40.283028622 +0000 UTC m=+0.065964633 container start f4d198fff7d75b665adfb7054fb3499d70ed21bcc5f05564c756ef4024d47e9c (image=quay.io/openshift-release-dev/ocp-release@sha256:fb8754ad482b4932229d96530c848c84b14a7bdba6f47e739121151f977a5ae8, name=youthful_colden, io.openshift.release.base-image-digest=sha256:2d4dcf5920f9e55cf6276d07b2898d4c2c6c5cc7071b37abcb4376e09777a21d, io.openshift.release=4.19.0-ec.3)
Mar 19 06:45:40 master-2 podman[46748]: 2025-03-19 06:45:40.283785915 +0000 UTC m=+0.066721964 container attach f4d198fff7d75b665adfb7054fb3499d70ed21bcc5f05564c756ef4024d47e9c (image=quay.io/openshift-release-dev/ocp-release@sha256:fb8754ad482b4932229d96530c848c84b14a7bdba6f47e739121151f977a5ae8, name=youthful_colden, io.openshift.release.base-image-digest=sha256:2d4dcf5920f9e55cf6276d07b2898d4c2c6c5cc7071b37abcb4376e09777a21d, io.openshift.release=4.19.0-ec.3)
Mar 19 06:45:40 master-2 youthful_colden[46777]: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:23fbfe9e7d08ef8d2d3af6b4725702e649d07066e54e99438a9253ab85e49c2a
Mar 19 06:45:40 master-2 podman[46748]: 2025-03-19 06:45:40.325459394 +0000 UTC m=+0.108395446 container died f4d198fff7d75b665adfb7054fb3499d70ed21bcc5f05564c756ef4024d47e9c (image=quay.io/openshift-release-dev/ocp-release@sha256:fb8754ad482b4932229d96530c848c84b14a7bdba6f47e739121151f977a5ae8, name=youthful_colden, io.openshift.release=4.19.0-ec.3, io.openshift.release.base-image-digest=sha256:2d4dcf5920f9e55cf6276d07b2898d4c2c6c5cc7071b37abcb4376e09777a21d)
Mar 19 06:45:40 master-2 podman[46748]: 2025-03-19 06:45:40.241401821 +0000 UTC m=+0.024337833 image pull b299fe51fb2e157ab7d8152b94b4a303ae422fc8e90cb9a81e67c06f5035a56d quay.io/openshift-release-dev/ocp-release@sha256:fb8754ad482b4932229d96530c848c84b14a7bdba6f47e739121151f977a5ae8
Mar 19 06:45:40 master-2 podman[46748]: 2025-03-19 06:45:40.338256371 +0000 UTC m=+0.121192397 container remove f4d198fff7d75b665adfb7054fb3499d70ed21bcc5f05564c756ef4024d47e9c (image=quay.io/openshift-release-dev/ocp-release@sha256:fb8754ad482b4932229d96530c848c84b14a7bdba6f47e739121151f977a5ae8, name=youthful_colden, io.openshift.release=4.19.0-ec.3, io.openshift.release.base-image-digest=sha256:2d4dcf5920f9e55cf6276d07b2898d4c2c6c5cc7071b37abcb4376e09777a21d)
Mar 19 06:45:40 master-2 bootkube.sh[46815]: /usr/local/bin/bootkube.sh: line 86: oc: command not found
Mar 19 06:45:40 master-2 systemd[1]: bootkube.service: Main process exited, code=exited, status=127/n/a
Mar 19 06:45:40 master-2 systemd[1]: bootkube.service: Failed with result 'exit-code'.
Mar 19 06:45:40 master-2 systemd[1]: bootkube.service: Consumed 2.830s CPU time.
Mar 19 06:45:45 master-2 systemd[1]: bootkube.service: Scheduled restart job, restart counter is at 33.
Mar 19 06:45:45 master-2 systemd[1]: Stopped Bootstrap a Kubernetes cluster.
Mar 19 06:45:45 master-2 systemd[1]: bootkube.service: Consumed 2.830s CPU time.
[core@master-2 ~]$ sudo systemctl status node-image-overlay
○ node-image-overlay.service - Node Image Overlay
     Loaded: loaded (/etc/systemd/system/node-image-overlay.service; static)
     Active: inactive (dead)
[core@master-2 ~]$ sudo systemctl status node-image-pull
○ node-image-pull.service - Node Image Pull
     Loaded: loaded (/etc/systemd/system/node-image-pull.service; static)
     Active: inactive (dead)
[core@master-2 ~]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux CoreOS"
VERSION="9.6.20250121-0 (Plow)"
ID="rhel"
ID_LIKE="fedora"
VERSION_ID="9.6"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Red Hat Enterprise Linux CoreOS 9.6.20250121-0 (Plow)"
ANSI_COLOR="0;31"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:redhat:enterprise_linux:9::baseos"
HOME_URL="https://www.redhat.com/"
DOCUMENTATION_URL="https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9"
BUG_REPORT_URL="https://issues.redhat.com/"
REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 9"
REDHAT_BUGZILLA_PRODUCT_VERSION=9.6
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.6 Beta"
OSTREE_VERSION='9.6.20250121-0'
VARIANT=CoreOS
VARIANT_ID=coreos
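
The manual checks above could be collected in one pass with a small script. The following is a rough Python sketch, assuming Python 3 is available wherever it is run (it may not be present on an RHCOS node, in which case the equivalent `command -v oc` and `systemctl is-active` commands can be run by hand). The function names binary_on_path and unit_state are hypothetical; the unit names are the ones shown in this report.

```python
import shutil
import subprocess

# Units whose state was inspected manually in this report.
UNITS = ["node-image-pull.service", "node-image-overlay.service", "bootkube.service"]


def binary_on_path(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH."""
    return shutil.which(name) is not None


def unit_state(unit: str) -> str:
    """Return the state printed by `systemctl is-active`, or 'unknown'
    when systemctl is unavailable or prints nothing."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True,
            text=True,
            check=False,  # is-active exits non-zero for inactive units
        )
    except FileNotFoundError:
        return "unknown"
    return result.stdout.strip() or "unknown"


if __name__ == "__main__":
    print("oc on PATH:", binary_on_path("oc"))
    for unit in UNITS:
        print(f"{unit}: {unit_state(unit)}")
```

On the affected node, the expectation from the output above is that `oc` is reported as missing and both node-image units as inactive.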