-
Bug
-
Resolution: Cannot Reproduce
-
Normal
-
None
-
4.16
-
-
-
None
-
False
-
Description of problem:
Installation of OpenShift on OpenStack fails when using a combination of OpenStack Application Credentials and Infrastructure Nodes
Version-Release number of selected component (if applicable):
OpenShift 4.16.6 on top of RHOS-17.1-RHEL-9-20240701.n.1
How reproducible:
Always
Steps to Reproduce:
- Install OpenStack and create Application Credential
- Update the Application Credential in the clouds.yaml
- Create infrastructure MachineSet during OCP installation
  - Generate the manifests using the openshift-install
    ```
    openshift-install --log-level=debug create manifests --dir <install_dir>
    ```
  - Create the infra machineset custom resource according to the official Sample YAML for a machine set custom resource on RHOSP documentation and put it at <install_dir>/openshift/openshift-cluster-infra-machineset.yaml. [0]
  - Create the OCP cluster using the openshift-install create cluster command
- The installation finishes with a failure status.
- The machines are on 'Running' phase, but the nodes are not ready:
```
[stack@undercloud-0 ~]$ oc get machines -A
NAMESPACE               NAME                          PHASE     TYPE     REGION      ZONE   AGE
openshift-machine-api   ostest-sdbwq-infra-0-2lqgx    Running   master   regionOne   nova   4h58m
openshift-machine-api   ostest-sdbwq-infra-0-klf5q    Running   master   regionOne   nova   4h58m
openshift-machine-api   ostest-sdbwq-infra-0-n4sc9    Running   master   regionOne   nova   4h58m
openshift-machine-api   ostest-sdbwq-master-0         Running   master   regionOne   nova   5h4m
openshift-machine-api   ostest-sdbwq-master-1         Running   master   regionOne   nova   5h4m
openshift-machine-api   ostest-sdbwq-master-2         Running   master   regionOne   nova   5h4m
openshift-machine-api   ostest-sdbwq-worker-0-4dnj6   Running   worker   regionOne   nova   4h58m
openshift-machine-api   ostest-sdbwq-worker-0-cct4f   Running   worker   regionOne   nova   4h58m
openshift-machine-api   ostest-sdbwq-worker-0-hswpv   Running   worker   regionOne   nova   4h58m

[stack@undercloud-0 ~]$ oc get machinesets -A
NAMESPACE               NAME                    DESIRED   CURRENT   READY   AVAILABLE   AGE
openshift-machine-api   ostest-sdbwq-infra-0    3         3                             5h4m
openshift-machine-api   ostest-sdbwq-worker-0   3         3                             5h4m

[stack@undercloud-0 ~]$ oc get nodes
NAME                          STATUS     ROLES                  AGE     VERSION
ostest-sdbwq-infra-0-2lqgx    NotReady   infra,worker           4h50m   v1.29.6+aba1e8d
ostest-sdbwq-infra-0-klf5q    NotReady   infra,worker           4h50m   v1.29.6+aba1e8d
ostest-sdbwq-infra-0-n4sc9    NotReady   infra,worker           4h50m   v1.29.6+aba1e8d
ostest-sdbwq-master-0         Ready      control-plane,master   5h3m    v1.29.6+aba1e8d
ostest-sdbwq-master-1         Ready      control-plane,master   5h3m    v1.29.6+aba1e8d
ostest-sdbwq-master-2         Ready      control-plane,master   5h3m    v1.29.6+aba1e8d
ostest-sdbwq-worker-0-4dnj6   NotReady   worker                 4h48m   v1.29.6+aba1e8d
ostest-sdbwq-worker-0-cct4f   NotReady   worker                 4h44m   v1.29.6+aba1e8d
ostest-sdbwq-worker-0-hswpv   NotReady   worker                 4h51m   v1.29.6+aba1e8d
```
[0] https://docs.openshift.com/container-platform/4.16/machine_management/creating-infrastructure-machinesets.html
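When re-running the reproduction, the node listing from the last step can be reduced to just the affected nodes. The following is a minimal sketch, not part of the original report: `not_ready_nodes` is a hypothetical helper name, and it assumes the five-column `oc get nodes --no-headers` layout shown in the steps.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report).
# Reads `oc get nodes --no-headers` output on stdin, with columns
#   NAME STATUS ROLES AGE VERSION
# and prints the NAME of every node whose STATUS does not start with "Ready".
# A status such as "Ready,SchedulingDisabled" is therefore treated as ready.
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1 }'
}
```

A possible invocation is `oc get nodes --no-headers | not_ready_nodes`; on the cluster described here it would be expected to list the infra and worker nodes only.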
Actual results:
Installation fails
Expected results:
Installation passes with Application Credentials and Infrastructure Nodes
Additional info:
- Using just Application Credentials works as expected
- Using just Infrastructure Nodes works as expected
- Installation passes with Application Credentials and Infrastructure Nodes when using OpenShift 4.16.5
- It is a regression that appears in OCP 4.16.6
- This issue does not appear on OCP 4.17.0-0.nightly-2024-08-09-031511
Debug:
- From machine-controller log -
2024-08-12T07:23:44.480953928Z E0812 07:23:44.480880 1 controller.go:266] ostest-sdbwq-worker-0-cct4f: failed to check if machine exists: Failed to get cloud from secret: Failed to get secrets from kubernetes api: Get "https://172.30.0.1:443/api/v1/namespaces/openshift-machine-api/secrets/openstack-cloud-credentials": stream error: stream ID 197; INTERNAL_ERROR; received from peer
2024-08-12T07:23:56.150558205Z E0812 07:23:56.150503 1 controller.go:329] "msg"="Reconciler error" "error"="Failed to get cloud from secret: Failed to get secrets from kubernetes api: Get \"https://172.30.0.1:443/api/v1/namespaces/openshift-machine-api/secrets/openstack-cloud-credentials\": stream error: stream ID 197; INTERNAL_ERROR; received from peer" "controller"="machine-controller" "name"="ostest-sdbwq-worker-0-cct4f" "namespace"="openshift-machine-api" "object"={"name":"ostest-sdbwq-worker-0-cct4f","namespace":"openshift-machine-api"} "reconcileID"="c867f5cd-a52f-4126-8af5-74204e03b344"
- But we can get the correct secret using the oc CLI, and the credential is valid using the OSP CLI:
[stack@undercloud-0 ~]$ oc get secret -n openshift-machine-api openstack-cloud-credentials -o json | jq -r '.data."clouds.yaml"' | base64 -d
clouds:
  openstack:
    auth:
      application_credential_id: <encrypted_application_credential_id>
      application_credential_secret: <encrypted_application_credential_secret>
      auth_url: https://overcloud.redhat.local:13000
    auth_type: v3applicationcredential
    cacert: /etc/kubernetes/static-pod-resources/configmaps/cloud-config/ca-bundle.pem
    identity_api_version: "3"
    region_name: regionOne
    verify: true
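To gauge how often the machine-controller hits the secret-fetch failure quoted in the log excerpt, the log can be filtered for that message. This is a rough sketch under assumptions: `secret_fetch_errors` is a hypothetical helper name, and the match string is copied from the log lines in this report.

```shell
#!/bin/sh
# Hypothetical helper (not from the original report).
# Reads machine-controller log lines on stdin and prints how many of them
# contain the "Failed to get cloud from secret" message seen in this bug.
secret_fetch_errors() {
  # grep -c exits non-zero when the count is 0; `|| true` keeps that from
  # aborting callers that run with `set -e`.
  grep -cF 'Failed to get cloud from secret' || true
}
```

One way to feed it, assuming the controller runs as the `machine-controller` container of the `machine-api-controllers` deployment (an assumption, not stated in this report): `oc logs -n openshift-machine-api deployment/machine-api-controllers -c machine-controller | secret_fetch_errors`.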