-
Bug
-
Resolution: Unresolved
-
Undefined
-
None
-
4.8
-
Low
-
None
-
All
-
Description of problem:
When deploying to AWS and Azure, it is sufficient to provide platform.PLAT.credentialsSecretRef in the ClusterDeployment.
When deploying to vSphere, the credentials must also be duplicated in the secret referenced by installConfigSecretRef.
Version-Release number of selected component (if applicable):
ACM 2.3.1
OCP 4.8.4
How reproducible:
100%
Steps to Reproduce:
1. Create the vsphere-creds and vsphere-certs secrets.
2. Use the following ClusterDeployment and install-config.yaml.
3. Omit platform.vsphere.password from install-config.yaml to reproduce the failure.
---
apiVersion: hive.openshift.io/v1
kind: ClusterDeployment
metadata:
  name: demo-lab-domain-com
  labels:
    cloud: VSphere
    region: oak
    vendor: OpenShift
    usage: development
spec:
  baseDomain: lab.domain.com
  clusterName: demo
  controlPlaneConfig:
    servingCertificates: {}
  installAttemptsLimit: 1
  installed: false
  platform:
    vsphere:
      credentialsSecretRef:
        name: vsphere-creds
      certificatesSecretRef:
        name: vsphere-certs
      cluster: Goat
      datacenter: Garden
      defaultDatastore: VMData-HD
      vCenter: vcenter.lab.domain.com
      network: lab-192-168-4-0-b24
  provisioning:
    installConfigSecretRef:
      name: install-config
    sshPrivateKeySecretRef:
      name: ssh-private-key
    imageSetRef:
      #quay.io/openshift-release-dev/ocp-release:4.8.5-x86_64
      name: img4.8.5-x86-64-appsub
  pullSecretRef:
    name: pull-secret
---
apiVersion: v1
metadata:
  name: 'demo'
baseDomain: lab.domain.com
controlPlane:
  hyperthreading: Enabled
  name: master
  replicas: 1
  platform:
    vsphere:
      cpus: 8
      coresPerSocket: 2
      memoryMB: 24552
      osDisk:
        diskSizeGB: 100
compute:
- hyperthreading: Enabled
  name: 'worker'
  replicas: 0
platform:
  vsphere:
    apiVIP: 192.168.4.15
    ingressVIP: 192.168.4.14
    network: lab-192-168-4-0-b24
    datacenter: Garden
    cluster: Goat
    defaultDatastore: VMData-HD
    vCenter: vcenter.lab.domain.com
    # omit to fail. include to pass.
    #password: redacted-password
    username: administrator@vcenter.lab.domain.com
pullSecret: ""
Actual results:
Hive exits with status 1, and the log does not mention the missing credentials. By contrast, if other fields such as cluster are omitted from install-config.yaml, they are reported in the log.
$ oc logs -n demo-lab-domain-com demo-lab-domain-com-0-rfmxs-provision-42bsf -c hive
'/vsphere/./..2021_08_20_18_33_34.196534396' -> '/etc/pki/ca-trust/source/anchors/./..2021_08_20_18_33_34.196534396'
'/vsphere/./..2021_08_20_18_33_34.196534396/.cacert' -> '/etc/pki/ca-trust/source/anchors/./..2021_08_20_18_33_34.196534396/.cacert'
'/vsphere/./.cacert' -> '/etc/pki/ca-trust/source/anchors/./.cacert'
'/vsphere/./..data' -> '/etc/pki/ca-trust/source/anchors/./..data'
time="2021-08-20T18:33:37Z" level=debug msg="Couldn't find install logs provider environment variable. Skipping."
I0820 18:33:38.858118 1 request.go:655] Throttling request took 1.178237117s, request: GET:https://172.30.0.1:443/apis/user.openshift.io/v1?timeout=32s
time="2021-08-20T18:33:43Z" level=debug msg="checking for SSH private key" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=info msg="initializing ssh agent with 1 keys" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=debug msg="no SSH_AUTH_SOCK defined. starting ssh-agent" installID=cldsjjhs
Identity added: /tmp/ssh-privatekey (acm@oak-1508)
time="2021-08-20T18:33:43Z" level=info msg="added ssh private key to agent" installID=cldsjjhs key=/tmp/ssh-privatekey
time="2021-08-20T18:33:43Z" level=info msg="waiting for files to be available: [/output/openshift-install /output/oc]" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=info msg="found file" installID=cldsjjhs path=/output/openshift-install
time="2021-08-20T18:33:43Z" level=info msg="found file" installID=cldsjjhs path=/output/oc
time="2021-08-20T18:33:43Z" level=info msg="all files found, ready to proceed" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=info msg="copied /output/openshift-install to /home/hive/openshift-install" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=info msg="copied /output/oc to /home/hive/oc" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=info msg="copying install-config.yaml" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=info msg="waiting for files to be available: [/output/.openshift_install.log]" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=info msg="copied /installconfig/install-config.yaml to /output/install-config.yaml" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=info msg="cleaning up from past install attempts" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=warning msg="skipping cleanup as no infra ID set" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=debug msg="object does not exist" installID=cldsjjhs object=demo-lab-domain-com/demo-lab-domain-com-0-rfmxs-admin-kubeconfig
time="2021-08-20T18:33:43Z" level=debug msg="object does not exist" installID=cldsjjhs object=demo-lab-domain-com/demo-lab-domain-com-0-rfmxs-admin-password
time="2021-08-20T18:33:43Z" level=info msg="generating assets" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=info msg="running openshift-install create manifests" installID=cldsjjhs
time="2021-08-20T18:33:43Z" level=info msg="running openshift-install binary" args="[create manifests]" installID=cldsjjhs
time="2021-08-20T18:33:44Z" level=info msg="found file" installID=cldsjjhs path=/output/.openshift_install.log
time="2021-08-20T18:33:44Z" level=info msg="all files found, ready to proceed" installID=cldsjjhs
time="2021-08-20T18:33:44Z" level=debug msg="OpenShift Installer v4.8.0"
time="2021-08-20T18:33:44Z" level=debug msg="Built from commit 54c7628be380fcb568262dd49a4636da2e0baa21"
time="2021-08-20T18:33:44Z" level=debug msg="Fetching Master Machines..."
time="2021-08-20T18:33:44Z" level=debug msg="Loading Master Machines..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Cluster ID..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Install Config..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading SSH Key..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Base Domain..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Platform..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Cluster Name..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Base Domain..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Platform..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Networking..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Platform..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Pull Secret..."
time="2021-08-20T18:33:44Z" level=debug msg=" Loading Platform..."
REDACTED LINE OF OUTPUT
time="2021-08-20T18:33:45Z" level=error msg="error after waiting for command completion" error="exit status 1" installID=cldsjjhs
time="2021-08-20T18:33:45Z" level=error msg="error generating installer assets" error="exit status 1" installID=cldsjjhs
time="2021-08-20T18:33:45Z" level=info msg="reading installer log" installID=cldsjjhs
time="2021-08-20T18:33:45Z" level=info msg="saving installer output" installID=cldsjjhs
time="2021-08-20T18:33:45Z" level=debug msg="installer console log: REDACTED LINE OF OUTPUT\n" installID=cldsjjhs
time="2021-08-20T18:33:45Z" level=info msg="updating clusterprovision" installID=cldsjjhs
time="2021-08-20T18:33:45Z" level=fatal msg="runtime error" error="exit status 1"
$ oc get clusterprovision demo-lab-domain-com-0-rfmxs -n demo-lab-domain-com -o yaml | yq eval '.status' -
conditions:
- lastProbeTime: "2021-08-20T18:33:34Z"
  lastTransitionTime: "2021-08-20T18:33:34Z"
  message: Install job has been created
  reason: JobCreated
  status: "True"
  type: ClusterProvisionJobCreated
- lastProbeTime: "2021-08-20T18:33:45Z"
  lastTransitionTime: "2021-08-20T18:33:45Z"
  message: |
    REDACTED LINE OF OUTPUT
  reason: UnknownError
  status: "True"
  type: ClusterProvisionFailed
jobRef:
  name: demo-lab-domain-com-0-rfmxs-provision
$ oc get clusterdeployments demo-lab-domain-com -o yaml | yq eval '.status' -
cliImage: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7fbce98e2cba48997a0cf2a615e6954ae10ab08067570d60e0d0c8e40f5e83d4
conditions:
- lastProbeTime: "2021-08-20T18:33:46Z"
  lastTransitionTime: "2021-08-20T18:33:46Z"
  message: |
    REDACTED LINE OF OUTPUT
  reason: UnknownError
  status: "True"
  type: ProvisionFailed
- lastProbeTime: "2021-08-20T18:34:45Z"
  lastTransitionTime: "2021-08-20T18:34:45Z"
  message: Install attempts limit reached
  reason: InstallAttemptsLimitReached
  status: "True"
  type: ProvisionStopped
- lastProbeTime: "2021-08-20T18:33:21Z"
  lastTransitionTime: "2021-08-20T18:33:21Z"
  message: Platform credentials passed authentication check
  reason: PlatformAuthSuccess
  status: "False"
  type: AuthenticationFailure
...
Expected results:
Hive should use ClusterDeployment.spec.platform.vsphere.credentialsSecretRef and permit the omission of vCenter credentials from the install-config. At a minimum, Hive should log the reason for the failure.
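To make the requested behavior concrete, here is a rough Python sketch of filling in the vSphere credentials from the credentials secret when the install-config omits them. It is an illustration only, not Hive's implementation: the function name merge_vsphere_credentials is hypothetical, the install-config and the secret are modeled as plain dictionaries, and the secret is assumed to carry username and password keys.

```python
import copy


def merge_vsphere_credentials(install_config, secret_data):
    """Return a copy of install_config with vSphere credentials filled in.

    Hypothetical helper: values already present in install_config win;
    only a missing or empty username/password is taken from secret_data.
    """
    merged = copy.deepcopy(install_config)
    vsphere = merged.setdefault("platform", {}).setdefault("vsphere", {})
    for key in ("username", "password"):
        # Assumption: the credentials secret uses these two key names.
        if not vsphere.get(key) and secret_data.get(key):
            vsphere[key] = secret_data[key]
    return merged


# install-config fragment with the password omitted, as in the failing case.
install_config = {
    "platform": {
        "vsphere": {
            "vCenter": "vcenter.lab.domain.com",
            "username": "administrator@vcenter.lab.domain.com",
        }
    }
}
# Stand-in for the decoded contents of the vsphere-creds secret.
secret_data = {
    "username": "administrator@vcenter.lab.domain.com",
    "password": "example-password",
}

merged = merge_vsphere_credentials(install_config, secret_data)
print(sorted(merged["platform"]["vsphere"]))
```

The input is deep-copied, so the original install-config is left untouched. Until something like this exists in Hive, the same merge can be done by hand before creating the install-config secret.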
Additional info:
It's not entirely clear which fields should be delegated to the ClusterDeployment and which to install-config.yaml, but AWS and Azure (at least) seem to tolerate a more minimal install-config. Omitting the password is not the only way to cause a failure.
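Since the provision log does not always name the missing field, a local pre-flight check can help narrow things down before the install-config secret is created. The Python sketch below is only an illustration: missing_vsphere_fields is a hypothetical helper, and the field list is copied from the install-config.yaml in this report, so it is an assumption rather than the installer's authoritative schema.

```python
# Field names taken from the install-config.yaml in this report.
# Assumption: this is not the installer's authoritative list of required fields.
EXPECTED_VSPHERE_FIELDS = (
    "vCenter",
    "username",
    "password",
    "datacenter",
    "defaultDatastore",
    "cluster",
    "network",
    "apiVIP",
    "ingressVIP",
)


def missing_vsphere_fields(install_config):
    """List expected platform.vsphere fields that are absent or empty."""
    vsphere = (install_config.get("platform") or {}).get("vsphere") or {}
    return [field for field in EXPECTED_VSPHERE_FIELDS if not vsphere.get(field)]


# The platform.vsphere section from this report, with password omitted.
example = {
    "platform": {
        "vsphere": {
            "apiVIP": "192.168.4.15",
            "ingressVIP": "192.168.4.14",
            "network": "lab-192-168-4-0-b24",
            "datacenter": "Garden",
            "cluster": "Goat",
            "defaultDatastore": "VMData-HD",
            "vCenter": "vcenter.lab.domain.com",
            "username": "administrator@vcenter.lab.domain.com",
        }
    }
}

print(missing_vsphere_fields(example))  # prints ['password']
```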