OpenShift Bugs / OCPBUGS-8955

vSphere clusterdeployment requires credentials duplication in install-config.yaml

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Undefined
    • 4.8
    • Hive
    • Low
    • All

      Description of problem:

      When deploying to AWS or Azure, it is sufficient to provide platform.PLAT.credentialsSecretRef in the ClusterDeployment.

      When deploying to vSphere, the credentials must also be duplicated inside the secret referenced by installConfigSecretRef.

      Version-Release number of selected component (if applicable):

      ACM 2.3.1
      OCP 4.8.4

      How reproducible:

      100%

      Steps to Reproduce:
      1. Create vsphere-creds and vsphere-certs secrets
      2. Use the following ClusterDeployment and install-config.yaml
      3. Omit platform.vsphere.password from install-config.yaml to trigger the failure.


      apiVersion: hive.openshift.io/v1
      kind: ClusterDeployment
      metadata:
        name: demo-lab-domain-com
        labels:
          cloud: VSphere
          region: oak
          vendor: OpenShift
          usage: development
      spec:
        baseDomain: lab.domain.com
        clusterName: demo
        controlPlaneConfig:
          servingCertificates: {}
        installAttemptsLimit: 1
        installed: false
        platform:
          vsphere:
            credentialsSecretRef:
              name: vsphere-creds
            certificatesSecretRef:
              name: vsphere-certs
            cluster: Goat
            datacenter: Garden
            defaultDatastore: VMData-HD
            vCenter: vcenter.lab.domain.com
            network: lab-192-168-4-0-b24
        provisioning:
          installConfigSecretRef:
            name: install-config
          sshPrivateKeySecretRef:
            name: ssh-private-key
          imageSetRef:
            #quay.io/openshift-release-dev/ocp-release:4.8.5-x86_64
            name: img4.8.5-x86-64-appsub
        pullSecretRef:
          name: pull-secret


      apiVersion: v1
      metadata:
        name: 'demo'
      baseDomain: lab.domain.com
      controlPlane:
        hyperthreading: Enabled
        name: master
        replicas: 1
        platform:
          vsphere:
            cpus: 8
            coresPerSocket: 2
            memoryMB: 24552
            osDisk:
              diskSizeGB: 100
      compute:
      - hyperthreading: Enabled
        name: 'worker'
        replicas: 0
      platform:
        vsphere:
          apiVIP: 192.168.4.15
          ingressVIP: 192.168.4.14
          network: lab-192-168-4-0-b24
          datacenter: Garden
          cluster: Goat
          defaultDatastore: VMData-HD
          vCenter: vcenter.lab.domain.com
          # omit to fail. include to pass.
          #password: redacted-password
          username: administrator@vcenter.lab.domain.com
      pullSecret: ""
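
      For completeness, the secrets referenced by the two manifests above could be created roughly as follows. This is only a sketch and is not taken from the report: the function name create_vsphere_secrets, the VSPHERE_USERNAME and VSPHERE_PASSWORD environment variables, and the file paths are hypothetical, and the secret keys (username, password, .cacert, install-config.yaml) are assumed from Hive's usual conventions. The .cacert key is at least consistent with the /vsphere/./.cacert path in the provision log.

```shell
#!/usr/bin/env bash
# Sketch: create the secrets that the ClusterDeployment references.
# Assumptions (not from the report): credentials arrive through the
# VSPHERE_USERNAME / VSPHERE_PASSWORD environment variables, and the
# secret keys follow Hive's usual conventions.
create_vsphere_secrets() {
  local ns="$1" ca_file="$2" install_config="$3"

  # vCenter credentials, referenced by platform.vsphere.credentialsSecretRef
  oc create secret generic vsphere-creds -n "$ns" \
    --from-literal=username="$VSPHERE_USERNAME" \
    --from-literal=password="$VSPHERE_PASSWORD" || return 1

  # vCenter CA bundle, referenced by platform.vsphere.certificatesSecretRef
  oc create secret generic vsphere-certs -n "$ns" \
    --from-file=.cacert="$ca_file" || return 1

  # install-config.yaml, referenced by provisioning.installConfigSecretRef
  oc create secret generic install-config -n "$ns" \
    --from-file=install-config.yaml="$install_config"
}
```

      A possible invocation, with placeholder paths, would be: create_vsphere_secrets demo-lab-domain-com ./vcenter-ca.pem ./install-config.yaml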

      Actual results:

      Hive exits with status 1, and the log does not mention the missing credentials. By contrast, when other fields such as cluster are omitted from install-config.yaml, the log does report them.

      $ oc logs -n demo-lab-domain-com demo-lab-domain-com-0-rfmxs-provision-42bsf -c hive
      '/vsphere/./..2021_08_20_18_33_34.196534396' -> '/etc/pki/ca-trust/source/anchors/./..2021_08_20_18_33_34.196534396'
      '/vsphere/./..2021_08_20_18_33_34.196534396/.cacert' -> '/etc/pki/ca-trust/source/anchors/./..2021_08_20_18_33_34.196534396/.cacert'
      '/vsphere/./.cacert' -> '/etc/pki/ca-trust/source/anchors/./.cacert'
      '/vsphere/./..data' -> '/etc/pki/ca-trust/source/anchors/./..data'
      time="2021-08-20T18:33:37Z" level=debug msg="Couldn't find install logs provider environment variable. Skipping."
      I0820 18:33:38.858118 1 request.go:655] Throttling request took 1.178237117s, request: GET:https://172.30.0.1:443/apis/user.openshift.io/v1?timeout=32s
      time="2021-08-20T18:33:43Z" level=debug msg="checking for SSH private key" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=info msg="initializing ssh agent with 1 keys" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=debug msg="no SSH_AUTH_SOCK defined. starting ssh-agent" installID=cldsjjhs
      Identity added: /tmp/ssh-privatekey (acm@oak-1508)
      time="2021-08-20T18:33:43Z" level=info msg="added ssh private key to agent" installID=cldsjjhs key=/tmp/ssh-privatekey
      time="2021-08-20T18:33:43Z" level=info msg="waiting for files to be available: [/output/openshift-install /output/oc]" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=info msg="found file" installID=cldsjjhs path=/output/openshift-install
      time="2021-08-20T18:33:43Z" level=info msg="found file" installID=cldsjjhs path=/output/oc
      time="2021-08-20T18:33:43Z" level=info msg="all files found, ready to proceed" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=info msg="copied /output/openshift-install to /home/hive/openshift-install" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=info msg="copied /output/oc to /home/hive/oc" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=info msg="copying install-config.yaml" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=info msg="waiting for files to be available: [/output/.openshift_install.log]" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=info msg="copied /installconfig/install-config.yaml to /output/install-config.yaml" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=info msg="cleaning up from past install attempts" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=warning msg="skipping cleanup as no infra ID set" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=debug msg="object does not exist" installID=cldsjjhs object=demo-lab-domain-com/demo-lab-domain-com-0-rfmxs-admin-kubeconfig
      time="2021-08-20T18:33:43Z" level=debug msg="object does not exist" installID=cldsjjhs object=demo-lab-domain-com/demo-lab-domain-com-0-rfmxs-admin-password
      time="2021-08-20T18:33:43Z" level=info msg="generating assets" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=info msg="running openshift-install create manifests" installID=cldsjjhs
      time="2021-08-20T18:33:43Z" level=info msg="running openshift-install binary" args="[create manifests]" installID=cldsjjhs
      time="2021-08-20T18:33:44Z" level=info msg="found file" installID=cldsjjhs path=/output/.openshift_install.log
      time="2021-08-20T18:33:44Z" level=info msg="all files found, ready to proceed" installID=cldsjjhs
      time="2021-08-20T18:33:44Z" level=debug msg="OpenShift Installer v4.8.0"
      time="2021-08-20T18:33:44Z" level=debug msg="Built from commit 54c7628be380fcb568262dd49a4636da2e0baa21"
      time="2021-08-20T18:33:44Z" level=debug msg="Fetching Master Machines..."
      time="2021-08-20T18:33:44Z" level=debug msg="Loading Master Machines..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Cluster ID..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Install Config..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading SSH Key..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Base Domain..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Platform..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Cluster Name..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Base Domain..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Platform..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Networking..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Platform..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Pull Secret..."
      time="2021-08-20T18:33:44Z" level=debug msg=" Loading Platform..."
      REDACTED LINE OF OUTPUT
      time="2021-08-20T18:33:45Z" level=error msg="error after waiting for command completion" error="exit status 1" installID=cldsjjhs
      time="2021-08-20T18:33:45Z" level=error msg="error generating installer assets" error="exit status 1" installID=cldsjjhs
      time="2021-08-20T18:33:45Z" level=info msg="reading installer log" installID=cldsjjhs
      time="2021-08-20T18:33:45Z" level=info msg="saving installer output" installID=cldsjjhs
      time="2021-08-20T18:33:45Z" level=debug msg="installer console log: REDACTED LINE OF OUTPUT\n" installID=cldsjjhs
      time="2021-08-20T18:33:45Z" level=info msg="updating clusterprovision" installID=cldsjjhs
      time="2021-08-20T18:33:45Z" level=fatal msg="runtime error" error="exit status 1"

      $ oc get clusterprovision demo-lab-domain-com-0-rfmxs -n demo-lab-domain-com -o yaml | yq eval '.status' -
      conditions:
      - lastProbeTime: "2021-08-20T18:33:34Z"
        lastTransitionTime: "2021-08-20T18:33:34Z"
        message: Install job has been created
        reason: JobCreated
        status: "True"
        type: ClusterProvisionJobCreated
      - lastProbeTime: "2021-08-20T18:33:45Z"
        lastTransitionTime: "2021-08-20T18:33:45Z"
        message: |
          REDACTED LINE OF OUTPUT
        reason: UnknownError
        status: "True"
        type: ClusterProvisionFailed
      jobRef:
        name: demo-lab-domain-com-0-rfmxs-provision

      $ oc get clusterdeployments demo-lab-domain-com -o yaml | yq eval '.status' -
      cliImage: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:7fbce98e2cba48997a0cf2a615e6954ae10ab08067570d60e0d0c8e40f5e83d4
      conditions:
      - lastProbeTime: "2021-08-20T18:33:46Z"
        lastTransitionTime: "2021-08-20T18:33:46Z"
        message: |
          REDACTED LINE OF OUTPUT
        reason: UnknownError
        status: "True"
        type: ProvisionFailed
      - lastProbeTime: "2021-08-20T18:34:45Z"
        lastTransitionTime: "2021-08-20T18:34:45Z"
        message: Install attempts limit reached
        reason: InstallAttemptsLimitReached
        status: "True"
        type: ProvisionStopped
      - lastProbeTime: "2021-08-20T18:33:21Z"
        lastTransitionTime: "2021-08-20T18:33:21Z"
        message: Platform credentials passed authentication check
        reason: PlatformAuthSuccess
        status: "False"
        type: AuthenticationFailure
      ...

      Expected results:

      Hive should use ClusterDeployment.platform.vsphere.credentialsSecretRef and permit the omission of vCenter credentials from install-config.yaml. At a minimum, Hive should log the reason for the failure.

      Additional info:

      It is not entirely clear which fields should be delegated to the ClusterDeployment rather than install-config.yaml, but AWS and Azure (at least) seem to tolerate a reduced install-config. Omitting the password is not the only way to cause this failure.
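
      Until Hive reports the cause itself, a pre-flight check on install-config.yaml can catch the missing credentials before the provision job runs. The following is a minimal sketch under stated assumptions: check_vsphere_creds is a hypothetical helper, not part of Hive or the installer, and it only scans lines rather than parsing YAML, so it does not confirm that the keys sit under platform.vsphere.

```shell
#!/usr/bin/env bash
# Sketch: fail fast when install-config.yaml has no uncommented
# username/password entries. Line-based only; it does not parse YAML
# and treats any non-empty value (even quoted "") as present.
check_vsphere_creds() {
  local file="$1" key missing=0
  for key in username password; do
    # Match an uncommented "key:" followed by some value.
    if ! grep -Eq "^[[:space:]]*${key}:[[:space:]]*[^[:space:]#]" "$file"; then
      echo "install-config: no uncommented '${key}:' entry found" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

      Running check_vsphere_creds ./install-config.yaml before creating the install-config secret returns non-zero for the failing configuration in this report, where the password line is commented out.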

              openshift_jira_bot OpenShift Jira Bot
              dbewley@redhat.com Dale Bewley
              Red Hat Employee