OpenShift Bugs / OCPBUGS-25718

vSphere ABI failed due to storage operator degraded

Details

    • Important
    • No
    • Sprint 247, Sprint 248, Sprint 249
    • 3
    • Rejected
    • False
    • Cause: The assisted-installer was removing the uninitialized taints from all vSphere nodes, which prevented the vSphere CCM from initializing the nodes properly.

      Consequence: The vSphere CSI operator is degraded during initial cluster install because the VM's UUID (node's providerID) is missing.

      Fix: The assisted-installer now checks whether vSphere credentials were provided in install-config.yaml. If credentials were provided, the OpenShift version is greater than or equal to 4.15, and the agent installer was used, then the assisted-installer and assisted-installer-controller do not remove the uninitialized taints.

      Result: The node's providerID and VM's UUID are properly set and the vSphere CSI operator is installed.
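
The three-part condition in the fix can be sketched as a small shell predicate. This is only an illustration of the logic described in the release note, not the assisted-installer's actual code; the function names `version_ge` and `should_keep_uninitialized_taint` and the "true"/"false" argument convention are hypothetical, and the version is assumed to be in major.minor[.patch] form.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) when version $1 >= version $2.
# Only the major and minor components are compared.
version_ge() {
    a_major=${1%%.*}; a_rest=${1#*.}; a_minor=${a_rest%%.*}
    b_major=${2%%.*}; b_rest=${2#*.}; b_minor=${b_rest%%.*}
    [ "$a_major" -gt "$b_major" ] && return 0
    [ "$a_major" -eq "$b_major" ] && [ "$a_minor" -ge "$b_minor" ]
}

# Hypothetical predicate mirroring the release note:
#   $1  "true" if vSphere credentials were provided in install-config.yaml
#   $2  OpenShift version, e.g. 4.15.0
#   $3  "true" if the agent installer was used
# Succeeds when the uninitialized taint should be left in place.
should_keep_uninitialized_taint() {
    [ "$1" = "true" ] && [ "$3" = "true" ] && version_ge "$2" "4.15"
}
```

For example, `should_keep_uninitialized_taint true 4.15.0 true` succeeds, while the same call with version 4.14.9 or with no credentials fails, in which case the taint would still be removed as before.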
    • Bug Fix
    • Proposed

    Description

      Description of problem:

      The storage operator became degraded because it could not locate the nodes by UUID. The providerID was present for node 0 but blank for the other nodes. A successful installation can be achieved on day 2 by executing step 4 after step 7 from this document: https://access.redhat.com/solutions/6677901. Additionally, if credentials are provided in the install-config, the uninitialized taint must be added back to the node (oc adm taint node "$NODE" node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule) after bootstrap completes.
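
To find which nodes are affected, the blank providerID check can be scripted. The following is a minimal sketch, not part of the original report: the helper name `nodes_missing_provider_id` is hypothetical, and it assumes its input is one node name and providerID per line, separated by a tab. The commented pipeline shows one way to produce that input with `oc` and then apply the taint command from the description; it requires a logged-in session against the cluster.

```shell
#!/bin/sh
# Hypothetical helper: print the names of nodes whose providerID is empty.
# Input on stdin: one "<node-name><TAB><providerID>" pair per line.
nodes_missing_provider_id() {
    awk -F '\t' '$1 != "" && $2 == "" { print $1 }'
}

# Example pipeline (commented out so that sourcing this file has no side effects):
#
# oc get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.providerID}{"\n"}{end}' \
#   | nodes_missing_provider_id \
#   | while read -r NODE; do
#       oc adm taint node "$NODE" node.cloudprovider.kubernetes.io/uninitialized=true:NoSchedule
#     done
```

Keeping the filter separate from the `oc` calls makes it possible to review the list of affected nodes before tainting anything.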

      Version-Release number of selected component (if applicable):

      4.15

      How reproducible:

      100%

      Steps to Reproduce:

          1. Create an agent ISO image
          2. Boot the created ISO on a vSphere VM

      Actual results:

      Installation fails because the storage operator is unable to find the node by UUID.

      Expected results:

      Storage operator should be installed without any issue.

      Additional info:

      Slack discussion: https://redhat-internal.slack.com/archives/C02SPBZ4GPR/p1702893456002729

            People

              rwsu1@redhat.com Richard Su
              rhn-support-mhans Manoj Hans
              Manoj Hans Manoj Hans
              Votes: 0
              Watchers: 6
