OCPBUGS-48473 (OpenShift Bugs)

failed to provision volume with StorageClass "thin-csi-default"

    • Bug
    • Resolution: Unresolved
    • Minor
    • 4.14.z
    • Storage / Operators

      Description of problem:

      Worker nodes were powered off for a network change. After the worker nodes were powered back on, a couple of them entered a failed state because of a service account permission issue with VMware. The customer recreated the nodes by scaling the machinesets down and then back up. Since then, PVC creation fails with:
      
      ```
      failed to provision volume with StorageClass "thin-csi-default": rpc error: code = Internal desc = failed to get shared datastores in kubernetes cluster. Error: ServerFaultCode: The object 'vim.VirtualMachine:vm-380325' has already been deleted or has not been completely created
      ```     
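
For triage, it can help to pull the stale managed object reference out of the provisioning error so it can be compared against the VMs that were deleted during the machineset scale-down. The following is a minimal sketch, not part of the driver or of any OpenShift tooling; the function name `find_stale_object` and the regular expression are illustrative assumptions based only on the error text quoted above.

```python
import re

# Assumption: the vCenter fault text always quotes the object as
# 'vim.<Type>:<moref>' followed by "has already been deleted", as in the
# error message reported in this issue.
STALE_OBJECT_RE = re.compile(
    r"The object '(?P<type>vim\.\w+):(?P<moref>[\w-]+)' has already been deleted"
)


def find_stale_object(message):
    """Return (object_type, moref) if the message carries the fault, else None."""
    match = STALE_OBJECT_RE.search(message)
    if match is None:
        return None
    return match.group("type"), match.group("moref")


# The error message from this issue, split across lines for readability.
message = (
    'failed to provision volume with StorageClass "thin-csi-default": '
    "rpc error: code = Internal desc = failed to get shared datastores in "
    "kubernetes cluster. Error: ServerFaultCode: The object "
    "'vim.VirtualMachine:vm-380325' has already been deleted or has not been "
    "completely created"
)

print(find_stale_object(message))
```

If the extracted reference belongs to one of the deleted worker VMs, that is consistent with the driver still holding a cached reference to a node that no longer exists.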
      
      This appears to be a known issue in the upstream vSphere CSI driver (issue 2495). As per [this comment](https://github.com/kubernetes-sigs/vsphere-csi-driver/issues/2495#issuecomment-1716540708), the fix seems to be in place in [vsphere-csi-driver release v3.1.2](https://github.com/kubernetes-sigs/vsphere-csi-driver/compare/v3.1.2...master).
      
      OCP v4.14 is currently on Extended EUS Support until October 31, 2025; however, as per [the official docs](https://docs.openshift.com/container-platform/4.14/storage/container_storage_interface/persistent-storage-csi-vsphere.html), OCP 4.14 uses vsphere-csi-driver v3.0.2.
       
      
      Can the vSphere CSI driver be upgraded to v3.1.2 in OCP 4.14.z, so that the caching issue is resolved without having to restart the vmware-vsphere-csi-driver-controller, vmware-vsphere-csi-driver-webhook, and vmware-vsphere-csi-driver-operator pods?

      Version-Release number of selected component (if applicable):

          OCP 4.14.39

      Actual results:

      After creating new machines by scaling the machinesets down and back up, PVCs are created only after restarting the vmware-vsphere-csi-driver-controller, vmware-vsphere-csi-driver-webhook, and vmware-vsphere-csi-driver-operator pods.
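
The restart workaround described above could be scripted roughly as follows. This is a hedged sketch only: it assumes that the three components run as Deployments named after the pods listed in this issue and that they live in the `openshift-cluster-csi-drivers` namespace; neither is stated in this report. The helper names `restart_commands` and `restart_driver` are hypothetical. By default the script only prints the `oc rollout restart` commands it would run.

```python
import subprocess

# Assumption: namespace of the vSphere CSI driver components.
NAMESPACE = "openshift-cluster-csi-drivers"

# Assumption: Deployment names match the pod name prefixes from this issue.
DEPLOYMENTS = [
    "vmware-vsphere-csi-driver-controller",
    "vmware-vsphere-csi-driver-webhook",
    "vmware-vsphere-csi-driver-operator",
]


def restart_commands(namespace=NAMESPACE, deployments=DEPLOYMENTS):
    """Build one `oc rollout restart` command per Deployment."""
    return [
        ["oc", "rollout", "restart", f"deployment/{name}", "-n", namespace]
        for name in deployments
    ]


def restart_driver(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in restart_commands():
        print(" ".join(cmd))
        if not dry_run:
            # Raises CalledProcessError if `oc` exits with a non-zero status.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    restart_driver(dry_run=True)
```

Keeping the dry run as the default makes it possible to review the commands before touching a production cluster; passing `dry_run=False` requires a logged-in `oc` session with sufficient permissions.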

      Expected results:

      After creating new machines by scaling the machinesets down and back up, PVCs should be created without having to restart the vmware-vsphere-csi-driver-controller, vmware-vsphere-csi-driver-webhook, and vmware-vsphere-csi-driver-operator pods.
           

      Additional info:

      Similar bug: https://issues.redhat.com/browse/OCPBUGS-27205, which is now Closed.

              rhn-engineering-jsafrane Jan Safranek
              rhn-support-vchalise Vibhuti Chalise