OpenShift Bugs / OCPBUGS-36247

Kubevirt-CSI does not work when an Infra cluster is used for the VMs


    • Quality / Stability / Reliability
    • Important
    • Bug Fix
    • RN not needed
      -----
      Fix a bug in which kubevirt-csi-driver was not provided with external infra cluster credentials and namespace.

      This is a clone of issue OCPBUGS-36188. The following is the description of the original issue:

      Description of problem:

      The scenario is the following:
      Cluster MGMT:

      • OCP 4.15.17
      • MCE 2.5.3

      Cluster INFRA

      • OCP 4.15.17
      • CNV 4.15.2

      Deploy a third cluster, HOSTED, with a hosted control plane on the MGMT cluster using the kubevirt provider. Select the INFRA cluster as the external infra provider for the VMs, and use kubevirt-csi to map a storage class from the INFRA cluster (lvms-vg1) to the HOSTED cluster, like this:

      hcp create cluster kubevirt ... \
      --infra-kubeconfig-file=KUBECONFIG_INFRA_CLUSTER \
      --infra-storage-class-mapping=lvms-vg1/lvms-infra
      

      This doesn't work at all.

      First, the kubevirt-csi pod runs on the MGMT cluster along with the control plane components. I'm not sure this is intended.

      clusters-hostedcluster                             kubevirt-csi-controller-76547dd45-wqzd9                      4/4     Running     0             17h
      

      The following things are broken.

      1) If the MGMT cluster doesn't have CNV installed (it doesn't have to), kubevirt-csi on the MGMT cluster fails very early, when it tries to create a DataVolume for the HOSTED cluster, because the DataVolume API doesn't exist there.

      1 controller.go:816] CreateVolume failed, supports topology = false, node selected false => may reschedule = false => state = Finished: rpc error: code = Unknown desc = the server could not find the requested resource (post datavolumes.cdi.kubevirt.io)
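
One way to confirm this failure mode is to check whether each cluster serves the DataVolume resource at all. The following is a minimal sketch, assuming the `oc` client is on PATH. The function name and the kubeconfig variables are placeholders and are not part of the original report.

```shell
# Hypothetical helper: succeeds if the cluster behind the given kubeconfig
# serves the datavolumes resource in the cdi.kubevirt.io API group.
has_datavolume_api() {
  kubeconfig="$1"
  oc --kubeconfig "$kubeconfig" api-resources --api-group=cdi.kubevirt.io -o name 2>/dev/null \
    | grep -q '^datavolumes\.cdi\.kubevirt\.io$'
}

# Usage (kubeconfig paths are placeholders):
#   has_datavolume_api "$KUBECONFIG_MGMT"  || echo "MGMT does not serve DataVolume"
#   has_datavolume_api "$KUBECONFIG_INFRA" && echo "INFRA serves DataVolume"
```

If the check fails on MGMT but succeeds on INFRA, the "could not find the requested resource" error above is consistent with the driver talking to the MGMT API server.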
      

      2) If CNV is installed on the MGMT cluster, the pod is able to create the DataVolume, but it is created on the MGMT cluster, next to the control plane.

      $ oc get -n clusters-hostedcluster dv
      NAME                                       PHASE   PROGRESS   RESTARTS   AGE
      pvc-3ab18f9e-264b-47b2-a33f-26138a39e5bd                                 16h
      

      Unless the MGMT cluster has a storage class with exactly the same name as one of the INFRA cluster storage classes (where I believe this PVC should be allocated), the DataVolume gets stuck because the mapped StorageClass doesn't exist on MGMT.

      $ oc get -n clusters-hostedcluster dv pvc-3ab18f9e-264b-47b2-a33f-26138a39e5bd -o yaml | yq '.status.conditions'
      [
        {
          "lastHeartbeatTime": "2024-06-12T06:18:38Z",
          "lastTransitionTime": "2024-06-12T06:18:38Z",
          "message": "DataVolume.storage spec is missing accessMode and no storageClass to choose profile",
          "reason": "ErrClaimNotValid",
          "status": "Unknown",
          "type": "Bound"
        },
        {
          "lastHeartbeatTime": "2024-06-12T22:22:07Z",
          "lastTransitionTime": "2024-06-12T06:18:38Z",
          "message": "DataVolume.storage spec is missing accessMode and no storageClass to choose profile",
          "reason": "ErrClaimNotValid",
          "status": "False",
          "type": "Ready"
        },
        {
          "lastHeartbeatTime": "2024-06-12T06:18:38Z",
          "lastTransitionTime": "2024-06-12T06:18:38Z",
          "status": "False",
          "type": "Running"
        }
      ]
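
To see which cluster actually has the mapped storage class, a quick existence check against each kubeconfig can help. This is a sketch only. The helper name and the kubeconfig variables are assumptions made for illustration.

```shell
# Hypothetical helper: succeeds if a StorageClass with the given name exists
# on the cluster behind the given kubeconfig.
storageclass_exists() {
  kubeconfig="$1"
  name="$2"
  oc --kubeconfig "$kubeconfig" get storageclass "$name" -o name >/dev/null 2>&1
}

# Usage (kubeconfig paths are placeholders):
#   storageclass_exists "$KUBECONFIG_INFRA" lvms-vg1 && echo "present on INFRA"
#   storageclass_exists "$KUBECONFIG_MGMT"  lvms-vg1 || echo "missing on MGMT"
```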
      

      If one uses a StorageClass that exists on both MGMT and INFRA, the driver does create the DV/PVC, but the volume fails to attach because the VM does not exist on the MGMT cluster; it is on the INFRA cluster.

      AttachVolume.Attach failed for volume "pvc-7e590f2e-9014-48e1-9104-f35fe317d840" : rpc error: code = NotFound desc = failed to find VM with domain.firmware.uuid 7ebd0842-17b7-5803-a01d-cacfc0fb7d9e
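
To locate the VM that the attach call is looking for, one can search each cluster for the firmware UUID from the error. The sketch below assumes the lookup is against VirtualMachineInstance objects and their `spec.domain.firmware.uuid` field. That assumption, the helper name, and the kubeconfig variables are illustrative and not taken from the driver's source.

```shell
# Hypothetical helper: print namespace/name of every VirtualMachineInstance
# whose spec.domain.firmware.uuid equals the given UUID.
find_vmi_by_firmware_uuid() {
  kubeconfig="$1"
  uuid="$2"
  oc --kubeconfig "$kubeconfig" get vmi --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.namespace}{"/"}{.metadata.name}{"\t"}{.spec.domain.firmware.uuid}{"\n"}{end}' \
    | awk -F '\t' -v uuid="$uuid" '$2 == uuid { print $1 }'
}

# Usage (kubeconfig path is a placeholder):
#   find_vmi_by_firmware_uuid "$KUBECONFIG_INFRA" 7ebd0842-17b7-5803-a01d-cacfc0fb7d9e
```

Empty output on MGMT and a match on INFRA would be consistent with the NotFound error above.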
      

      It looks like kubevirt-csi should either be deployed on the INFRA cluster rather than MGMT, or it is using the wrong API endpoint and reaching its own cluster instead of the external INFRA cluster.

      Version-Release number of selected component (if applicable):
      As above

      How reproducible:
      Always

      Steps to Reproduce:
      Deploy kubevirt-csi with external infra

      Actual results:
      Unable to use kubevirt-csi volumes on HOSTED cluster

              alitke@redhat.com Adam Litke
              openshift-crt-jira-prow OpenShift Prow Bot
              Liangquan Li Liangquan Li