OpenShift Virtualization / CNV-33620

[2241637] [4.14] DV status is 'ImportScheduled' when importer-prime pod FailedMount, VMI status is 'Scheduling'

    • Storage Core Sprint 243, Storage Core Sprint 244, Storage Core Sprint 245, Storage Core Sprint 246
    • Medium
    • No

      Description of problem:
      When a VM is created on a node that has no HPP (hostpath-provisioner) pod,
      the importer-prime pod reports FailedMount in its events, but
      the DV status stays 'ImportScheduled', which is misleading.
      As a result, the VM status is 'Provisioning' and the VMI status is 'Scheduling',
      even though there is a clear storage problem.
      The DV should report 'Pending' or another error status,
      and the VMI should be 'Pending'.

      Version-Release number of selected component (if applicable):
      4.14, 4.15

      How reproducible:
      Always

      Steps to Reproduce:
      1. Add nodeSelector to hostpath-provisioner CR:

      $ oc edit hostpathprovisioner hostpath-provisioner

      kind: HostPathProvisioner
      metadata:
        name: hostpath-provisioner
      spec:
        workload:
          nodeSelector:
            kubernetes.io/hostname: c01-jp415-57lmz-worker-0-24q58
      

      2. Wait until only one 'hostpath-provisioner-csi' pod is Running.

      3. Create a VM on another node:

      apiVersion: kubevirt.io/v1
      kind: VirtualMachine
      metadata:
        name: vm-cirros-source-hpp
      spec:
        dataVolumeTemplates:
        - metadata:
            name: cirros-dv-source-hpp
          spec:
            storage:
              resources:
                requests:
                  storage: 1Gi
              storageClassName: hostpath-csi-basic
            source:
              http:
                url: <cirros.qcow2>
        running: true
        template:
          spec:
            nodeSelector:
              kubernetes.io/hostname: c01-jp415-57lmz-worker-0-vrtp2
            domain:
              devices:
                disks:
                - disk:
                    bus: virtio
                  name: datavolumev-hpp
              machine:
                type: ""
              resources:
                requests:
                  memory: 100M
            terminationGracePeriodSeconds: 0
            volumes:
            - dataVolume:
                name: cirros-dv-source-hpp
              name: datavolumev-hpp
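
Steps 1 and 2 can also be scripted non-interactively. The following is a minimal sketch and not part of the original report: the function names are hypothetical, and the namespace shown in the usage comment (`openshift-cnv`) is an assumption that may differ per cluster.

```shell
#!/bin/sh
# Hypothetical helpers for reproduction steps 1 and 2; names are illustrative.

# Build the merge patch that pins HPP workloads to a single node ($1 = node name).
hpp_patch_json() {
  printf '{"spec":{"workload":{"nodeSelector":{"kubernetes.io/hostname":"%s"}}}}' "$1"
}

# Count Running hostpath-provisioner-csi pods in `oc get pods --no-headers`
# output read from stdin (columns: NAME READY STATUS RESTARTS AGE).
count_running_csi() {
  awk '$1 ~ /^hostpath-provisioner-csi/ && $3 == "Running" { n++ } END { print n + 0 }'
}

# Usage (requires a logged-in `oc`; the namespace is an assumption):
#   oc patch hostpathprovisioner hostpath-provisioner --type merge \
#     -p "$(hpp_patch_json c01-jp415-57lmz-worker-0-24q58)"
#   oc get pods -n "${HPP_NS:-openshift-cnv}" --no-headers | count_running_csi
```

Reading the pod listing from stdin keeps the counting logic independent of the cluster, so it can be checked against saved output.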
      

      Actual results:

      DV is ImportScheduled
      PVC is Pending
      PVC-prime is Pending
      VM is Provisioning
      VMI is Scheduling
      importer-prime pod has FailedMount Event
      virt-launcher pod doesn't have any Events

      $ oc get dv 
      NAME                   PHASE             PROGRESS   RESTARTS   AGE
      cirros-dv-source-hpp   ImportScheduled   N/A                   4m55s
      
      $ oc get pvc 
      NAME                                         STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS         AGE
      cirros-dv-source-hpp                         Pending                                      hostpath-csi-basic   5m2s
      prime-3148ae76-3873-42cf-91c7-5a86ac2324ea   Pending                                      hostpath-csi-basic   5m1s
      
      $ oc get vm 
      NAME                   AGE    STATUS         READY
      vm-cirros-source-hpp   5m7s   Provisioning   False
      
      $ oc get vmi 
      NAME                   AGE     PHASE        IP    NODENAME   READY
      vm-cirros-source-hpp   5m11s   Scheduling                    False
      
      $ oc get pods
      NAME                                                  READY   STATUS              RESTARTS   AGE
      importer-prime-3148ae76-3873-42cf-91c7-5a86ac2324ea   0/1     ContainerCreating   0          5m21s
      virt-launcher-vm-cirros-source-hpp-5tj99              0/1     Pending             0          5m21s
      
      $ oc describe pod importer-prime-3148ae76-3873-42cf-91c7-5a86ac2324ea | grep Events -A 10
      Events:
        Type     Reason       Age                   From     Message
        ----     ------       ----                  ----     -------
        Warning  FailedMount  41s (x26 over 5m56s)  kubelet  Unable to attach or mount volumes: unmounted volumes=[cdi-data-vol], unattached volumes=[], failed to process volumes=[cdi-data-vol]: error processing PVC default/prime-3148ae76-3873-42cf-91c7-5a86ac2324ea: PVC is not bound
      
      $ oc describe pod virt-launcher-vm-cirros-source-hpp-5tj99  | grep Events -A 10
      Events:          <none>
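
The mismatch shown above (DV in 'ImportScheduled' while the importer pod cannot mount its volume) could be detected mechanically. This is a hedged sketch only: the function names are hypothetical, and it assumes the event table layout printed by `oc describe pod` in this report.

```shell
#!/bin/sh
# Return 0 if `oc describe pod` output on stdin contains a Warning FailedMount event.
has_failed_mount() {
  awk '$1 == "Warning" && $2 == "FailedMount" { found = 1 } END { exit !found }'
}

# Return 0 for the misleading combination described in this report:
# DV phase is ImportScheduled while the importer pod reports FailedMount.
# $1 = DV phase, stdin = `oc describe pod <importer pod>` output.
dv_phase_is_misleading() {
  [ "$1" = "ImportScheduled" ] && has_failed_mount
}

# Usage (requires a logged-in `oc`):
#   phase=$(oc get dv cirros-dv-source-hpp -o jsonpath='{.status.phase}')
#   oc describe pod importer-prime-3148ae76-3873-42cf-91c7-5a86ac2324ea \
#     | dv_phase_is_misleading "$phase" && echo "DV phase hides a storage problem"
```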
      

      Expected results:

      DV is Pending or some other error state
      PVC is Pending (as it is)
      PVC-prime is Pending (as it is)
      VM is WaitingForVolumeBinding or some other state, indicating there's an issue
      VMI is Pending
      virt-launcher pod has an Event, indicating there's an issue
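
The expected states listed above can be expressed as a small check. This is an illustrative sketch with a hypothetical function name; it accepts only the states named explicitly in the list and does not model the "some other error state" alternatives.

```shell
#!/bin/sh
# Return 0 if the observed state matches the expected one from this report.
# $1 = resource kind (dv|pvc|vm|vmi), $2 = observed phase or status.
matches_expected() {
  case "$1:$2" in
    dv:Pending|pvc:Pending|vm:WaitingForVolumeBinding|vmi:Pending) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage (requires a logged-in `oc`):
#   matches_expected vmi "$(oc get vmi vm-cirros-source-hpp -o jsonpath='{.status.phase}')" \
#     || echo "VMI is not in the expected phase"
```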

      Behavior in 4.13, before populators were introduced:

      $ oc get dv 
      NAME      PHASE                  PROGRESS   RESTARTS   AGE
      dv-5717   WaitForFirstConsumer   N/A                   84m
      
      $ oc get pvc 
      NAME      STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS         AGE
      dv-5717   Pending                                      hostpath-csi-basic   84m
      
      $ oc get vm 
      NAME                         AGE   STATUS                    READY
      vm-5717-1696165923-4705606   84m   WaitingForVolumeBinding   False
      
      $ oc get vmi
      NAME                         AGE   PHASE     IP    NODENAME   READY
      vm-5717-1696165923-4705606   84m   Pending                    False
      
      $ oc get pods
      NAME                                             READY   STATUS    RESTARTS   AGE
      virt-launcher-vm-5717-1696165923-4705606-r8fsj   0/1     Pending   0          84m
      
      $ oc describe pods virt-launcher-vm-5717-1696165923-4705606-r8fsj | grep Events -A 10
      Events:
        Type     Reason            Age                  From               Message
        ----     ------            ----                 ----               -------
        Warning  FailedScheduling  5m21s (x8 over 76m)  default-scheduler  running PreBind plugin "VolumeBinding": binding volumes: timed out waiting for the condition
      

      Additional info:

              rh-ee-alromero Alvaro Romero
              jpeimer@redhat.com Jenia Peimer
              Jenia Peimer Jenia Peimer