- Bug
- Resolution: Done-Errata
- Major
- CNV v4.15.0
- None
- 1
- False
- False
- CNV v4.17.0.rhel9-60, CNV v4.15.3.rhel9-192
- Storage Core Sprint 255, Storage Core Sprint 257
- No
Description of problem:
A VM cannot be provisioned from a boot source image when it is scheduled to a node other than the one holding the image.
Version-Release number of selected component (if applicable):
4.16, 4.15 (LVMS on multi-node)
How reproducible:
Always
Steps to Reproduce:
1. Find which node the boot source image is on:
$ oc get pv | grep fedora
pvc-e604b04f-83dc-46c7-bb63-1633deb7ff59   30Gi   RWO   Delete   Bound   openshift-virtualization-os-images/fedora-722ac1d6b4f1   lvms-vg1   <unset>   112m

$ oc get pv pvc-e604b04f-83dc-46c7-bb63-1633deb7ff59 -ojson | jq .spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0]
{
  "key": "topology.topolvm.io/node",
  "operator": "In",
  "values": [
    "c01-jp416-lvms-la2-7x598-worker-0-9z56j"
  ]
}
2. Create a VM on a different node:
$ cat vm-clone-lvms-wrong-node.yaml
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: vm-fix-node
  namespace: openshift-virtualization-os-images
spec:
  dataVolumeTemplates:
  - metadata:
      name: dv-fix-node
      namespace: openshift-virtualization-os-images
    spec:
      storage:
        resources:
          requests:
            storage: 30Gi
        storageClassName: lvms-vg1
      source:
        pvc:
          namespace: openshift-virtualization-os-images
          name: fedora-722ac1d6b4f1
  running: true
  template:
    spec:
      nodeSelector:
        "kubernetes.io/hostname": "c01-jp416-lvms-la2-7x598-worker-0-78pt5"
      domain:
        devices:
          disks:
          - disk:
              bus: virtio
            name: datavolume
        machine:
          type: ""
        resources:
          requests:
            memory: 1Gi
      terminationGracePeriodSeconds: 0
      volumes:
      - dataVolume:
          name: dv-fix-node
        name: datavolume
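The manifest is then applied in the usual way (shown here for completeness, not part of the original steps):

$ oc apply -f vm-clone-lvms-wrong-node.yaml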
Actual results:
A VolumeSnapshot is created, the VM stays in the Provisioning state, the VMI is stuck in Scheduling, and the DV is never cloned.
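All of these states can be seen at a glance with a single query (a convenience command, not captured in the original report, assuming the default short names vm, vmi and dv):

$ oc get vm,vmi,dv,pvc -n openshift-virtualization-os-images

The individual outputs below show the details.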
$ oc get vm -A
NAMESPACE                            NAME          AGE   STATUS         READY
openshift-virtualization-os-images   vm-fix-node   10m   Provisioning   False

$ oc describe vm -n openshift-virtualization-os-images vm-fix-node
...
    Message:               Not all of the VMI's DVs are ready
    Reason:                NotAllDVsReady
    Status:                False
    Type:                  DataVolumesReady
    Last Probe Time:       <nil>
    Last Transition Time:  2024-04-21T14:27:08Z
    Message:               running PreBind plugin "VolumeBinding": binding volumes: context deadline exceeded
    Reason:                SchedulerError
    Status:                False
    Type:                  PodScheduled
$ oc describe dv -n openshift-virtualization-os-images dv-fix-node
Name:         dv-fix-node
Namespace:    openshift-virtualization-os-images
Labels:       instancetype.kubevirt.io/default-instancetype=u1.medium
              instancetype.kubevirt.io/default-preference=fedora
              kubevirt.io/created-by=f7f46074-d683-4cae-8466-25139e94aba7
Annotations:  cdi.kubevirt.io/allowClaimAdoption: true
              cdi.kubevirt.io/cloneType: snapshot
              cdi.kubevirt.io/storage.clone.token: eyJhbGciOiJQUzI1NiJ9.eyJleHAiOjE3MTM3MDkzMjgsImlhdCI6MTcxMzcwOTAyOCwiaXNzIjoiY2RpLWFwaXNlcnZlciIsIm5hbWUiOiJmZWRvcmEtNzIyYWMxZDZiNGYxIiwib...
              cdi.kubevirt.io/storage.usePopulator: true
API Version:  cdi.kubevirt.io/v1beta1
Kind:         DataVolume
Metadata:
  Creation Timestamp:  2024-04-21T14:17:08Z
  Generation:          1
  Owner References:
    API Version:           kubevirt.io/v1
    Block Owner Deletion:  true
    Controller:            true
    Kind:                  VirtualMachine
    Name:                  vm-fix-node
    UID:                   f7f46074-d683-4cae-8466-25139e94aba7
  Resource Version:        172803
  UID:                     e50b5c09-0c18-4570-9f59-369b99d0dec5
Spec:
  Source:
    Pvc:
      Name:       fedora-722ac1d6b4f1
      Namespace:  openshift-virtualization-os-images
  Storage:
    Resources:
      Requests:
        Storage:         30Gi
    Storage Class Name:  lvms-vg1
Status:
  Claim Name:  dv-fix-node
  Conditions:
    Last Heartbeat Time:   2024-04-21T14:17:08Z
    Last Transition Time:  2024-04-21T14:17:08Z
    Message:               PVC dv-fix-node Pending
    Reason:                Pending
    Status:                False
    Type:                  Bound
    Last Heartbeat Time:   2024-04-21T14:17:09Z
    Last Transition Time:  2024-04-21T14:17:08Z
    Reason:                TransferRunning
    Status:                False
    Type:                  Ready
    Last Heartbeat Time:   2024-04-21T14:17:08Z
    Last Transition Time:  2024-04-21T14:17:08Z
    Reason:                Populator is running
    Status:                True
    Type:                  Running
  Phase:                   PrepClaimInProgress
  Progress:                N/A
Events:
  Type    Reason                           Age   From                             Message
  ----    ------                           ----  ----                             -------
  Normal  Pending                          10m   datavolume-pvc-clone-controller  PVC dv-fix-node Pending
  Normal  CloneScheduled                   10m   datavolume-pvc-clone-controller  Cloning from openshift-virtualization-os-images/fedora-722ac1d6b4f1 into openshift-virtualization-os-images/dv-fix-node scheduled
  Normal  SnapshotForSmartCloneInProgress  10m   datavolume-pvc-clone-controller  Creating snapshot for smart-clone is in progress (for pvc openshift-virtualization-os-images/fedora-722ac1d6b4f1)
  Normal  PrepClaimInProgress              10m   datavolume-pvc-clone-controller  Prepping PersistentVolumeClaim for DataVolume openshift-virtualization-os-images/dv-fix-node
$ oc get pvc -n openshift-virtualization-os-images tmp-pvc-910e0ef5-61c2-4cf1-9b95-35d257b8f37d -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    cdi.kubevirt.io/allowClaimAdoption: "true"
    cdi.kubevirt.io/clonePhase: Snapshot
    cdi.kubevirt.io/cloneType: snapshot
    cdi.kubevirt.io/createdForDataVolume: e50b5c09-0c18-4570-9f59-369b99d0dec5
    cdi.kubevirt.io/storage.condition.running: "true"
    cdi.kubevirt.io/storage.condition.running.message: ""
    cdi.kubevirt.io/storage.condition.running.reason: Populator is running
    cdi.kubevirt.io/storage.contentType: kubevirt
    cdi.kubevirt.io/storage.pod.restarts: "0"
    cdi.kubevirt.io/storage.populator.kind: VolumeCloneSource
    cdi.kubevirt.io/storage.preallocation.requested: "false"
    cdi.kubevirt.io/storage.usePopulator: "true"
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: topolvm.io
    volume.kubernetes.io/selected-node: c01-jp416-lvms-la2-7x598-worker-0-78pt5
    volume.kubernetes.io/storage-provisioner: topolvm.io
  creationTimestamp: "2024-04-21T14:17:09Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: containerized-data-importer
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.16.0
    cdi.kubevirt.io/OwnedByUID: 910e0ef5-61c2-4cf1-9b95-35d257b8f37d
    instancetype.kubevirt.io/default-instancetype: u1.medium
    instancetype.kubevirt.io/default-preference: fedora
    kubevirt.io/created-by: f7f46074-d683-4cae-8466-25139e94aba7
  name: tmp-pvc-910e0ef5-61c2-4cf1-9b95-35d257b8f37d
  namespace: openshift-virtualization-os-images
  resourceVersion: "172835"
  uid: 6b4e4245-a6fe-4fdb-8ef6-95d0af0a5294
spec:
  accessModes:
  - ReadWriteOnce
  dataSource:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: tmp-snapshot-910e0ef5-61c2-4cf1-9b95-35d257b8f37d
  dataSourceRef:
    apiGroup: snapshot.storage.k8s.io
    kind: VolumeSnapshot
    name: tmp-snapshot-910e0ef5-61c2-4cf1-9b95-35d257b8f37d
  resources:
    requests:
      storage: 30Gi
  storageClassName: lvms-vg1
  volumeMode: Block
  volumeName: pvc-6b4e4245-a6fe-4fdb-8ef6-95d0af0a5294
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 30Gi
  phase: Bound
$ oc get pvc -n openshift-virtualization-os-images dv-fix-node -oyaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    cdi.kubevirt.io/allowClaimAdoption: "true"
    cdi.kubevirt.io/clonePhase: PrepClaim
    cdi.kubevirt.io/cloneType: snapshot
    cdi.kubevirt.io/createdForDataVolume: e50b5c09-0c18-4570-9f59-369b99d0dec5
    cdi.kubevirt.io/storage.condition.running: "true"
    cdi.kubevirt.io/storage.condition.running.message: ""
    cdi.kubevirt.io/storage.condition.running.reason: Populator is running
    cdi.kubevirt.io/storage.contentType: kubevirt
    cdi.kubevirt.io/storage.pod.restarts: "0"
    cdi.kubevirt.io/storage.preallocation.requested: "false"
    cdi.kubevirt.io/storage.usePopulator: "true"
    volume.beta.kubernetes.io/storage-provisioner: topolvm.io
    volume.kubernetes.io/selected-node: c01-jp416-lvms-la2-7x598-worker-0-78pt5
    volume.kubernetes.io/storage-provisioner: topolvm.io
  creationTimestamp: "2024-04-21T14:17:08Z"
  finalizers:
  - kubernetes.io/pvc-protection
  - cdi.kubevirt.io/clonePopulator
  labels:
    app: containerized-data-importer
    app.kubernetes.io/component: storage
    app.kubernetes.io/managed-by: cdi-controller
    app.kubernetes.io/part-of: hyperconverged-cluster
    app.kubernetes.io/version: 4.16.0
    instancetype.kubevirt.io/default-instancetype: u1.medium
    instancetype.kubevirt.io/default-preference: fedora
    kubevirt.io/created-by: f7f46074-d683-4cae-8466-25139e94aba7
  name: dv-fix-node
  namespace: openshift-virtualization-os-images
  ownerReferences:
  - apiVersion: cdi.kubevirt.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: DataVolume
    name: dv-fix-node
    uid: e50b5c09-0c18-4570-9f59-369b99d0dec5
  resourceVersion: "172802"
  uid: 910e0ef5-61c2-4cf1-9b95-35d257b8f37d
spec:
  accessModes:
  - ReadWriteOnce
  dataSource:
    apiGroup: cdi.kubevirt.io
    kind: VolumeCloneSource
    name: volume-clone-source-e50b5c09-0c18-4570-9f59-369b99d0dec5
  dataSourceRef:
    apiGroup: cdi.kubevirt.io
    kind: VolumeCloneSource
    name: volume-clone-source-e50b5c09-0c18-4570-9f59-369b99d0dec5
  resources:
    requests:
      storage: "32212254720"
  storageClassName: lvms-vg1
  volumeMode: Block
status:
  phase: Pending
$ oc get VolumeSnapshot -A
NAMESPACE                            NAME                                                READYTOUSE   SOURCEPVC             SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT                                    CREATIONTIME   AGE
openshift-virtualization-os-images   tmp-snapshot-910e0ef5-61c2-4cf1-9b95-35d257b8f37d   true         fedora-722ac1d6b4f1                           30Gi          lvms-vg1        snapcontent-b0c600b7-6bcd-43c2-910f-049f96fb698d   17m            17m
$ oc describe pods -n openshift-virtualization-os-images prep-910e0ef5-61c2-4cf1-9b95-35d257b8f37d | grep Events -A 10 Events: Type Reason Age From Message ---- ------ ---- ---- ------- Warning FailedMount 19m kubelet Unable to attach or mount volumes: unmounted volumes=[cdi-data-vol], unattached volumes=[], failed to process volumes=[cdi-data-vol]: error processing PVC openshift-virtualization-os-images/tmp-pvc-910e0ef5-61c2-4cf1-9b95-35d257b8f37d: PVC is not bound Warning FailedMount <invalid> (x11833 over 19m) kubelet MapVolume.NodeAffinity check failed for volume "pvc-6b4e4245-a6fe-4fdb-8ef6-95d0af0a5294" : no matching NodeSelectorTerms
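The FailedMount event points at the node affinity of the PV restored from the snapshot; it can be inspected with the same jq query used in step 1 (a suggested check, output not captured in this report):

$ oc get pv pvc-6b4e4245-a6fe-4fdb-8ef6-95d0af0a5294 -ojson | jq .spec.nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0]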
Expected results:
CDI should fall back to a host-assisted clone so the DV can be provisioned on the target node.
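A possible manual workaround (an assumption, not verified in this report) is to force host-assisted cloning for the lvms-vg1 storage class by overriding the clone strategy in its CDI StorageProfile:

$ oc patch storageprofile lvms-vg1 --type merge -p '{"spec": {"cloneStrategy": "copy"}}'

With cloneStrategy set to copy, CDI copies the image data over the network instead of restoring a CSI snapshot, so the target PVC is not pinned to the source node.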
Additional info:
- clones CNV-41081: [4.16] CNV with LVMS on multi-node: cross node cloning fails (Closed)
- is related to CNV-38485: CNV + LVM Storage on multi node clusters [validation] (Closed)
- links to RHEA-2024:136286: OpenShift Virtualization 4.15.4 Images