Bug
Resolution: Unresolved
Normal
OADP 1.3.0
Description of problem:
When restoring an application with more than one NFS PVC, only the 1st PodVolumeRestore (PVR) is updated with a status (Completed).
The remaining PVRs never receive any status, which eventually causes the restore to remain InProgress until it fails.
Note that both PVs and the PV data were actually restored, even though the 2nd PVR has no status.
Please go through the reproduction steps carefully.
Version-Release number of selected component (if applicable):
OADP 1.3.0
How reproducible:
100%
Steps to Reproduce:
1. Deploy external NFS with the following provisioner:
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: managed-nfs-storage
  uid: f3a53889-050c-4179-962d-bee510b01b74
  resourceVersion: '226271'
  creationTimestamp: '2023-11-21T12:30:09Z'
  annotations:
    storageclass.kubernetes.io/is-default-class: 'true'
provisioner: k8s-sigs.io/nfs-subdir-external-provisioner
parameters:
  onDelete: delete
  pathPattern: '${.PVC.namespace}/${.PVC.annotations.nfs.io/storage-path}'
reclaimPolicy: Delete
volumeBindingMode: Immediate
2. Set the StorageClass as the default StorageClass and deploy this busybox application:
$ oc new-project busybox-2pvc-on-nfs
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: busybox-pvc1
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: busybox-pvc2
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: busybox
spec:
  containers:
    - name: busybox
      image: busybox
      command: ["/bin/sh", "-ec", "while [ true ]; do sleep 3600; done"]
      volumeMounts:
        - name: busybox-volume1
          mountPath: /mnt/volume1
        - name: busybox-volume2
          mountPath: /mnt/volume2
  volumes:
    - name: busybox-volume1
      persistentVolumeClaim:
        claimName: busybox-pvc1
    - name: busybox-volume2
      persistentVolumeClaim:
        claimName: busybox-pvc2
3. Touch a file on both volumes:
$ oc exec busybox -- touch /mnt/volume1/test-a
Defaulted container "busybox" out of: busybox, restore-wait (init)
$ oc exec busybox -- touch /mnt/volume2/test-b
Defaulted container "busybox" out of: busybox, restore-wait (init)
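As a quick sanity check before the backup, the files can be confirmed on both mounted volumes (a hedged convenience step, not part of the original reproduction):

```shell
# List the test files created above on both volumes
oc exec busybox -- ls /mnt/volume1 /mnt/volume2
```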
4. Run a filesystem backup with kopia as the uploader type
DPA:
apiVersion: oadp.openshift.io/v1alpha1
kind: DataProtectionApplication
metadata:
  creationTimestamp: '2023-11-21T10:33:22Z'
  generation: 1
  ...
  name: ts-dpa
  namespace: openshift-adp
  resourceVersion: '96934'
  uid: 7e2991fb-9f4c-4321-8a51-3966cfe3cd2a
spec:
  backupLocations:
    - velero:
        config:
          resourceGroup: oadp-62200-2v2qg-rg
          storageAccount: oadpbucket247692
          storageAccountKeyEnvVar: AZURE_STORAGE_ACCOUNT_ACCESS_KEY
          subscriptionId: 53b8f551-f0fc-4bea-8cba-6d1fefd54c8a
        credential:
          key: cloud
          name: cloud-credentials
        default: true
        objectStorage:
          bucket: oadpbucket247692
          prefix: velero-e2e-64c4680c-8859-11ee-89b6-f6e178f429ce
        provider: azure
  configuration:
    nodeAgent:
      enable: true
      podConfig:
        resourceAllocations:
          requests:
            cpu: 100m
            memory: 64Mi
      uploaderType: kopia
    velero:
      defaultPlugins:
        - openshift
        - azure
        - kubevirt
        - csi
      podConfig:
        resourceAllocations:
          requests:
            cpu: 100m
            memory: 64Mi
  podDnsConfig: {}
  snapshotLocations: []
status:
  conditions:
    - lastTransitionTime: '2023-11-21T10:33:22Z'
      message: Reconcile complete
      reason: Complete
      status: 'True'
      type: Reconciled
Backup:
apiVersion: velero.io/v1
kind: Backup
metadata:
  ...
  name: 2pvconnfstest
  uid: 823135fd-932a-4bbb-88ca-0cec546c4042
  creationTimestamp: '2023-11-21T14:28:46Z'
  generation: 6
  ...
  namespace: openshift-adp
  labels:
    velero.io/storage-location: ts-dpa-1
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: true
  includedNamespaces:
    - busybox-2pvc-on-nfs
  itemOperationTimeout: 4h0m0s
  snapshotMoveData: false
  storageLocation: ts-dpa-1
  ttl: 720h0m0s
status:
  completionTimestamp: '2023-11-21T14:29:58Z'
  expiration: '2023-12-21T14:28:46Z'
  formatVersion: 1.1.0
  phase: Completed
  progress:
    itemsBackedUp: 41
    totalItems: 41
  startTimestamp: '2023-11-21T14:28:46Z'
  version: 1
5. Remove the application
$ oc delete ns/busybox-2pvc-on-nfs
6. Restore
apiVersion: velero.io/v1
kind: Restore
metadata:
  creationTimestamp: '2023-11-21T14:32:42Z'
  finalizers:
    - restores.velero.io/external-resources-finalizer
  generation: 3
  managedFields:
    - apiVersion: velero.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:spec':
          .: {}
          'f:backupName': {}
      manager: Mozilla
      operation: Update
      time: '2023-11-21T14:32:42Z'
    - apiVersion: velero.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:finalizers':
            .: {}
            'v:"restores.velero.io/external-resources-finalizer"': {}
        'f:spec':
          'f:excludedResources': {}
          'f:itemOperationTimeout': {}
        'f:status':
          .: {}
          'f:phase': {}
          'f:progress':
            .: {}
            'f:itemsRestored': {}
            'f:totalItems': {}
          'f:startTimestamp': {}
      manager: velero-server
      operation: Update
      time: '2023-11-21T14:33:50Z'
  name: 2pvconnfstest
  namespace: openshift-adp
  resourceVersion: '230628'
  uid: 6c21ed76-7db5-4198-a33b-82d07b479eb8
spec:
  backupName: 2pvconnfstest
  excludedResources:
    - nodes
    - events
    - events.events.k8s.io
    - backups.velero.io
    - restores.velero.io
    - resticrepositories.velero.io
    - csinodes.storage.k8s.io
    - volumeattachments.storage.k8s.io
    - backuprepositories.velero.io
  itemOperationTimeout: 4h0m0s
status:
  phase: InProgress
  progress:
    itemsRestored: 24
    totalItems: 24
  startTimestamp: '2023-11-21T14:33:48Z'
* As can be seen here, the restore is still InProgress.
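While the restore is stuck, the PVR phases can be inspected directly. A minimal sketch, assuming the restore name and namespace from this report (`custom-columns` prints `<none>` for PVRs whose phase was never set):

```shell
# Check the restore phase (stays InProgress on an affected cluster)
oc -n openshift-adp get restore 2pvconnfstest -o jsonpath='{.status.phase}'

# List the PVRs owned by this restore with their volume and phase;
# PVRs that never received a status show PHASE as <none>
oc -n openshift-adp get podvolumerestores \
  -l velero.io/restore-name=2pvconnfstest \
  -o custom-columns=NAME:.metadata.name,VOLUME:.spec.volume,PHASE:.status.phase
```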
7. Wait for the 1st PVR to complete:
apiVersion: velero.io/v1
kind: PodVolumeRestore
metadata:
  generateName: 2pvconnfstest-
  resourceVersion: '230699'
  name: 2pvconnfstest-8d9sz
  uid: 6507ef99-5f2f-4f9f-ba2f-09a64d4f1591
  creationTimestamp: '2023-11-21T14:33:50Z'
  generation: 3
  managedFields:
    - apiVersion: velero.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:metadata':
          'f:generateName': {}
          'f:labels':
            .: {}
            'f:velero.io/pod-uid': {}
            'f:velero.io/pvc-uid': {}
            'f:velero.io/restore-name': {}
            'f:velero.io/restore-uid': {}
          'f:ownerReferences':
            .: {}
            'k:{"uid":"6c21ed76-7db5-4198-a33b-82d07b479eb8"}': {}
        'f:spec':
          .: {}
          'f:backupStorageLocation': {}
          'f:pod': {}
          'f:repoIdentifier': {}
          'f:snapshotID': {}
          'f:sourceNamespace': {}
          'f:uploaderType': {}
          'f:volume': {}
        'f:status':
          .: {}
          'f:progress': {}
      manager: velero-server
      operation: Update
      time: '2023-11-21T14:33:50Z'
    - apiVersion: velero.io/v1
      fieldsType: FieldsV1
      fieldsV1:
        'f:status':
          'f:completionTimestamp': {}
          'f:phase': {}
          'f:startTimestamp': {}
      manager: node-agent-server
      operation: Update
      time: '2023-11-21T14:33:58Z'
  namespace: openshift-adp
  ownerReferences:
    - apiVersion: velero.io/v1
      controller: true
      kind: Restore
      name: 2pvconnfstest
      uid: 6c21ed76-7db5-4198-a33b-82d07b479eb8
  labels:
    velero.io/pod-uid: e4580b3d-26bf-4593-988e-01e7f952af3a
    velero.io/pvc-uid: d384defc-dc5b-4724-b9b4-b9a1c2f6f28f
    velero.io/restore-name: 2pvconnfstest
    velero.io/restore-uid: 6c21ed76-7db5-4198-a33b-82d07b479eb8
spec:
  backupStorageLocation: ts-dpa-1
  pod:
    kind: Pod
    name: busybox
    namespace: 2pvconnfstest
    uid: e4580b3d-26bf-4593-988e-01e7f952af3a
  repoIdentifier: >-
    azure:oadpbucket247692:/velero-e2e-64c4680c-8859-11ee-89b6-f6e178f429ce/restic/2pvconnfstest
  snapshotID: 16d062ae6323fcecf89230f49ea12ee0
  sourceNamespace: 2pvconnfstest
  uploaderType: kopia
  volume: busybox-volume2
status:
  completionTimestamp: '2023-11-21T14:33:58Z'
  phase: Completed
  progress: {}
  startTimestamp: '2023-11-21T14:33:56Z'
8. Validate the data on the PVs - Note the data **was restored** in both PVs
$ oc exec busybox -- stat /mnt/volume1/test-a
Defaulted container "busybox" out of: busybox, restore-wait (init)
File: /mnt/volume1/test-a
Size: 0 Blocks: 0 IO Block: 1048576 regular empty file
Device: 1000d0h/1048784d Inode: 859508 Links: 1
Access: (0644/-rw-r--r--) Uid: (1000780000/1000780000) Gid: ( 0/ root)
Access: 2023-11-21 14:51:21.436942041 +0000
Modify: 2023-11-21 14:51:21.436942041 +0000
Change: 2023-11-21 14:51:21.436942041 +0000
Actual results:
The 2nd PVR remains without any status:
apiVersion: velero.io/v1
kind: PodVolumeRestore
metadata:
  generateName: 2pvconnfstest-
  resourceVersion: '230636'
  name: 2pvconnfstest-t9sbw
  uid: e3bc882e-2f35-4830-81de-538895814925
  creationTimestamp: '2023-11-21T14:33:50Z'
  generation: 1
  ...
  ownerReferences:
    - apiVersion: velero.io/v1
      controller: true
      kind: Restore
      name: 2pvconnfstest
      uid: 6c21ed76-7db5-4198-a33b-82d07b479eb8
  labels:
    velero.io/pod-uid: e4580b3d-26bf-4593-988e-01e7f952af3a
    velero.io/pvc-uid: e00bbcfd-845c-4aa5-8233-3e7e015c1fa6
    velero.io/restore-name: 2pvconnfstest
    velero.io/restore-uid: 6c21ed76-7db5-4198-a33b-82d07b479eb8
spec:
  backupStorageLocation: ts-dpa-1
  pod:
    kind: Pod
    name: busybox
    namespace: 2pvconnfstest
    uid: e4580b3d-26bf-4593-988e-01e7f952af3a
  repoIdentifier: >-
    azure:oadpbucket247692:/velero-e2e-64c4680c-8859-11ee-89b6-f6e178f429ce/restic/2pvconnfstest
  snapshotID: 1238f75cb4200312b7e545d716ba56b9
  sourceNamespace: 2pvconnfstest
  uploaderType: kopia
  volume: busybox-volume1
status:
  progress: {}
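Stuck PVRs can also be picked out offline from an export taken with something like `oc -n openshift-adp get podvolumerestores -o yaml > pvrs.yaml`. A minimal sketch (real YAML is better handled with a proper parser such as yq; the sample below mimics the two PVRs shown above, one Completed and one without status):

```shell
# Build a tiny sample export, then print the name of every document
# that never received a "phase:" field, i.e. the stuck PVRs.
cat > /tmp/pvrs-sample.yaml <<'EOF'
---
metadata:
  name: 2pvconnfstest-8d9sz
status:
  phase: Completed
---
metadata:
  name: 2pvconnfstest-t9sbw
status:
  progress: {}
EOF
awk '/^---/{if(n&&!p)print name; n=1;p=0} /name:/{name=$2} /phase:/{p=1} END{if(n&&!p)print name}' /tmp/pvrs-sample.yaml
```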
Expected results:
All PVRs should be updated with the proper status and the restore should complete successfully.
Additional info:
Node agent logs: the only activity recorded is for the 1st PVR, nothing after that.
adp/nfs-sql volume=mysql-data1 time="2023-11-21T14:13:06Z" level=info msg="Founding existing repo" backupLocation=ts-dpa-1 logSource="/remote-source/velero/app/pkg/repository/ensurer.go:86" repositoryType=kopia volumeNamespace=nfs-sql
time="2023-11-21T14:13:07Z" level=warning msg="active indexes [xn0_08cfddd7ec4c011b0f75697d7aba07a0-sa3db3ee5395a61df122-c1 xn0_363df94abc57574078e268d17f842400-s7f0fe5a47db017dd122-c1 xn0_72f43384ee4cbb6a04eb5a7ede99b3e2-s4abb5bdc13e6e275122-c1 xn0_9585bd9d2b89f6b4b7fe1a83e757fe5a-s65e1cd915b3daf0d122-c1 xn0_e83e9b8845574cfde74f0c2cbbc491fe-s69cac49b849a8d33122-c1] deletion watermark 0001-01-01 00:00:00 +0000 UTC" PodVolumeRestore=openshift-adp/nfs-sql-d2x4g controller=PodVolumeRestore logModule=kopia/kopia/format logSource="/remote-source/velero/app/pkg/kopia/kopia_log.go:101" pod=nfs-sql/mysql-68d84d7c89-gmh88 restore=openshift-adp/nfs-sql sublevel=error
time="2023-11-21T14:13:07Z" level=info msg="Opening backup repo" PodVolumeRestore=openshift-adp/nfs-sql-d2x4g controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/uploader/provider/kopia.go:78" pod=nfs-sql/mysql-68d84d7c89-gmh88 repoUID=9547a4aa-8482-4bf5-ad2a-26b3c98c4ce5 restore=openshift-adp/nfs-sql
time="2023-11-21T14:13:07Z" level=warning msg="active indexes [xn0_08cfddd7ec4c011b0f75697d7aba07a0-sa3db3ee5395a61df122-c1 xn0_363df94abc57574078e268d17f842400-s7f0fe5a47db017dd122-c1 xn0_72f43384ee4cbb6a04eb5a7ede99b3e2-s4abb5bdc13e6e275122-c1 xn0_9585bd9d2b89f6b4b7fe1a83e757fe5a-s65e1cd915b3daf0d122-c1 xn0_e83e9b8845574cfde74f0c2cbbc491fe-s69cac49b849a8d33122-c1] deletion watermark 0001-01-01 00:00:00 +0000 UTC" PodVolumeRestore=openshift-adp/nfs-sql-d2x4g controller=PodVolumeRestore logModule=kopia/kopia/format logSource="/remote-source/velero/app/pkg/kopia/kopia_log.go:101" pod=nfs-sql/mysql-68d84d7c89-gmh88 restore=openshift-adp/nfs-sql sublevel=error
time="2023-11-21T14:13:07Z" level=info msg="FileSystemBR is initialized" PodVolumeRestore=openshift-adp/nfs-sql-d2x4g bsl=ts-dpa-1 controller=PodVolumeRestore jobName=nfs-sql-d2x4g logSource="/remote-source/velero/app/pkg/datapath/file_system.go:110" pod=nfs-sql/mysql-68d84d7c89-gmh88 repository=kopia restore=openshift-adp/nfs-sql source namespace=nfs-sql uploader=kopia
time="2023-11-21T14:13:07Z" level=info msg="Async fs restore data path started" PodVolumeRestore=openshift-adp/nfs-sql-d2x4g controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/controller/pod_volume_restore_controller.go:154" path="/host_pods/f9e882ba-1064-457d-b71a-466637d31561/volumes/kubernetes.io~nfs/pvc-66fefb47-811d-49cf-a9f5-11d205c99c08" pod=nfs-sql/mysql-68d84d7c89-gmh88 restore=openshift-adp/nfs-sql
time="2023-11-21T14:13:07Z" level=info msg="Starting restore" PodVolumeRestore=openshift-adp/nfs-sql-d2x4g controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/uploader/provider/kopia.go:219" pod=nfs-sql/mysql-68d84d7c89-gmh88 restore=openshift-adp/nfs-sql snapshotID=0aeb9080fbe7e3b6b5de76f795179d1b volumePath="/host_pods/f9e882ba-1064-457d-b71a-466637d31561/volumes/kubernetes.io~nfs/pvc-66fefb47-811d-49cf-a9f5-11d205c99c08"
time="2023-11-21T14:13:07Z" level=info msg="Restore starting" PodVolumeRestore=openshift-adp/nfs-sql-4pljn controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/controller/pod_volume_restore_controller.go:113" pod=nfs-sql/mysql-68d84d7c89-gmh88 restore=openshift-adp/nfs-sql
time="2023-11-21T14:13:07Z" level=info msg="Start to restore..." PodVolumeRestore=openshift-adp/nfs-sql-d2x4g controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/uploader/kopia/snapshot.go:359" pod=nfs-sql/mysql-68d84d7c89-gmh88 restore=openshift-adp/nfs-sql snapshotID=0aeb9080fbe7e3b6b5de76f795179d1b volumePath="/host_pods/f9e882ba-1064-457d-b71a-466637d31561/volumes/kubernetes.io~nfs/pvc-66fefb47-811d-49cf-a9f5-11d205c99c08"
time="2023-11-21T14:13:07Z" level=info msg="Restore from snapshot 0aeb9080fbe7e3b6b5de76f795179d1b, description Kopia Uploader, created time 2023-11-21 14:07:16.903349898 +0000 UTC, tags map[backup:nfs-sql backup-uid:79a0ec27-9761-4ebd-9eca-893cf6fe9fa0 ns:nfs-sql pod:mysql-68d84d7c89-gmh88 pod-uid:980365d1-3ed9-4e78-8993-9c47b9dd81d2 pvc-uid:a840f233-2662-4384-abe3-0bd614cef0c1 snapshot-requester:pod-volume-backup-restore snapshot-uploader:kopia volume:mysql-data1]" PodVolumeRestore=openshift-adp/nfs-sql-d2x4g controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/uploader/kopia/snapshot.go:368" pod=nfs-sql/mysql-68d84d7c89-gmh88 restore=openshift-adp/nfs-sql snapshotID=0aeb9080fbe7e3b6b5de76f795179d1b volumePath="/host_pods/f9e882ba-1064-457d-b71a-466637d31561/volumes/kubernetes.io~nfs/pvc-66fefb47-811d-49cf-a9f5-11d205c99c08"
time="2023-11-21T14:13:23Z" level=info msg="Kopia restore finished, restore size 107855068, file count 107" PodVolumeRestore=openshift-adp/nfs-sql-d2x4g controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/uploader/provider/kopia.go:248" pod=nfs-sql/mysql-68d84d7c89-gmh88 restore=openshift-adp/nfs-sql snapshotID=0aeb9080fbe7e3b6b5de76f795179d1b volumePath="/host_pods/f9e882ba-1064-457d-b71a-466637d31561/volumes/kubernetes.io~nfs/pvc-66fefb47-811d-49cf-a9f5-11d205c99c08"
time="2023-11-21T14:13:23Z" level=info msg="Async fs restore data path completed" PVR=nfs-sql-d2x4g controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/controller/pod_volume_restore_controller.go:266" pvr=nfs-sql-d2x4g
time="2023-11-21T14:13:23Z" level=info msg="Action finished" PodVolumeRestore=openshift-adp/nfs-sql-d2x4g controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/uploader/provider/kopia.go:91" pod=nfs-sql/mysql-68d84d7c89-gmh88 restore=openshift-adp/nfs-sql
time="2023-11-21T14:13:23Z" level=info msg="Restore completed" controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/controller/pod_volume_restore_controller.go:318" pvr=nfs-sql-d2x4g
time="2023-11-21T14:13:23Z" level=info msg="FileSystemBR is closed" PodVolumeRestore=openshift-adp/nfs-sql-d2x4g controller=PodVolumeRestore logSource="/remote-source/velero/app/pkg/datapath/file_system.go:129" pod=nfs-sql/mysql-68d84d7c89-gmh88 restore=openshift-adp/nfs-sql user=nfs-sql-d2x4g
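A quick way to see which PVRs ever produced node-agent activity is to grep the captured log for PVR identifiers; any PVR missing from the output matches the "no status" symptom. A minimal sketch using one sample line from the excerpt above (a real run would feed the full captured log, e.g. from `oc -n openshift-adp logs`):

```shell
# Build a one-line sample log from the excerpt above, then list unique PVR names seen
cat > /tmp/node-agent-sample.log <<'EOF'
time="2023-11-21T14:13:23Z" level=info msg="Restore completed" controller=PodVolumeRestore logSource="pod_volume_restore_controller.go:318" pvr=nfs-sql-d2x4g
EOF
grep -o 'pvr=[^ ]*' /tmp/node-agent-sample.log | sort -u
```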