-
Bug
-
Resolution: Done
-
Blocker
-
OADP 1.1.1
-
False
-
-
False
-
oadp-volume-snapshot-mover-container-1.1.1-22
-
ToDo
-
0
-
0
-
Very Likely
-
0
-
None
-
Unset
-
Unknown
-
Proposed
-
Yes
Description of problem: Starting with 1.1.1 (it is unclear in which build this was first introduced), restores intermittently fail with a "ReplicationDestination.volsync.backube xxxx not found" error, where xxxx is the name of the ReplicationDestination CR, even though the ReplicationDestination appears to be created eventually.
The CSI VolumeSnapshot also appears to fail with the following error:
'Failed to check and update snapshot content: failed to list snapshot
for content velero-velero-mysql-tz5f4-4hq58: "rpc error: code = Internal desc
= Could not list snapshots: InvalidParameterValue: Value ( 0 ) for parameter
maxResults is invalid. Expecting a value greater than 5.\n\tstatus code: 400,
request id: 655bbfa8-c7e0-4311-900b-092f2e617f8d"'
Please note that the restore can also pass for the same application.
Version-Release number of selected component (if applicable):
1.1.1, build oadp-operator-bundle-container-1.1.1-21
How reproducible: Frequently; the exact failure rate is unknown.
Steps to Reproduce:
1. Create a backup of a stateful application, using Data Mover for the PV backup
2. Make sure the backup completes successfully, with no errors on the VolumeSnapshots or on the VSB (the VSB is in the "Completed" phase)
3. Delete the application namespace
4. Once the namespace is removed, create a restore of the backup
Actual results:
Restore may fail with the following errors:
VSR:
[mperetz@fedora oadp-e2e-qe]$ oc get vsr -A -o yaml
apiVersion: v1
items:
- apiVersion: datamover.oadp.openshift.io/v1alpha1
  kind: VolumeSnapshotRestore
  metadata:
    creationTimestamp: "2022-10-11T10:29:27Z"
    generateName: vsr-
    generation: 1
    labels:
      velero.io/persistent-volume-claim-name: mysql
      velero.io/restore-name: mysql-ad93ad8a-494e-11ed-b0c4-902e163f806c
    name: vsr-mjp78
    namespace: mysql-persistent
    resourceVersion: "65364"
    uid: 47455522-fc52-44d1-a14c-f35205a7389e
  spec:
    protectedNamespace: openshift-adp
    resticSecretRef:
      name: ts-dpa-1-volsync-restic
    volumeSnapshotMoverBackupRef:
      resticrepository: s3:s3.amazonaws.com/oadpbucket145568/openshift-adp/snapcontent-4131629e-a252-44ba-8ccf-99553ea06a7d-pvc
      sourcePVCData:
        name: mysql
        size: 2Gi
        storageClassName: gp2-csi
      volumeSnapshotClassName: example-snapclass
  status:
    conditions:
    - lastTransitionTime: "2022-10-11T10:29:27Z"
      message: ReplicationDestination.volsync.backube "vsr-mjp78-rep-dest" not found
      reason: Error
      status: "False"
      type: Reconciled
    phase: Failed
kind: List
metadata:
  resourceVersion: ""
VolumeSnapshot:
[mperetz@fedora oadp-e2e-qe]$ oc get volumesnapshot -A -o yaml
apiVersion: v1
items:
- apiVersion: snapshot.storage.k8s.io/v1
  kind: VolumeSnapshot
  metadata:
    annotations:
      velero.io/csi-driver-name: ebs.csi.aws.com
      velero.io/csi-volumesnapshot-handle: snap-0e464beebb6c4c180
      velero.io/csi-vsc-deletion-policy: Retain
      velero.io/vsi-volumesnapshot-restore-size: 2Gi
    creationTimestamp: "2022-10-11T10:29:35Z"
    finalizers:
    - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
    generation: 1
    labels:
      velero.io/backup-name: mysql-ad93ad8a-494e-11ed-b0c4-902e163f806c
      velero.io/restore-name: mysql-ad93ad8a-494e-11ed-b0c4-902e163f806c
    name: velero-mysql-cwwp5
    namespace: mysql-persistent
    resourceVersion: "67546"
    uid: 0276f2a6-4d17-4dee-a0c6-1fa8f9b33d76
  spec:
    source:
      volumeSnapshotContentName: velero-velero-mysql-cwwp5-j4qdt
    volumeSnapshotClassName: example-snapclass
  status:
    boundVolumeSnapshotContentName: velero-velero-mysql-cwwp5-j4qdt
    error:
      message: 'Failed to check and update snapshot content: failed to list snapshot for content velero-velero-mysql-cwwp5-j4qdt: "rpc error: code = Internal desc = Could not list snapshots: InvalidParameterValue: Value ( 0 ) for parameter maxResults is invalid. Expecting a value greater than 5.\n\tstatus code: 400, request id: f18b6673-526e-49db-bff3-5635de93d5c7"'
      time: "2022-10-11T10:31:47Z"
    readyToUse: false
- apiVersion: snapshot.storage.k8s.io/v1
  kind: VolumeSnapshot
  metadata:
    creationTimestamp: "2022-10-11T10:29:43Z"
    finalizers:
    - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
    generation: 1
    labels:
      app.kubernetes.io/created-by: volsync
    name: volsync-vsr-mjp78-rep-dest-dest-20221011102943
    namespace: openshift-adp
    ownerReferences:
    - apiVersion: volsync.backube/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: ReplicationDestination
      name: vsr-mjp78-rep-dest
      uid: 466d66c3-b544-4460-bca8-9f383f80352d
    resourceVersion: "66319"
    uid: 236a5c9a-f9a3-4a9e-a83b-a2e7ed279cb3
  spec:
    source:
      persistentVolumeClaimName: volsync-vsr-mjp78-rep-dest-dest
    volumeSnapshotClassName: example-snapclass
  status:
    boundVolumeSnapshotContentName: snapcontent-236a5c9a-f9a3-4a9e-a83b-a2e7ed279cb3
    creationTime: "2022-10-11T10:29:45Z"
    readyToUse: true
    restoreSize: 2Gi
kind: List
metadata:
  resourceVersion: ""
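Because the failure is intermittent, it may help to scan the VSR list programmatically after each restore attempt rather than reading the YAML by eye. The following is a minimal sketch, not part of OADP: the helper names failed_vsrs and fetch_vsrs are hypothetical, and it assumes the JSON form of the output above (oc get vsr -A -o json) has the same status.phase and status.conditions fields shown in this report.

```python
import json
import subprocess


def failed_vsrs(vsr_list):
    """Return (namespace, name, message) for every VSR whose phase is Failed.

    vsr_list is the parsed JSON of a Kubernetes List object.
    (Hypothetical helper, written for this report only.)
    """
    failures = []
    for item in vsr_list.get("items", []):
        status = item.get("status", {})
        if status.get("phase") != "Failed":
            continue
        # Use the message of the first condition whose status is not "True".
        message = next(
            (
                cond.get("message", "")
                for cond in status.get("conditions", [])
                if cond.get("status") != "True"
            ),
            "",
        )
        meta = item.get("metadata", {})
        failures.append((meta.get("namespace"), meta.get("name"), message))
    return failures


def fetch_vsrs():
    """Run `oc get vsr -A -o json` and return the parsed list.

    Assumes `oc` is on PATH and already logged in to the cluster.
    """
    result = subprocess.run(
        ["oc", "get", "vsr", "-A", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling failed_vsrs(fetch_vsrs()) against the cluster state captured above would be expected to report vsr-mjp78 in mysql-persistent together with its "not found" message; an empty result means no VSR is in the Failed phase.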
Expected results: Restore should pass
Additional info:
- blocks
-
OADP-905 [RedHat QE] Verify Bug OADP-611 - Data mover VSR resources are sometimes created multiple times with multiple PVCs
- Release Pending
-
OADP-906 [IBM QE-P] Verify Bug OADP-611 - Data mover VSR resources are sometimes created multiple times with multiple PVCs
- Release Pending
-
OADP-907 [IBM QE-Z] Verify Bug OADP-611 - Data mover VSR resources are sometimes created multiple times with multiple PVCs
- Release Pending
-
OADP-611 Data mover VSR resources are sometimes created multiple times with multiple PVCs
- Closed
- mentioned on