Bug
Resolution: Done
Blocker
OADP 1.1.0
oadp-operator-bundle-container-1.1.0-49
Description of problem: The volsync-dst-vsr pod completes although not all items were restored in the namespace.
[mperetz@mperetz oadp-e2e-qe]$ oc get pods -n openshift-adp
NAME                                               READY   STATUS      RESTARTS   AGE
openshift-adp-controller-manager-c554d65f9-lhgj8   1/1     Running     0          87m
restic-248ss                                       1/1     Running     0          41m
restic-q9v5v                                       1/1     Running     0          41m
restic-vkvbt                                       1/1     Running     0          41m
velero-5c6fcff564-49fph                            1/1     Running     0          26m
volsync-dst-vsr-mysql-rep-dest-c68l7               0/1     Completed   0          11m
volume-snapshot-mover-64cdcf4b97-djl25             1/1     Running     0          41m
logs:
[mperetz@mperetz oadp-e2e-qe]$ oc logs volsync-dst-vsr-mysql-rep-dest-c68l7 -n openshift-adp
Starting container
VolSync restic container version: ACM-0.4.1-e6dde1b
restore
Testing mandatory env variables
=== Starting restore ===
/data /
Selected restic snapshot with id: a150ca03
restoring <Snapshot a150ca03 of [/data] at 2022-07-29 15:05:20.610304966 +0000 UTC by root@volsync> to .
/
=== Done ===
Eventually the restore times out with the following errors:
[mperetz@mperetz oadp-e2e-qe]$ velero restore logs mysql-87c9b19f-0f48-11ed-946b-902e163f806c -n openshift-adp | grep error
time="2022-07-29T14:32:16Z" level=error msg="Timed out awaiting reconciliation of volumesnapshotrestore vsr-mysql" cmd=/plugins/velero-plugin-for-csi logSource="/remote-source/app/internal/util/util.go:392" pluginName=velero-plugin-for-csi restore=openshift-adp/mysql-87c9b19f-0f48-11ed-946b-902e163f806c
time="2022-07-29T14:32:19Z" level=error msg="Namespace mysql-persistent, resource restore error: error preparing volumesnapshotbackups.datamover.oadp.openshift.io/mysql-persistent/vsb-velero-mysql-l6qtf: rpc error: code = Unknown desc = timed out waiting for the condition" logSource="pkg/controller/restore_controller.go:504" restore=openshift-adp/mysql-87c9b19f-0f48-11ed-946b-902e163f806c
Version-Release number of selected component (if applicable):
downstream build 1.1.0-45
currently checked only on OCP 4.11
How reproducible:
Steps to Reproduce:
1. Install Volsync from latest stable channel
cat <<EOF | oc create -f -
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  generateName: oadp-
  name: oadp-operator
  namespace: openshift-operators
spec:
  channel: stable
  installPlanApproval: Automatic
  name: volsync-product
  source: prestage-operators
  sourceNamespace: openshift-marketplace
EOF
2. Create VSC:
apiVersion: snapshot.storage.k8s.io/v1
deletionPolicy: Retain
driver: ebs.csi.aws.com
kind: VolumeSnapshotClass
metadata:
  annotations:
    snapshot.storage.kubernetes.io/is-default-class: "true"
  labels:
    velero.io/csi-volumesnapshot-class: "true"
  name: example-snapclass
2. Create the restic secret with the default name dm-credential:
apiVersion: v1
data:
  AWS_ACCESS_KEY_ID: XXXXXXX
  AWS_SECRET_ACCESS_KEY: XXXXXXXXXXXXXXX
  RESTIC_PASSWORD: my-secure-restic-password
  RESTIC_REPOSITORY: s3:s3.amazonaws.com/oadpbucket119606
kind: Secret
metadata:
  name: dm-credential
  namespace: openshift-adp
type: Opaque
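Note for anyone reproducing this step: the values shown under data look like readable placeholders, while Kubernetes expects every value under data to be base64-encoded (plain values belong under stringData instead). The following is only a sketch of how the encoded values could be produced; the helper name encode_secret_value is hypothetical and not part of OADP or VolSync.

```shell
#!/bin/sh
# Hypothetical helper: base64-encode one value for the `data:` map of a Secret.
# printf avoids the trailing newline that echo would add to the encoded value;
# tr removes any line wrapping that base64 may insert for long inputs.
encode_secret_value() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Example with the placeholder values from the secret above.
printf 'RESTIC_PASSWORD: %s\n' "$(encode_secret_value 'my-secure-restic-password')"
printf 'RESTIC_REPOSITORY: %s\n' "$(encode_secret_value 's3:s3.amazonaws.com/oadpbucket119606')"
```

The printed lines can be pasted under data in the manifest. Alternatively, keeping the readable values and renaming the map to stringData lets the API server do the encoding.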
3. Set DPA CR:
apiVersion: v1
items:
- apiVersion: oadp.openshift.io/v1alpha1
  kind: DataProtectionApplication
  metadata:
    creationTimestamp: "2022-07-29T14:42:53Z"
    generation: 1
    name: ts-dpa
    namespace: openshift-adp
    resourceVersion: "110749"
    uid: deadc7d8-261e-4ea3-b8e5-86330e005ee3
  spec:
    backupLocations:
    - velero:
        config:
          region: us-east-2
        credential:
          key: cloud
          name: cloud-credentials
        default: true
        objectStorage:
          bucket: oadpbucket125675
          prefix: velero-e2e-b4587c4f-0f4c-11ed-970c-902e163f806c
        provider: aws
    configuration:
      restic:
        enable: true
        podConfig:
          resourceAllocations: {}
      velero:
        defaultPlugins:
        - openshift
        - aws
        - kubevirt
        - csi
    features:
      dataMover:
        enable: true
    podDnsConfig: {}
    snapshotLocations: []
  status:
    conditions:
    - lastTransitionTime: "2022-07-29T14:42:53Z"
      message: Reconcile complete
      reason: Complete
      status: "True"
      type: Reconciled
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
4. Create backup of an application (mysql in my case). Make sure the status of the VSB and backup is Completed.
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/source-cluster-k8s-gitversion: v1.24.0+9546431
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "24"
  creationTimestamp: "2022-07-29T14:56:53Z"
  generation: 5
  labels:
    velero.io/storage-location: ts-dpa-1
  name: backup11
  namespace: openshift-adp
  resourceVersion: "168532"
  uid: b1a7923c-3806-4c11-a765-6e588bc5042c
spec:
  defaultVolumesToRestic: false
  hooks: {}
  includedNamespaces:
  - mysql-persistent
  metadata: {}
  storageLocation: ts-dpa-1
  ttl: 720h0m0s
status:
  completionTimestamp: "2022-07-29T15:06:10Z"
  csiVolumeSnapshotsAttempted: 1
  csiVolumeSnapshotsCompleted: 1
  expiration: "2022-08-28T14:58:49Z"
  formatVersion: 1.1.0
  phase: Completed
  progress:
    itemsBackedUp: 54
    totalItems: 54
  startTimestamp: "2022-07-29T14:58:53Z"
  version: 1
5. Delete the app namespace
6. Create restore:
apiVersion: velero.io/v1
kind: Restore
metadata:
  name: backup11-20220729151343
  namespace: openshift-adp
spec:
  backupName: backup11
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  hooks: {}
  includedNamespaces:
  - '*'
  restorePVs: true
Actual results:
volumesnapshotcontent fails with this error:
[mperetz@mperetz oadp-e2e-qe]$ oc get volumesnapshotcontent velero-velero-mysql-5q4v6-rkp9h -o yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  creationTimestamp: "2022-07-29T15:24:36Z"
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshotcontent-bound-protection
  generateName: velero-velero-mysql-5q4v6-
  generation: 2
  labels:
    velero.io/restore-name: backup11-20220729151343
  name: velero-velero-mysql-5q4v6-rkp9h
  resourceVersion: "240517"
  uid: 1edb74f3-c3ce-4e6c-8719-639f082cbe7f
spec:
  deletionPolicy: Retain
  driver: ebs.csi.aws.com
  source:
    snapshotHandle: ""
  volumeSnapshotClassName: example-snapclass
  volumeSnapshotRef:
    kind: VolumeSnapshot
    name: velero-mysql-5q4v6
    namespace: mysql-persistent
    uid: 4d91f144-44f2-456e-a764-aeb2e12fc36f
status:
  error:
    message: 'Failed to check and update snapshot content: failed to list snapshot for content velero-velero-mysql-5q4v6-rkp9h: "rpc error: code = Internal desc = Could not list snapshots: InvalidParameterValue: Value ( 0 ) for parameter maxResults is invalid. Expecting a value greater than 5.\n\tstatus code: 400, request id: 6c742d66-ba21-4524-a920-35d7068b85b7"'
    time: "2022-07-29T15:36:19Z"
  readyToUse: false
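The object above has spec.source.snapshotHandle set to an empty string. Whether that is what triggers the maxResults error is not confirmed in this report, but it is a cheap thing to check across all VolumeSnapshotContents. The sketch below assumes the input is a tab-separated "name, handle" listing, for example produced with the jsonpath expression shown in the comment; the function name list_empty_handles is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>snapshotHandle" lines on stdin and
# print the names whose handle column is empty.
list_empty_handles() {
  awk -F'\t' '$1 != "" && $2 == "" { print $1 }'
}

# On a live cluster the listing could be produced like this (not run here):
#   oc get volumesnapshotcontent \
#     -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.source.snapshotHandle}{"\n"}{end}' \
#     | list_empty_handles

# Self-contained demo. The first row mirrors the object above; the second row
# is an invented example of a populated handle.
printf 'velero-velero-mysql-5q4v6-rkp9h\t\nexample-content\tsnap-0123456789abcdef0\n' \
  | list_empty_handles
```

Any name printed by the helper is a VolumeSnapshotContent that was created without a snapshot handle.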
[mperetz@mperetz oadp-e2e-qe]$ oc get pods -n mysql-persistent
NAME                     READY   STATUS    RESTARTS   AGE
mysql-65988b478c-kvrsn   0/1     Pending   0          31m
[mperetz@mperetz oadp-e2e-qe]$ oc get pvc -n mysql-persistent
NAME    STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
mysql   Pending                                      gp2-csi        18m
[mperetz@mperetz oadp-e2e-qe]$ velero restore get -n openshift-adp -o yaml
apiVersion: velero.io/v1
kind: Restore
metadata:
  creationTimestamp: "2022-07-29T15:13:43Z"
  generation: 9
  managedFields:
  - apiVersion: velero.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        .: {}
        f:backupName: {}
        f:hooks: {}
        f:includedNamespaces: {}
        f:restorePVs: {}
      f:status: {}
    manager: velero
    operation: Update
    time: "2022-07-29T15:13:43Z"
  - apiVersion: velero.io/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:spec:
        f:excludedResources: {}
      f:status:
        f:completionTimestamp: {}
        f:errors: {}
        f:phase: {}
        f:progress:
          .: {}
          f:itemsRestored: {}
          f:totalItems: {}
        f:startTimestamp: {}
        f:warnings: {}
    manager: velero-server
    operation: Update
    time: "2022-07-29T15:24:40Z"
  name: backup11-20220729151343
  namespace: openshift-adp
  resourceVersion: "208936"
  uid: 85163de2-e9a8-450c-b9da-432d2593d21a
spec:
  backupName: backup11
  excludedResources:
  - nodes
  - events
  - events.events.k8s.io
  - backups.velero.io
  - restores.velero.io
  - resticrepositories.velero.io
  hooks: {}
  includedNamespaces:
  - '*'
  restorePVs: true
status:
  completionTimestamp: "2022-07-29T15:24:40Z"
  errors: 2
  phase: PartiallyFailed
  progress:
    itemsRestored: 36
    totalItems: 36
  startTimestamp: "2022-07-29T15:14:03Z"
  warnings: 8
VSR:
[mperetz@mperetz oadp-e2e-qe]$ oc get vsr -A -o yaml
apiVersion: v1
items:
- apiVersion: datamover.oadp.openshift.io/v1alpha1
  kind: VolumeSnapshotRestore
  metadata:
    creationTimestamp: "2022-07-29T15:14:30Z"
    generation: 1
    labels:
      velero.io/restore-name: backup11-20220729151343
    name: vsr-mysql
    namespace: mysql-persistent
    resourceVersion: "186253"
    uid: d78c1cc9-2fb2-46aa-b21b-049d236eca07
  spec:
    protectedNamespace: openshift-adp
    resticSecretRef:
      name: ts-dpa-1-volsync-restic
    volumeSnapshotMoverBackupRef:
      resticrepository: s3:s3.amazonaws.com/oadpbucket125675/openshift-adp/snapcontent-a82c0c9c-cefe-4dbe-b148-df9aa9d3fb3b-pvc
      sourcePVCData:
        name: mysql
        size: 2Gi
        storageClassName: gp2-csi
      volumeSnapshotClassName: example-snapclass
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
ReplicationDestination:
[mperetz@mperetz oadp-e2e-qe]$ oc get replicationdestination -A -o yaml
apiVersion: v1
items:
- apiVersion: volsync.backube/v1alpha1
  kind: ReplicationDestination
  metadata:
    creationTimestamp: "2022-07-29T15:13:18Z"
    generation: 1
    labels:
      datamover.oadp.openshift.io/vsr: vsr-mysql
    name: vsr-mysql-rep-dest
    namespace: openshift-adp
    resourceVersion: "183468"
    uid: 67861e07-8e8f-47d3-8dfa-442c3a05a5d9
  spec:
    restic:
      accessModes:
      - ReadWriteOnce
      capacity: 2Gi
      copyMethod: Snapshot
      repository: vsr-mysql-secret
      storageClassName: gp2-csi
      volumeSnapshotClassName: gp2-csi
    trigger:
      manual: vsr-mysql-trigger
  status:
    conditions:
    - lastTransitionTime: "2022-07-29T15:13:18Z"
      message: Synchronization in-progress
      reason: SyncInProgress
      status: "True"
      type: Synchronizing
    - lastTransitionTime: "2022-07-29T15:13:19Z"
      message: Reconcile complete
      reason: ReconcileComplete
      status: "True"
      type: Reconciled
    lastSyncStartTime: "2022-07-29T15:13:18Z"
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
volumesnapshotcontents created on restore:
[mperetz@mperetz oadp-e2e-qe]$ oc get volumesnapshotcontent velero-velero-mysql-5q4v6-gsf58 -o yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotContent
metadata:
  annotations:
    snapshot.storage.kubernetes.io/volumesnapshot-being-deleted: "yes"
  creationTimestamp: "2022-07-29T15:46:14Z"
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshotcontent-bound-protection
  generateName: velero-velero-mysql-5q4v6-
  generation: 2
  labels:
    velero.io/restore-name: backup11-20220729154552
  name: velero-velero-mysql-5q4v6-gsf58
  resourceVersion: "288694"
  uid: fc3cb8db-702c-4ea0-82f3-b74f736f980a
spec:
  deletionPolicy: Retain
  driver: ebs.csi.aws.com
  source:
    snapshotHandle: ""
  volumeSnapshotClassName: example-snapclass
  volumeSnapshotRef:
    kind: VolumeSnapshot
    name: velero-mysql-5q4v6
    namespace: mysql-persistent
    uid: 5f49eec8-d82e-4dc3-9655-a55121385df7
status:
  error:
    message: 'Failed to check and update snapshot content: failed to list snapshot for content velero-velero-mysql-5q4v6-gsf58: "rpc error: code = Internal desc = Could not list snapshots: InvalidParameterValue: Value ( 0 ) for parameter maxResults is invalid. Expecting a value greater than 5.\n\tstatus code: 400, request id: aabce59e-2547-47a1-b9fb-89b1a8293dfc"'
    time: "2022-07-29T15:55:31Z"
  readyToUse: false
[mperetz@mperetz oadp-e2e-qe]$ oc get volumesnapshotcontent -o yaml
apiVersion: v1
items:
- apiVersion: snapshot.storage.k8s.io/v1
  kind: VolumeSnapshotContent
  metadata:
    annotations:
      snapshot.storage.kubernetes.io/volumesnapshot-being-deleted: "yes"
    creationTimestamp: "2022-07-29T15:46:14Z"
    finalizers:
    - snapshot.storage.kubernetes.io/volumesnapshotcontent-bound-protection
    generateName: velero-velero-mysql-5q4v6-
    generation: 2
    labels:
      velero.io/restore-name: backup11-20220729154552
    name: velero-velero-mysql-5q4v6-gsf58
    resourceVersion: "289042"
    uid: fc3cb8db-702c-4ea0-82f3-b74f736f980a
  spec:
    deletionPolicy: Retain
    driver: ebs.csi.aws.com
    source:
      snapshotHandle: ""
    volumeSnapshotClassName: example-snapclass
    volumeSnapshotRef:
      kind: VolumeSnapshot
      name: velero-mysql-5q4v6
      namespace: mysql-persistent
      uid: 5f49eec8-d82e-4dc3-9655-a55121385df7
  status:
    error:
      message: 'Failed to check and update snapshot content: failed to list snapshot for content velero-velero-mysql-5q4v6-gsf58: "rpc error: code = Internal desc = Could not list snapshots: InvalidParameterValue: Value ( 0 ) for parameter maxResults is invalid. Expecting a value greater than 5.\n\tstatus code: 400, request id: 9d7f2eec-e99e-46c6-a40e-5b51481aaa52"'
      time: "2022-07-29T15:55:39Z"
    readyToUse: false
- apiVersion: snapshot.storage.k8s.io/v1
  kind: VolumeSnapshotContent
  metadata:
    annotations:
      snapshot.storage.kubernetes.io/volumesnapshot-being-deleted: "yes"
    creationTimestamp: "2022-07-29T15:46:16Z"
    finalizers:
    - snapshot.storage.kubernetes.io/volumesnapshotcontent-bound-protection
    generateName: velero-velero-mysql-tmlk6-
    generation: 2
    labels:
      velero.io/restore-name: backup11-20220729154552
    name: velero-velero-mysql-tmlk6-tl6fd
    resourceVersion: "289030"
    uid: 289a9a25-baf5-4db4-9afe-60ca2bd32487
  spec:
    deletionPolicy: Retain
    driver: ebs.csi.aws.com
    source:
      snapshotHandle: ""
    volumeSnapshotClassName: example-snapclass
    volumeSnapshotRef:
      kind: VolumeSnapshot
      name: velero-mysql-tmlk6
      namespace: mysql-persistent
      uid: fd7e7d52-0237-4a25-8be9-917b10e7d6af
  status:
    error:
      message: 'Failed to check and update snapshot content: failed to list snapshot for content velero-velero-mysql-tmlk6-tl6fd: "rpc error: code = Internal desc = Could not list snapshots: InvalidParameterValue: Value ( 0 ) for parameter maxResults is invalid. Expecting a value greater than 5.\n\tstatus code: 400, request id: ac2e86f0-8f56-4572-b81d-acb4d3db7df2"'
      time: "2022-07-29T15:55:39Z"
    readyToUse: false
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
[mperetz@mperetz oadp-e2e-qe]$ oc get volumesnapshot -A -o yaml
apiVersion: v1
items:
- apiVersion: snapshot.storage.k8s.io/v1
  kind: VolumeSnapshot
  metadata:
    annotations:
      velero.io/csi-driver-name: ebs.csi.aws.com
      velero.io/csi-volumesnapshot-handle: snap-03f2d51f00ed1cac4
      velero.io/csi-vsc-deletion-policy: Retain
      velero.io/vsi-volumesnapshot-restore-size: 2Gi
    creationTimestamp: "2022-07-29T15:57:12Z"
    finalizers:
    - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
    generation: 1
    labels:
      velero.io/backup-name: backup11
      velero.io/restore-name: backup11-20220729154647
    name: velero-mysql-5q4v6
    namespace: mysql-persistent
    resourceVersion: "293317"
    uid: 70122683-124b-4c1a-ac65-096b0796f41f
  spec:
    source:
      volumeSnapshotContentName: velero-velero-mysql-5q4v6-qmbp8
    volumeSnapshotClassName: example-snapclass
  status:
    boundVolumeSnapshotContentName: velero-velero-mysql-5q4v6-qmbp8
    error:
      message: 'Failed to check and update snapshot content: failed to list snapshot for content velero-velero-mysql-5q4v6-qmbp8: "rpc error: code = Internal desc = Could not list snapshots: InvalidParameterValue: Value ( 0 ) for parameter maxResults is invalid. Expecting a value greater than 5.\n\tstatus code: 400, request id: e4639c97-5557-448e-952f-c69931a9b31e"'
      time: "2022-07-29T15:57:17Z"
    readyToUse: false
- apiVersion: snapshot.storage.k8s.io/v1
  kind: VolumeSnapshot
  metadata:
    annotations:
      velero.io/csi-driver-name: ebs.csi.aws.com
      velero.io/csi-volumesnapshot-handle: snap-00f64b33c28e4c73a
      velero.io/csi-vsc-deletion-policy: Retain
      velero.io/vsi-volumesnapshot-restore-size: 2Gi
    creationTimestamp: "2022-07-29T15:57:15Z"
    finalizers:
    - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
    generation: 1
    labels:
      velero.io/backup-name: backup11
      velero.io/restore-name: backup11-20220729154647
    name: velero-mysql-tmlk6
    namespace: mysql-persistent
    resourceVersion: "293288"
    uid: 57a52438-916b-44de-9d28-8617a9cede13
  spec:
    source:
      volumeSnapshotContentName: velero-velero-mysql-tmlk6-rvbx2
    volumeSnapshotClassName: example-snapclass
  status:
    boundVolumeSnapshotContentName: velero-velero-mysql-tmlk6-rvbx2
    error:
      message: 'Failed to check and update snapshot content: failed to list snapshot for content velero-velero-mysql-tmlk6-rvbx2: "rpc error: code = Internal desc = Could not list snapshots: InvalidParameterValue: Value ( 0 ) for parameter maxResults is invalid. Expecting a value greater than 5.\n\tstatus code: 400, request id: c72f7514-3cc5-4c96-9b0a-e2e55d407893"'
      time: "2022-07-29T15:57:16Z"
    readyToUse: false
- apiVersion: snapshot.storage.k8s.io/v1
  kind: VolumeSnapshot
  metadata:
    creationTimestamp: "2022-07-29T15:41:23Z"
    generation: 1
    name: volsync-vsr-mysql-rep-dest-dest-20220729151344
    namespace: openshift-adp
    ownerReferences:
    - apiVersion: volsync.backube/v1alpha1
      blockOwnerDeletion: true
      controller: true
      kind: ReplicationDestination
      name: vsr-mysql-rep-dest
      uid: 67861e07-8e8f-47d3-8dfa-442c3a05a5d9
    resourceVersion: "254288"
    uid: ebecf384-c659-482c-9b0a-1ebcf0e967f9
  spec:
    source:
      persistentVolumeClaimName: volsync-vsr-mysql-rep-dest-dest
    volumeSnapshotClassName: gp2-csi
  status:
    error:
      message: Failed to get snapshot class with error volumesnapshotclass.snapshot.storage.k8s.io "gp2-csi" not found
      time: "2022-07-29T15:41:23Z"
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
[mperetz@mperetz oadp-e2e-qe]$ oc get storageclass
NAME                PROVISIONER             RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
gp2                 kubernetes.io/aws-ebs   Delete          WaitForFirstConsumer   true                   153m
gp2-csi (default)   ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   153m
gp3-csi             ebs.csi.aws.com         Delete          WaitForFirstConsumer   true                   153m
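The last VolumeSnapshot above fails because it asks for a VolumeSnapshotClass named gp2-csi. In this report gp2-csi appears only as a StorageClass, while the VolumeSnapshotClass created in the reproduction steps is example-snapclass. A quick way to confirm which snapshot classes exist is sketched below; the function name has_snapshot_class is hypothetical, and the input is assumed to be the output of `oc get volumesnapshotclass -o name`.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the VolumeSnapshotClass named $1 appears in
# an `oc get volumesnapshotclass -o name` listing read from stdin.
has_snapshot_class() {
  grep -qxF "volumesnapshotclass.snapshot.storage.k8s.io/$1"
}

# On a live cluster (not run here):
#   oc get volumesnapshotclass -o name | has_snapshot_class gp2-csi

# Self-contained demo using the single class created in the steps above.
listing='volumesnapshotclass.snapshot.storage.k8s.io/example-snapclass'
for class in example-snapclass gp2-csi; do
  if printf '%s\n' "$listing" | has_snapshot_class "$class"; then
    echo "$class: found"
  else
    echo "$class: not found"
  fi
done
```

If the only class on the cluster is example-snapclass, the gp2-csi value in the ReplicationDestination's volumeSnapshotClassName field would explain the "not found" error on that VolumeSnapshot.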
Expected results:
Additional info: