Bug
Resolution: Duplicate
Normal
OADP 1.3.1
ToDo
Very Likely
Description of problem:
Triggered a CSI backup with the csiSnapshotTimeout field set to 2 minutes. The backup remained in the WaitingForPluginOperationsPartiallyFailed phase for 10+ minutes even though the specified CSI snapshot timeout was 2 minutes. The issue occurs only when the VolumeSnapshotContent reports the following error.
error:
message: 'Failed to check and update snapshot content: failed to remove VolumeSnapshotBeingCreated annotation on the content snapcontent-b87c0704-ca99-4b89-a249-320b6f70c54f: "snapshot controller failed to update snapcontent-b87c0704-ca99-4b89-a249-320b6f70c54f on API server: Operation cannot be fulfilled on volumesnapshotcontents.snapshot.storage.k8s.io \"snapcontent-b87c0704-ca99-4b89-a249-320b6f70c54f\": the object has been modified; please apply your changes to the latest version and try again"'
Start and completion timestamps of the backup:
startTimestamp: "2024-01-30T06:01:44Z"
completionTimestamp: "2024-01-30T06:16:43Z"
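To make the gap concrete, the following minimal sketch computes the elapsed time between the two timestamps above and compares it with the 2 minute csiSnapshotTimeout. The helper name `parse_k8s_timestamp` is hypothetical and exists only for this illustration. Note that the elapsed time covers the whole backup, not only the time spent in WaitingForPluginOperationsPartiallyFailed, so it is an upper bound on the wait.

```python
from datetime import datetime, timedelta


def parse_k8s_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp in the format used in the Backup status.

    Hypothetical helper for this illustration only.
    """
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


# Values copied from the Backup status in this report.
start = parse_k8s_timestamp("2024-01-30T06:01:44Z")
completion = parse_k8s_timestamp("2024-01-30T06:16:43Z")

# spec.csiSnapshotTimeout was set to 2m.
csi_snapshot_timeout = timedelta(minutes=2)

# Total backup duration and how far it exceeds the configured timeout.
elapsed = completion - start
overrun = elapsed - csi_snapshot_timeout

print(f"elapsed: {elapsed}")
print(f"time beyond csiSnapshotTimeout: {overrun}")
```

The backup ran for just under 15 minutes in total, which is consistent with the 10+ minutes observed in the waiting phase.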
Version-Release number of selected component (if applicable):
OADP 1.3.1
How reproducible:
Intermittent
Steps to Reproduce:
1. Deploy a stateful application which has at least 1 PVC.
2. Trigger a CSI backup with csiSnapshotTimeout set to 2m.
Actual results:
The backup took 10+ minutes to move from WaitingForPluginOperationsPartiallyFailed to PartiallyFailed status.
$ oc get backup test-backup1 -o yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/resource-timeout: 10m0s
    velero.io/source-cluster-k8s-gitversion: v1.26.13+77e61a2
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "26"
  creationTimestamp: "2024-01-30T06:01:44Z"
  generation: 8
  labels:
    velero.io/storage-location: ts-dpa-1
  name: test-backup1
  namespace: openshift-adp
  resourceVersion: "56731"
  uid: a7e3b2c5-e954-47e0-96e5-a006a7251d4b
spec:
  csiSnapshotTimeout: 2m
  defaultVolumesToFsBackup: false
  includedNamespaces:
  - ocp-mysql
  itemOperationTimeout: 4h0m0s
  snapshotMoveData: false
  storageLocation: ts-dpa-1
  ttl: 720h0m0s
status:
  backupItemOperationsAttempted: 4
  backupItemOperationsCompleted: 3
  backupItemOperationsFailed: 1
  completionTimestamp: "2024-01-30T06:16:43Z"
  csiVolumeSnapshotsAttempted: 2
  csiVolumeSnapshotsCompleted: 2
  errors: 1
  expiration: "2024-02-29T06:01:44Z"
  formatVersion: 1.1.0
  phase: PartiallyFailed
  progress:
    itemsBackedUp: 66
    totalItems: 66
  startTimestamp: "2024-01-30T06:01:44Z"
  version: 1
Expected results:
The backup should wait no longer than the specified csiSnapshotTimeout (2 minutes).
Additional info:
Velero logs are attached:
velero-logs