Bug
Resolution: Unresolved
Quality / Stability / Reliability
Description of problem:
I am testing whether DataUploads resume after the node-agent pods restart, but the Backup/Restore gets stuck. Sometimes it is neither canceled nor resumed after the pods restart. The Backup case is shown here; the same issue occurs with Restore.
Version-Release number of selected component (if applicable):
oadp-dev branch, 1.5.3
How reproducible:
Always
Steps to Reproduce:
1. Deploy any application with multiple PVCs, for example mysql.
2. Perform a Data Mover backup.
3. As soon as the backup is triggered, delete all node-agent pods so that they are restarted.
Actual results:
DataUploads get stuck in the Accepted state, never move to In Progress, and fail after a long wait.
Expected results:
DataUploads should be resumed.
Additional info:
oc get dataupload -w
NAME          STATUS      STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE     NODE
test1-r27p7   Completed   2m30s     104857640    104857640     ts-dpa-1           3m40s   oadp-138671-7b84w-worker-c-l9t9v
test1-trhwc   Accepted                                         ts-dpa-1           3m46s
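To spot this state without reading the table by eye, the `oc get dataupload` output above can be parsed and filtered for rows still in Accepted. This is only a minimal sketch: the helper names `parse_table` and `stuck_in_accepted` are hypothetical, and it assumes the usual column-aligned layout where every value starts at the same offset as its header.

```python
import re

# Output copied from the report; the second row's empty columns are
# written with explicit padding so the alignment is unambiguous.
SAMPLE = "\n".join([
    "NAME          STATUS      STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE     NODE",
    "test1-r27p7   Completed   2m30s     104857640    104857640     ts-dpa-1           3m40s   oadp-138671-7b84w-worker-c-l9t9v",
    "test1-trhwc   Accepted" + " " * 41 + "ts-dpa-1" + " " * 11 + "3m46s",
])


def parse_table(text):
    """Parse column-aligned CLI output into a list of dicts (hypothetical helper)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0]
    # Header names may contain single spaces ("BYTES DONE"), so a column
    # name is a run of words separated by exactly one space.
    cols = [(m.start(), m.group()) for m in re.finditer(r"\S+(?: \S+)*", header)]
    ends = [start for start, _ in cols[1:]] + [None]
    rows = []
    for ln in lines[1:]:
        # Slice each row at the header offsets; empty cells become "".
        rows.append({name: ln[start:end].strip() for (start, name), end in zip(cols, ends)})
    return rows


def stuck_in_accepted(rows):
    """Return the names of DataUploads whose STATUS is still Accepted."""
    return [r["NAME"] for r in rows if r["STATUS"] == "Accepted"]


if __name__ == "__main__":
    print(stuck_in_accepted(parse_table(SAMPLE)))
```

Slicing by header offset rather than splitting on whitespace keeps the empty STARTED, BYTES DONE, TOTAL BYTES, and NODE cells of the stuck row from shifting the later columns.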
oc get backup test1 -o yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/resource-timeout: 10m0s
    velero.io/source-cluster-k8s-gitversion: v1.32.10
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "32"
  creationTimestamp: "2025-12-09T05:42:01Z"
  generation: 6
  labels:
    velero.io/storage-location: ts-dpa-1
  name: test1
  namespace: openshift-adp
  resourceVersion: "44451"
  uid: daf08633-69ab-4122-9e9d-13fc87e7419a
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false
  excludedClusterScopedResources:
  - volumesnapshotcontents.snapshot.storage.k8s.io
  excludedNamespaceScopedResources:
  - volumesnapshots.snapshot.storage.k8s.io
  includedNamespaces:
  - mysql
  itemOperationTimeout: 1h0m0s
  snapshotMoveData: true
  storageLocation: ts-dpa-1
  ttl: 720h0m0s
  volumeGroupSnapshotLabelKey: velero.io/volume-group
status:
  backupItemOperationsAttempted: 2
  backupItemOperationsCompleted: 1
  expiration: "2026-01-08T05:42:01Z"
  formatVersion: 1.1.0
  hookStatus: {}
  phase: WaitingForPluginOperations
  progress:
    itemsBackedUp: 46
    totalItems: 46
  startTimestamp: "2025-12-09T05:42:01Z"
  version: 1
oc get dataupload
NAME          STATUS      STARTED   BYTES DONE   TOTAL BYTES   STORAGE LOCATION   AGE   NODE
test1-r27p7   Completed   35m       104857640    104857640     ts-dpa-1           36m   oadp-138671-7b84w-worker-c-l9t9v
test1-trhwc   Failed                                           ts-dpa-1           36m
The second DataUpload (test1-trhwc) fails after 36m.
oc get backup test1 -o yaml
apiVersion: velero.io/v1
kind: Backup
metadata:
  annotations:
    velero.io/resource-timeout: 10m0s
    velero.io/source-cluster-k8s-gitversion: v1.32.10
    velero.io/source-cluster-k8s-major-version: "1"
    velero.io/source-cluster-k8s-minor-version: "32"
  creationTimestamp: "2025-12-09T05:42:01Z"
  generation: 8
  labels:
    velero.io/storage-location: ts-dpa-1
  name: test1
  namespace: openshift-adp
  resourceVersion: "51954"
  uid: daf08633-69ab-4122-9e9d-13fc87e7419a
spec:
  csiSnapshotTimeout: 10m0s
  defaultVolumesToFsBackup: false
  excludedClusterScopedResources:
  - volumesnapshotcontents.snapshot.storage.k8s.io
  excludedNamespaceScopedResources:
  - volumesnapshots.snapshot.storage.k8s.io
  includedNamespaces:
  - mysql
  itemOperationTimeout: 1h0m0s
  snapshotMoveData: true
  storageLocation: ts-dpa-1
  ttl: 720h0m0s
  volumeGroupSnapshotLabelKey: velero.io/volume-group
status:
  backupItemOperationsAttempted: 2
  backupItemOperationsCompleted: 1
  backupItemOperationsFailed: 1
  completionTimestamp: "2025-12-09T06:12:25Z"
  errors: 1
  expiration: "2026-01-08T05:42:01Z"
  formatVersion: 1.1.0
  hookStatus: {}
  phase: PartiallyFailed
  progress:
    itemsBackedUp: 46
    totalItems: 46
  startTimestamp: "2025-12-09T05:42:01Z"
  version: 1
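For reference, the time the Backup spent before being marked PartiallyFailed can be read off the `startTimestamp` and `completionTimestamp` fields in the status above. A small sketch of that arithmetic follows; the function name `elapsed` is hypothetical, and the timestamps are copied from the YAML.

```python
from datetime import datetime, timedelta

# Timestamp layout used in the Backup status fields (UTC, "Z" suffix).
FMT = "%Y-%m-%dT%H:%M:%SZ"


def elapsed(start: str, end: str) -> timedelta:
    """Return the duration between two status timestamps (hypothetical helper)."""
    return datetime.strptime(end, FMT) - datetime.strptime(start, FMT)


if __name__ == "__main__":
    # startTimestamp and completionTimestamp from the failed Backup above.
    print(elapsed("2025-12-09T05:42:01Z", "2025-12-09T06:12:25Z"))  # 0:30:24
```

By these fields the Backup completed as PartiallyFailed roughly 30 minutes after it started, while the DataUpload listing was captured when the objects were 36m old.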
DPA:
oc get dpa -o yaml
apiVersion: v1
items:
- apiVersion: oadp.openshift.io/v1alpha1
  kind: DataProtectionApplication
  metadata:
    creationTimestamp: "2025-12-09T05:34:30Z"
    generation: 3
    name: ts-dpa
    namespace: openshift-adp
    resourceVersion: "176925"
    uid: 7830597f-20a5-4394-8911-7dcd772426f0
  spec:
    backupLocations:
    - velero:
        credential:
          key: cloud
          name: cloud-credentials
        default: true
        objectStorage:
          bucket: oadp1386717b84w
          prefix: velero
        provider: gcp
    configuration:
      nodeAgent:
        enable: true
        uploaderType: kopia
      velero:
        defaultPlugins:
        - csi
        - gcp
        - openshift
        disableFsBackup: false
    logFormat: text
    nonAdmin:
      enable: true
  status:
    conditions:
    - lastTransitionTime: "2025-12-09T14:08:03Z"
      message: Reconcile complete
      reason: Complete
      status: "True"
      type: Reconciled
kind: List
metadata:
  resourceVersion: ""