Sub-task
Resolution: Done
Description of problem:
Data Mover backups and restores are partially failing because the DataUpload/DataDownload is cancelled when a Ceph storage class is used.
We observed that backups fail more frequently with the ceph-RBD storage class, while restores fail more frequently with ceph-fs.
Error message:
message: 'found a dataupload openshift-adp/backup20-llp79 with expose error: Pod
is unschedulable: 0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims.
preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling...
mark it as cancel'
Version-Release number of selected component (if applicable):
OADP 1.4.0 -6
OCP 4.14 & OCP 4.15
How reproducible:
Always (100% on the IBM Z platform)
Steps to Reproduce:
1. Create a DPA with CSI and nodeAgent enabled.
2. Deploy a stateful application.
3. Trigger a Data Mover backup.
4. If the backup did not fail, delete the application namespace and trigger a restore.
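The steps above can be sketched as manifests. The following is a minimal, hedged example in Python that only builds the objects as dictionaries and prints them as JSON (which `oc apply -f` accepts). The object names (`dpa-sample`, `backup20`, `restore20`, `my-app`) are hypothetical, the backup storage location is intentionally omitted because the report does not describe it, and the field names reflect my understanding of the OADP and Velero CRDs, so they should be checked against the installed versions.

```python
import json


def dpa_manifest(name="dpa-sample", namespace="openshift-adp"):
    """Step 1: DataProtectionApplication with the CSI plugin and node agent enabled."""
    return {
        "apiVersion": "oadp.openshift.io/v1alpha1",
        "kind": "DataProtectionApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "configuration": {
                "velero": {"defaultPlugins": ["openshift", "csi"]},
                "nodeAgent": {"enable": True, "uploaderType": "kopia"},
            },
            # backupLocations is left out on purpose: it depends on the
            # object store in use, which the report does not specify.
        },
    }


def backup_manifest(name, app_namespace, namespace="openshift-adp"):
    """Step 3: Backup that moves CSI snapshot data through the Data Mover."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Backup",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "includedNamespaces": [app_namespace],
            "snapshotMoveData": True,
        },
    }


def restore_manifest(name, backup_name, namespace="openshift-adp"):
    """Step 4: Restore from the backup after the app namespace is deleted."""
    return {
        "apiVersion": "velero.io/v1",
        "kind": "Restore",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"backupName": backup_name},
    }


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    for manifest in (
        dpa_manifest(),
        backup_manifest("backup20", "my-app"),
        restore_manifest("restore20", "backup20"),
    ):
        print(json.dumps(manifest, indent=2))
```

Each printed document can be saved to a file and applied separately; step 2 (deploying the stateful application) is not covered here because the report does not name the application.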
Actual results:
The DataUpload/DataDownload is cancelled with the error below, causing the backup/restore to partially fail.
message: 'found a dataupload openshift-adp/backup20-llp79 with expose error: Pod
is unschedulable: 0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims.
preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling...
mark it as cancel'
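For triage across many backups, the message above can be parsed to pull out the affected object and check whether the cause is the unbound PVC. This is a rough sketch, not part of OADP or Velero: the function name `parse_expose_error` is hypothetical, and it assumes that DataDownload failures use the same wording with `datadownload` in place of `dataupload`.

```python
import re

# Message copied from this report; the line wrapping comes from the status output.
SAMPLE_MESSAGE = """found a dataupload openshift-adp/backup20-llp79 with expose error: Pod
is unschedulable: 0/6 nodes are available: pod has unbound immediate PersistentVolumeClaims.
preemption: 0/6 nodes are available: 6 Preemption is not helpful for scheduling...
mark it as cancel"""

# Assumption: datadownload messages follow the same pattern as dataupload ones.
_PATTERN = re.compile(
    r"found a (?P<kind>dataupload|datadownload) "
    r"(?P<namespace>[^/\s]+)/(?P<name>\S+) "
    r"with expose error: (?P<reason>.*)"
)


def parse_expose_error(message):
    """Return kind, namespace, name and coarse cause flags, or None if no match."""
    # Collapse wrapped lines into single spaces so the regex sees one line.
    flat = " ".join(message.split())
    match = _PATTERN.search(flat)
    if match is None:
        return None
    reason = match.group("reason")
    return {
        "kind": match.group("kind"),
        "namespace": match.group("namespace"),
        "name": match.group("name"),
        "unschedulable": "Pod is unschedulable" in reason,
        "unbound_pvc": "unbound immediate PersistentVolumeClaims" in reason,
    }


if __name__ == "__main__":
    print(parse_expose_error(SAMPLE_MESSAGE))
```

A result with `unbound_pvc` set to true points at the PVC created for the exposed snapshot never binding, which matches the failure described in this report.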
Expected results:
The DataUpload/DataDownload should complete successfully.
Additional info: