Sub-task
Resolution: Done
Description of problem:
A CSI backup of a namespace with 1000 pods ends with the status "PartiallyFailed".
Error Message:
main-backup-scheduler-1000pods-every-2hrs-20220928-082124/backup-scheduler-1000pods-every-2hrs-20220928083022/backup-scheduler-1000pods-every-2hrs-20220928083022.log:time="2022-09-28T10:10:02Z" level=error msg="fail to recreate VolumeSnapshotContent snapcontent-7a455d87-5e00-42ba-b54c-3b16ba91df71: fail to retrieve VolumeSnapshotContent snapcontent-7a455d87-5e00-42ba-b54c-3b16ba91df71 info: timed out waiting for the condition" backup=openshift-adp/backup-scheduler-1000pods-every-2hrs-20220928083022 logSource="pkg/controller/backup_controller.go:985".
By comparison, CSI backups of namespaces with 80, 90, and 100 pods all completed.
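To see how many snapshots hit this error in a large backup log, the error lines can be extracted with a short script. The following is a minimal sketch in Python, assuming the log lines follow the same layout as the message quoted above; the helper name `failed_snapcontents` and the regular expression are illustrative and not part of Velero or OADP.

```python
import re

# Matches error lines shaped like the one quoted in this report.
# Assumption: the message always starts with
# "fail to recreate VolumeSnapshotContent <name>: <reason>".
SNAPCONTENT_ERROR = re.compile(
    r'level=error msg="fail to recreate VolumeSnapshotContent '
    r'(?P<name>snapcontent-[0-9a-f-]+): (?P<reason>[^"]*)"'
)


def failed_snapcontents(log_text):
    """Return (name, reason) pairs for every matching error line in log_text."""
    return [
        (match.group("name"), match.group("reason"))
        for match in SNAPCONTENT_ERROR.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        'time="2022-09-28T10:10:02Z" level=error msg="fail to recreate '
        "VolumeSnapshotContent snapcontent-7a455d87-5e00-42ba-b54c-3b16ba91df71: "
        "fail to retrieve VolumeSnapshotContent "
        "snapcontent-7a455d87-5e00-42ba-b54c-3b16ba91df71 info: "
        'timed out waiting for the condition" '
        "backup=openshift-adp/backup-scheduler-1000pods-every-2hrs-20220928083022"
    )
    for name, reason in failed_snapcontents(sample):
        print(name, "->", reason)
```

Counting the returned pairs per backup log gives a rough measure of how many of the 1000 volumes timed out in a given run.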
Version-Release number of selected component (if applicable):
OCP 4.10.26
OADP 1.1.0-74
How reproducible:
Steps to Reproduce:
1. Create a namespace with 1000 pods
2. Run a CSI backup
3. Check the backup status
Actual results:
The backup ends with "PartiallyFailed" status
Expected results:
The backup finishes with "Completed" status
Additional info:
logs:
https://drive.google.com/drive/folders/1VxFgiILR_IlYfHhbZJEhlGuC8mid_-Iw?usp=sharing
Ran a few iterations with a 10-minute timeout (using a private Velero build) and the backup completed
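The "timed out waiting for the condition" error and the result with the longer timeout are consistent with a poll-until-deadline pattern, where a slow condition fails under a short deadline and passes under a longer one. The following is a generic Python sketch of that pattern only; it is not Velero's implementation (Velero is written in Go), and `poll_until` and its parameters are hypothetical names chosen for illustration.

```python
import time


def poll_until(condition, timeout_s, interval_s=1.0,
               clock=time.monotonic, sleep=time.sleep):
    """Call condition() every interval_s seconds until it returns True
    or timeout_s seconds have elapsed.

    Returns True if the condition was met, False on timeout.
    clock and sleep are injectable so the loop can be exercised
    without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if condition():
            return True
        if clock() >= deadline:
            return False  # the "timed out waiting for the condition" case
        sleep(interval_s)
```

If the condition (here, a VolumeSnapshotContent becoming retrievable) takes longer to become true as the number of volumes grows, a fixed deadline that is ample for 100 pods can be too short for 1000, which matches the behaviour reported above.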
upstream issue: https://github.com/vmware-tanzu/velero/issues/5416