Bug
Resolution: Unresolved
Normal
Quality / Stability / Reliability
ToDo
Very Likely
Upstream Issue: https://github.com/vmware-tanzu/velero/issues/9179
Problem Description:
Two call paths during backup involve volume policy checks:
- backupItemInternal->executeActions->volumeHelperImpl.ShouldPerformSnapshot->podvolumeutil.GetPodsUsingPVC
- backupItemInternal->takePVSnapshot->volumeHelperImpl.ShouldPerformSnapshot->podvolumeutil.GetPodsUsingPVC
podvolumeutil.GetPodsUsingPVC always lists all the pods in the cluster and iterates over every one of them, i.e. O(M) work per call.
With N PVCs and M pods in the cluster, this results in N duplicated pod listings and N*M iterations in total.
When both N and M are large, backup performance degrades badly.
As observed in the logs of issue #9169, a single pod listing and iteration took about 2 seconds (note the timestamps of the two consecutive log entries):
time="2025-08-08T04:03:22Z" level=info msg="Executing takePVSnapshot" backup=velero/velero-w9s-everything-daily-20250808040020 logSource="pkg/backup/item_backupper.go:552" name=pvc-e74b7300-a72b-401d-87ae-f4c416a556c9 namespace= resource=persistentvolumes
time="2025-08-08T04:03:24Z" level=info msg="skipping snapshot action for pv pvc-e74b7300-a72b-401d-87ae-f4c416a556c9 possibly due to no volume policy setting or snapshotVolumes is false" backup=velero/velero-w9s-everything-daily-20250808040020 logSource="internal/volumehelper/volume_policy_helper.go:136"
Required Action:
We need to refactor the code so that pods are not listed and scanned repeatedly for every PVC, improving backup performance on clusters with many PVCs and pods.
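One possible shape for such a refactor, sketched below with a simplified stand-in Pod type (the real code works with corev1.Pod objects and a Kubernetes client; the function and type names here are hypothetical, not Velero's actual API): list the pods once, build a PVC-name-to-pods index in a single O(M) pass, and then answer each per-PVC check with an O(1) map lookup instead of re-listing and re-scanning all pods.

```go
package main

import "fmt"

// Pod is a simplified, hypothetical stand-in for corev1.Pod,
// reduced to the fields relevant to this sketch.
type Pod struct {
	Name string
	PVCs []string // names of the PVCs the pod mounts
}

// buildPVCToPodIndex walks the pod list once and builds a
// PVC-name -> pods index. Total cost is one listing plus one
// O(M) pass, replacing N listings and N*M iterations.
func buildPVCToPodIndex(pods []Pod) map[string][]Pod {
	index := make(map[string][]Pod)
	for _, pod := range pods {
		for _, pvc := range pod.PVCs {
			index[pvc] = append(index[pvc], pod)
		}
	}
	return index
}

func main() {
	// Hypothetical cluster state: two pods, each mounting one PVC.
	pods := []Pod{
		{Name: "app-0", PVCs: []string{"data-0"}},
		{Name: "app-1", PVCs: []string{"data-1"}},
	}
	index := buildPVCToPodIndex(pods)
	// Each per-PVC volume policy check now becomes a map lookup.
	fmt.Println(len(index["data-0"]), index["data-0"][0].Name)
}
```

In practice the index could be built once per backup at the start of item collection, or the pod informer's cache could be reused, so that volumeHelperImpl.ShouldPerformSnapshot never triggers a fresh pod listing.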
Upstream Status:
- State: Open
- Assignee: shubham-pampattiwar
- Labels: Performance, Needs investigation, area/volume-policy, 1.18-candidate, Reviewed Q3 2025