- Bug
- Resolution: Done-Errata
- Major
- OADP 1.4.0
- oadp-operator-bundle-container-1.4.1-20
- Very Likely
We have observed a significant performance degradation in the restore operation using Kopia for large files in our regression tests. Specifically, in Case 2.4.1.9, restoring a single namespace containing 100 files of 10GB each took 0:51:17 with the new version of Kopia, compared to 0:13:53 on OADP 1.3.1-54.
This degradation may be related to the size and count of the files being restored. While the new Kopia version shows performance improvements in other cases with smaller files, it appears to struggle with larger files, resulting in a noticeable regression compared to previous OADP versions (1.3.0-1.3.1).
Steps to Reproduce:
- Use Kopia to back up a single namespace with 100 files, each sized at 10GB.
- Measure the time taken to restore this backup using the new version of Kopia.
- Compare the restore time with the restore time on OADP 1.3.1-54.
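The reproduction steps above can be sketched with the velero CLI. This is a minimal sketch, not the exact harness used in the regression run: the backup/restore names are placeholders, and it assumes the DataProtectionApplication already configures the node agent with the Kopia uploader.

```shell
# Back up the data-generator namespace via file-system backup;
# --default-volumes-to-fs-backup routes pod volumes through the
# node-agent (Kopia) instead of volume snapshots.
velero backup create perf-2-4-1-9 \
  --include-namespaces perf-datagen-case3-cephrbd \
  --default-volumes-to-fs-backup \
  --wait

# Delete the namespace, then time the restore of the same backup.
time velero restore create restore-perf-2-4-1-9 \
  --from-backup perf-2-4-1-9 \
  --wait

# Repeat the restore on OADP 1.3.1-54 and compare wall-clock times.
```

The same timings are also available from `status.startTimestamp` / `status.completionTimestamp` on the Restore CR, which avoids measurement noise from the CLI's polling.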
Expected Result:
The restore time for large files should be comparable to or better than the restore time on OADP 1.3.1-54.
Actual Result:
The restore time for large files on the new Kopia version is significantly longer than on OADP 1.3.1-54, indicating a performance degradation.
Additional Information:
- Case Reference: 2.4.1.9
- Restore Time on New Kopia: 0:51:17
- Restore Time on OADP 1.3.1-54: 0:13:53
- Observed Degradation: Approximately 269.75%
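As a sanity check, the degradation figure can be derived from the two timings. A minimal sketch; the small difference from the reported 269.75% presumably comes from sub-second precision in the raw measurements:

```python
def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS duration string to total seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

new_kopia = to_seconds("0:51:17")  # 3077 s
oadp_131 = to_seconds("0:13:53")   #  833 s

# Relative slowdown of the new Kopia restore vs. OADP 1.3.1-54.
degradation = (new_kopia - oadp_131) / oadp_131 * 100
print(f"{degradation:.1f}%")  # 269.4%
```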
Notes:
- This cycle was executed with the same OADP version on both clouds (33 & 15), reproducing the issue twice.
- I have checked all the relevant logs, CRs, and related objects, and nothing looks suspicious. The only potentially relevant information was found in the node-agent pod logs: warning messages tagged with sublevel=error.
time="2024-07-05T18:16:38Z" level=warning msg="active indexes [xn0_0b51efb539698aecc1c85a27c7ee2f5a-sc3f440f8b765623c12a-c1 xn0_3dc75c368016affc51aedc53f46591ae-s00a02b9b28c1e36912a-c1 xn0_54bbd1c54996c528f3a806fcf18866da-sd37c4c5641f83a6212a-c1 xn0_8f3500db186d5626e13909e844a71cf8-sb2d80086508d9ac212a-c1 xn0_e48aef60a63e17ec66e160852f000312-se910052600473a3e12a-c1] deletion watermark 0001-01-01 00:00:00 +0000 UTC" PodVolumeRestore=openshift-adp/restore-kopia-pvc-util-2-4-1-9-cephrbd-100f-10gb-1001g-vft6s controller=PodVolumeRestore logModule=kopia/kopia/format logSource="/remote-source/velero/app/pkg/kopia/kopia_log.go:101" logger name="[index-blob-manager]" pod=perf-datagen-case3-cephrbd/deploy-perf-datagen-2-4-1-9-1200gi-1-rbd-0-c7bc487ff-9csct restore=openshift-adp/restore-kopia-pvc-util-2-4-1-9-cephrbd-100f-10gb-1001g sublevel=error
(The identical warning repeats at 15-minute intervals, at 18:31:38 and 18:46:38.)
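For anyone sifting the node-agent logs for more of these entries, the logrus-style key=value lines can be parsed mechanically. A minimal sketch; the helper name is ours, not part of Velero:

```python
import shlex

def parse_velero_log_line(line: str) -> dict:
    """Parse a logrus-style key=value log line into a dict.

    shlex handles quoted values (msg="..."); bare tokens without
    '=' (e.g. the stray 'logger' in 'logger name=...') are skipped.
    """
    fields = {}
    for token in shlex.split(line):
        if "=" in token:
            key, _, value = token.partition("=")
            fields[key] = value
    return fields

# One of the repeated node-agent warnings from this report (msg truncated):
line = ('time="2024-07-05T18:16:38Z" level=warning '
        'msg="active indexes [...] deletion watermark 0001-01-01 00:00:00 +0000 UTC" '
        'controller=PodVolumeRestore logModule=kopia/kopia/format '
        'logger name="[index-blob-manager]" sublevel=error')

parsed = parse_velero_log_line(line)
print(parsed["level"], parsed["sublevel"])  # warning error
```

Filtering on `sublevel=error` across all node-agent pods is a quick way to confirm the warning is the only anomaly in these runs.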
- Further investigation is needed to determine the root cause of this performance issue and to identify potential optimizations for handling large file restores in the new Kopia version.
OCP : 4.16.0
OADP : 1.4.0-13
ODF : 4.15.4
Full logs from both clouds can be found here:
https://drive.google.com/drive/folders/1AXKQHLQ_2fYxwU_tR5UJAZeofL5yQIJ8?usp=sharing
- Depends on: OADP-4640 Carry Downstream: Override kopia default hash, encryption, splitter algo via env var (Closed)
- Links to: RHBA-2024:132893 OpenShift API for Data Protection (OADP) 1.4.1 security and bug fix update