Type: Bug
Resolution: Unresolved
Team: Phoenix - Content
Sprint: Sprint 142
Severity: Moderate
Description of problem:
Given a product with many repos, where each repo is included in many CV versions, an attempt to delete a bigger CV version (of, say, 100 repos) takes a lot of time and memory.
In the customer story behind this, a puma worker planning a foreman task consumed 11 GB of memory.
On my reproducer, I easily got to 6-8 GB, and the planning took 15 minutes.
With a simple change, the memory consumption can be reduced to roughly 200 MB of RAM and the planning time to roughly 20 seconds.
The key "not scaling well" factor is https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/destroy.rb#L72-L77 during deletion of a repository.
Replacing that loop with a single ActiveRecord query avoids manipulating individual repository objects and saves a lot of time and memory (see the sketch below).
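As an illustration of the technique, here is a minimal sketch; the Widget model, its product_id/archived columns, and the values are hypothetical stand-ins, not the actual Katello code:

  product_id = 42  # illustrative value

  # Slow: instantiates every matching row as a full ActiveRecord object and
  # issues one UPDATE per record, so time and memory grow with the row count.
  Widget.where(product_id: product_id).find_each do |widget|
    widget.update!(archived: true)
  end

  # Fast: a single relation-level call compiles to one
  # "UPDATE widgets SET archived = TRUE WHERE product_id = 42" statement;
  # no per-row Ruby objects are built, so memory stays flat.
  Widget.where(product_id: product_id).update_all(archived: true)

Note that relation-level methods such as update_all/delete_all skip per-object validations and callbacks; that is exactly what makes them cheap, but it also means the replacement has to be verified as safe for the records involved.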
How reproducible:
100%
Is this issue a regression from an earlier version:
yes (a performance regression since the ACS feature was added)
Steps to Reproduce:
1. Create a product with 200 repos:
hammer product create --organization-id 1 --name ZOO_product

for i in $(seq 1 200); do
  echo "repository create --organization-id 1 --product ZOO_product --name ZOO_repo_${i} --content-type yum --download-policy on_demand --url https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/zoo/"
done | hammer shell
2. Create one CV with 100 of the repos:
hammer content-view create --organization-id 1 --name CV_zoo_manyrepos

for i in $(seq 1 100); do
  echo "content-view add-repository --organization-id 1 --name CV_zoo_manyrepos --product ZOO_product --repository ZOO_repo_${i}"
done | hammer shell
3. Create another CV with all 200 repos (or even with a completely disjoint set of repos, like "seq 101 200"):
hammer content-view create --organization-id 1 --name CV_zoo_HUGErepos

for i in $(seq 1 200); do
  echo "content-view add-repository --organization-id 1 --name CV_zoo_HUGErepos --product ZOO_product --repository ZOO_repo_${i}"
done | hammer shell
4. Publish some versions of both CVs - the more the better:
for i in $(seq 1 10); do
  echo "content-view publish --organization-id 1 --name CV_zoo_manyrepos"
  echo "content-view publish --organization-id 1 --name CV_zoo_HUGErepos"
done | hammer shell
5. Restart the `foreman` service to see the memory usage required to process the request below.
6. Delete either CV version, e.g. via the WebUI.
7. Monitor the memory usage of the puma workers, and look for the request in /var/log/foreman/production.log:
2024-11-11T15:21:53 [I|app|05628762] Started PUT "/katello/api/content_views/107/bulk_delete_versions" for ::1 at 2024-11-11 15:21:53 +0100
2024-11-11T15:21:53 [I|app|05628762] Processing by Katello::Api::V2::ContentViewsController#bulk_delete_versions as JSON
2024-11-11T15:21:53 [I|app|05628762]   Parameters: {"bulk_content_view_version_ids"=>{"included"=>{"ids"=>[2492]}, "excluded"=>{}}, "id"=>"107", "content_view"=>{"id"=>"107"}, "api_version"=>"v2"}
..
2024-11-11T15:32:55 [I|app|05628762] Completed 202 Accepted in 662543ms (Views: 98.4ms | ActiveRecord: 20959.9ms | Allocations: 222030027)
Note the PUT request, its duration, and its Allocations count.
Actual behavior:
Planning the task takes many minutes (662 seconds in the example above) and consumes >5 GB of memory (the exact numbers depend on the scale of the setup).
Expected behavior:
Planning takes less than a minute, with low memory usage in the puma process.
Business Impact / Additional info:
I will provide a patch / PR soon.
Is blocked by: SAT-29346 [Review] Deleting a CV version does not scale when a product has too many repos (cloned in CVs)