  1. Satellite
  2. SAT-29345 Deleting a CV version does not scale when a product has too many repos (cloned in CVs)
  3. SAT-29453

[QE] Deleting a CV version does not scale when a product has too many repos (cloned in CVs)


    • Type: Sub-task
    • Resolution: Unresolved
    • Priority: Undefined
    • Sprint 142

      Description of problem:
      When a product has many repos and each repo is included in many CV versions, deleting a larger CV version (say, one with 100 repos) takes a lot of time and memory.

      In the customer case behind this report, the puma worker planning the foreman task consumed 11GB of memory.

      On my reproducer, memory usage easily reached 6-8GB, and planning took 15 minutes.

      With a simple change, memory consumption drops to roughly 200MB of RAM and planning time to roughly 20 seconds.

      The main factor behind the poor scaling is the loop at https://github.com/Katello/katello/blob/master/app/lib/actions/katello/repository/destroy.rb#L72-L77, which runs during deletion of a repository.

      Replacing that loop with a single ActiveRecord query avoids handling individual repository objects and saves a lot of time and memory.

      How reproducible:
      100%
       

      Is this issue a regression from an earlier version:
      Yes (performance regression since the ACS feature was added)
       

      Steps to Reproduce:

      1. Create a product with 200 repos:

      hammer product create --organization-id 1 --name ZOO_product
      for i in $(seq 1 200); do
        echo "repository create --organization-id 1 --product ZOO_product --name ZOO_repo_${i} --content-type yum --download-policy on_demand --url https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/zoo/"
      done | hammer shell
      

      2. Create one CV with 100 of the repos:

      hammer content-view create --organization-id 1 --name CV_zoo_manyrepos
      for i in $(seq 1 100); do
        echo "content-view add-repository --organization-id 1 --name CV_zoo_manyrepos --product ZOO_product --repository ZOO_repo_${i}"
      done | hammer shell
      

      3. Create another CV with all 200 repos (or even with a completely disjoint set of repos, like "seq 101 200"):

      hammer content-view create --organization-id 1 --name CV_zoo_HUGErepos
      for i in $(seq 1 200); do
        echo "content-view add-repository --organization-id 1 --name CV_zoo_HUGErepos --product ZOO_product --repository ZOO_repo_${i}"
      done | hammer shell
      

      4. Publish several versions of both CVs (the more, the better):

      for i in $(seq 1 10); do
        echo "content-view publish --organization-id 1 --name CV_zoo_manyrepos"
        echo "content-view publish --organization-id 1 --name CV_zoo_HUGErepos"
      done | hammer shell
      

      5. Restart the `foreman` service, so that the memory usage seen afterwards can be attributed to processing the request below.

      6. Delete a version of either CV, e.g. via the WebUI.

      7. Monitor the memory usage of the puma workers, and look for the following in /var/log/foreman/production.log:

      2024-11-11T15:21:53 [I|app|05628762] Started PUT "/katello/api/content_views/107/bulk_delete_versions" for ::1 at 2024-11-11 15:21:53 +0100
      2024-11-11T15:21:53 [I|app|05628762] Processing by Katello::Api::V2::ContentViewsController#bulk_delete_versions as JSON
      2024-11-11T15:21:53 [I|app|05628762]   Parameters: {"bulk_content_view_version_ids"=>{"included"=>{"ids"=>[2492]}, "excluded"=>{}}, "id"=>"107", "content_view"=>{"id"=>"107"}, "api_version"=>"v2"}
      ..
      2024-11-11T15:32:55 [I|app|05628762] Completed 202 Accepted in 662543ms (Views: 98.4ms | ActiveRecord: 20959.9ms | Allocations: 222030027)
      

      Note the PUT request, its duration, and its Allocations count.
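
To avoid reading the log by eye, the request can be picked out with a short shell helper. This is only a sketch: the function name `summarize_bulk_delete` is made up for illustration, and it assumes the log lines have exactly the layout shown in the excerpt above (a `[I|app|<request id>]` tag, a `Started PUT ... bulk_delete_versions"` line, and a `Completed ... in <N>ms (... Allocations: <N>)` line).

```shell
# Hypothetical helper: print duration and allocations of every
# bulk_delete_versions request found in a Foreman production log.
# Assumes the line layout shown in the log excerpt of this report.
summarize_bulk_delete() {
  log_file="${1:-/var/log/foreman/production.log}"
  awk '
    # Extract the request id from a "[I|app|<id>]" tag, or "" if absent.
    function request_id(line) {
      if (match(line, /\[I\|app\|[0-9a-f]+\]/))
        return substr(line, RSTART + 7, RLENGTH - 8)
      return ""
    }
    # Remember the id of each bulk_delete_versions PUT request.
    /Started PUT .*bulk_delete_versions"/ {
      id = request_id($0)
      if (id != "") wanted[id] = 1
    }
    # Report the matching "Completed" line for a remembered id.
    / Completed / {
      id = request_id($0)
      if (id != "" && (id in wanted)) {
        ms = ""; alloc = ""
        if (match($0, / in [0-9]+ms/))
          ms = substr($0, RSTART + 4, RLENGTH - 6)
        if (match($0, /Allocations: [0-9]+/))
          alloc = substr($0, RSTART + 13, RLENGTH - 13)
        printf "request=%s duration_ms=%s allocations=%s\n", id, ms, alloc
      }
    }
  ' "$log_file"
}
```

Calling `summarize_bulk_delete` without an argument reads /var/log/foreman/production.log; a different log file can be passed as the first argument.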

      Actual behavior:
      Planning the task takes many minutes (662 seconds, about 11 minutes, in the example above) and consumes more than 5GB of memory (the exact figures depend heavily on scale).
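
To put a number on the memory usage while the task is being planned, the resident memory of the puma workers can be sampled from `ps`. The helper below is a sketch only: the name `puma_rss_mib` is made up, and it assumes the worker processes carry Puma's usual `puma: cluster worker` process title, which may differ on a given installation.

```shell
# Hypothetical helper: sum the resident memory (RSS) of puma workers.
# Reads "pid rss args" rows on stdin, as printed by `ps -eo pid=,rss=,args=`
# (procps reports rss in KiB).
# Assumption: worker command lines contain "puma: cluster worker".
puma_rss_mib() {
  # The [p] bracket keeps this awk process from matching its own
  # command line when it is fed live `ps` output.
  awk '/[p]uma: cluster worker/ { kib += $2; n++ }
       END { printf "workers=%d rss_mib=%d\n", n, kib / 1024 }'
}

# Usage on the Satellite host: sample every 5 seconds during the deletion.
# while sleep 5; do ps -eo pid=,rss=,args= | puma_rss_mib; done
```

Summing over all workers is a deliberate simplification; since a single worker plans the task, watching the per-process rows of the same `ps` output shows which worker grows.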

      Expected behavior:
      Planning takes less than a minute, with low memory usage by the puma process.

      Business Impact / Additional info:
      I will provide a patch / PR soon.

      QE Tracker for https://issues.redhat.com/browse/SAT-29345

              Unassigned Unassigned
              satellite-jira-automation@redhat.com Satellite Jira-Automation