- Bug
- Resolution: Unresolved
- Normal
- 6.17.0
- False
- Moderate
- sat-artemis
- None
- None
- None
- None
Description of problem:
When you sync more Ansible Collection (AC) repos concurrently to a Capsule than there are available pulpcore workers, the sync fails and leaves OOM (out-of-memory) errors in the logs.
How reproducible:
Always for the given setup.
Is this issue a regression from an earlier version:
Hard to say. The previous version suffered from a deadlock, so the task failed for a different reason: https://issues.redhat.com/browse/SAT-27939
Steps to Reproduce:
1. Start with a fresh Satellite and Capsule. I used standard-flavour SatLab VMs: the Capsule has 24 GB RAM, 4 GB swap, and 6 CPUs, which results in 6 pulpcore-workers running.
2. Set the Capsule's download policy to Immediate and disable autosync on CV promote.
3. Create a product with a mid-size Ansible Collection repo and sync it. I used this one from galaxy.ansible.com (470 collections):
- name: prometheus.prometheus
  version: "0.25.0"
4. Create 2 Lifecycle Environments (LCEs) and assign them to the Capsule.
5. Create 3 Content Views, add the repo to each, then publish and promote them to the LCEs from step 4. This gives 6 repos, one for each pulpcore-worker, which is the minimal setup to hit the issue.
6. Trigger a Complete sync on the Capsule.
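For reference, the arithmetic behind this setup can be sketched as below. This is only a rough estimate, not a measurement: it assumes each Content View promoted to each LCE yields one repo on the Capsule, and that memory is split evenly between workers with nothing else on the host using it. The function names are made up for this illustration.

```python
# Values taken from the reproduction steps above.
CONTENT_VIEWS = 3
LIFECYCLE_ENVIRONMENTS = 2
PULPCORE_WORKERS = 6
RAM_GB = 24
SWAP_GB = 4


def repos_to_sync(content_views: int, lifecycle_environments: int) -> int:
    """Assumption: each Content View promoted to each LCE yields one repo."""
    return content_views * lifecycle_environments


def memory_per_worker_gb(total_gb: float, workers: int) -> float:
    """Assumption: memory is split evenly and no other service uses any."""
    return total_gb / workers


repos = repos_to_sync(CONTENT_VIEWS, LIFECYCLE_ENVIRONMENTS)
# True when there are enough repos to keep every worker busy at once.
all_workers_busy = repos >= PULPCORE_WORKERS
ram_share = memory_per_worker_gb(RAM_GB, PULPCORE_WORKERS)
ram_swap_share = memory_per_worker_gb(RAM_GB + SWAP_GB, PULPCORE_WORKERS)

print(f"repos: {repos}, workers: {PULPCORE_WORKERS}, all busy: {all_workers_busy}")
print(f"RAM per worker: {ram_share:.2f} GB")
print(f"RAM + swap per worker: {ram_swap_share:.2f} GB")
```

Under these assumptions every worker is busy at the same time and each one has at most about 4 GB of RAM available (less in practice, since other services share the host).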
Actual behavior:
The sync task fails with "Errors: Pulp task error"
Expected behavior:
One or more of the following:
- the ability to successfully sync more AC repos than the worker count
- lower RAM usage per worker
- minimum RAM / CPU requirements redefined
- a better hint about what happened in the Satellite task result