Bug
Resolution: Done
Major
Description of problem:
I am suggesting an improvement to how pulp tasks are scheduled. I am not sure if or how this applies to pulp-3, but in pulp-2, given the way katello requests a repo sync task, the sync task automatically spawns a publish task in a suboptimal way. Consider this example:
- multiple repos are synced concurrently (e.g. by a CV publish/promote, a Capsule sync, a Sync plan or similar); say there are 10 such tasks
- there are, say, 4 pulp workers, so 6 sync tasks wait in the resource_manager backlog
- the first sync task to complete spawns a publish repo task at the end of the sync
- this new task is added to the end of the resource_manager's queue
- the other sync tasks are processed, and only then does the first publish task get to a pulp worker
We should re-order task execution so that the publish task is executed immediately after the sync task, and by the same worker. Rationale:
- a repo with new content will be available to clients sooner. Other repos won't be affected, as they will be re-published at the same time (the re-ordering of tasks happens before the other publish tasks, right?), so this is a kind of optimisation
- also, dynflow waits for completion of the whole sync+publish pair of tasks; shortening this period saves some polling from dynflow to pulp, and fewer dynflow steps will be running concurrently at any given time
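To illustrate the difference, here is a minimal toy simulation in Python. It is not pulp code: the function name `simulate`, the fixed task durations (6 s per sync, 1 s per publish) and the simple FIFO queue are all assumptions made for the sketch. It compares appending the publish task to the end of the queue with running it right away on the same worker, for the 10 repos and 4 workers from the example above.

```python
import heapq
from collections import deque


def simulate(num_repos, num_workers, sync_secs, publish_secs, chain_publish):
    """Toy model: return {repo: time at which its publish completed}.

    chain_publish=False: a finished sync appends its publish to the queue tail.
    chain_publish=True:  the same worker runs the publish right after the sync.
    """
    queue = deque(("sync", repo) for repo in range(num_repos))
    idle = list(range(num_workers))
    running = []  # heap of (finish_time, seq, worker, kind, repo)
    seq = 0       # tie-breaker so tasks finishing together keep start order
    now = 0.0
    published_at = {}

    def start(worker, kind, repo):
        nonlocal seq
        duration = sync_secs if kind == "sync" else publish_secs
        heapq.heappush(running, (now + duration, seq, worker, kind, repo))
        seq += 1

    while queue or running:
        # Hand queued tasks to idle workers in FIFO order.
        while idle and queue:
            kind, repo = queue.popleft()
            start(idle.pop(), kind, repo)
        # Advance time to the next task completion.
        now, _, worker, kind, repo = heapq.heappop(running)
        if kind == "publish":
            published_at[repo] = now
            idle.append(worker)
        elif chain_publish:
            start(worker, "publish", repo)  # worker stays busy with the publish
        else:
            queue.append(("publish", repo))  # behaviour described in this report
            idle.append(worker)
    return published_at


fifo = simulate(10, 4, 6, 1, chain_publish=False)
chained = simulate(10, 4, 6, 1, chain_publish=True)
print("first repo published at (queue tail):", fifo[0])   # 13.0
print("first repo published at (chained):  ", chained[0])  # 7.0
```

With these made-up durations the first repo is published after 7 s instead of 13 s. The numbers only illustrate the queueing effect; real sync and publish times vary per repo.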
This step has another inefficiency: the publish task is spawned even when no content has been synced or changed. This information is known to the sync task, yet a no-op publish task is spawned that completes in zero time with 'Skipped: Repository content has not changed since last publish.'. The sync task should avoid spawning such a no-op publish, as an optimization.
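The idea of skipping the no-op publish could look roughly like the sketch below. Everything in it is hypothetical: `SyncResult`, its counters and `follow_up_tasks` are illustrative names, not the pulp-2 API. It only shows the decision being made from information the sync already has.

```python
from dataclasses import dataclass


@dataclass
class SyncResult:
    """Hypothetical summary of what a sync changed in a repo."""
    repo_id: str
    added: int = 0
    removed: int = 0
    updated: int = 0

    @property
    def changed(self):
        return (self.added + self.removed + self.updated) > 0


def follow_up_tasks(result):
    """Return the tasks to spawn after a sync; none if nothing changed."""
    if not result.changed:
        return []  # skip the publish that would only report 'Skipped: ...'
    return [("publish", result.repo_id)]


print(follow_up_tasks(SyncResult("repo-a")))           # []
print(follow_up_tasks(SyncResult("repo-b", added=3)))  # [('publish', 'repo-b')]
```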
Version-Release number of selected component (if applicable):
Sat 6.7.0
How reproducible:
100%
Steps to Reproduce:
0. The steps below can be applied to any pulp server, Satellite or Capsule.
1. Optionally, to see the behaviour more clearly, artificially limit pulp to one worker: e.g. in /etc/default/pulp_workers, set PULP_CONCURRENCY=1 and restart pulp services.
2. Trigger several repo syncs. Depending on where you experiment, try a Sync plan (on Sat), promote a CV with many repos (on Sat), or invoke a Capsule sync to a new Capsule.
3. Once the bulk action completes, find the sequence of sync+publish tasks (e.g. by expanding the dynflow steps of the foreman task).
Actual results:
Step 3 shows me:
pulp:action:sync repo_id:8c1fb341-6b7b-43a1-bee8-28c1fc58ad8e started:'2020-06-05T16:21:57Z' completed:'2020-06-05T16:22:06Z'
pulp:action:publish repo_id:8c1fb341-6b7b-43a1-bee8-28c1fc58ad8e started:'2020-06-05T16:23:21Z' completed:'2020-06-05T16:23:21Z'
pulp:action:sync repo_id:a69cea2d-7ba4-4cff-85e4-d96ad30ca5ca started:'2020-06-05T16:22:06Z' completed:'2020-06-05T16:22:14Z'
pulp:action:publish repo_id:a69cea2d-7ba4-4cff-85e4-d96ad30ca5ca started:'2020-06-05T16:23:21Z' completed:'2020-06-05T16:23:21Z'
pulp:action:sync repo_id:4e20a22d-81a5-4b78-ad4d-b8f9ff3df953 started:'2020-06-05T16:22:15Z' completed:'2020-06-05T16:22:21Z'
pulp:action:publish repo_id:4e20a22d-81a5-4b78-ad4d-b8f9ff3df953 started:'2020-06-05T16:23:21Z' completed:'2020-06-05T16:23:21Z'
pulp:action:sync repo_id:8c1e1c97-ecf2-4885-b7f8-8b508be0b837 started:'2020-06-05T16:22:21Z' completed:'2020-06-05T16:22:27Z'
pulp:action:publish repo_id:8c1e1c97-ecf2-4885-b7f8-8b508be0b837 started:'2020-06-05T16:23:22Z' completed:'2020-06-05T16:23:22Z'
pulp:action:sync repo_id:0ef874dc-bf93-4e8f-a605-b2582d7a4929 started:'2020-06-05T16:21:50Z' completed:'2020-06-05T16:21:56Z'
pulp:action:publish repo_id:0ef874dc-bf93-4e8f-a605-b2582d7a4929 started:'2020-06-05T16:23:20Z' completed:'2020-06-05T16:23:20Z'
pulp:action:sync repo_id:5eb348a7-5e0c-443b-a46f-1a15d685af5f started:'2020-06-05T16:22:34Z' completed:'2020-06-05T16:22:40Z'
pulp:action:publish repo_id:5eb348a7-5e0c-443b-a46f-1a15d685af5f started:'2020-06-05T16:23:22Z' completed:'2020-06-05T16:23:22Z'
pulp:action:sync repo_id:19ab5ba9-b941-41d0-bb82-9370e4f3f7a5 started:'2020-06-05T16:22:47Z' completed:'2020-06-05T16:22:55Z'
pulp:action:publish repo_id:19ab5ba9-b941-41d0-bb82-9370e4f3f7a5 started:'2020-06-05T16:23:23Z' completed:'2020-06-05T16:23:23Z'
pulp:action:sync repo_id:574bb92b-fc6a-4d49-afb8-b4d903155bc3 started:'2020-06-05T16:22:27Z' completed:'2020-06-05T16:22:33Z'
pulp:action:publish repo_id:574bb92b-fc6a-4d49-afb8-b4d903155bc3 started:'2020-06-05T16:23:22Z' completed:'2020-06-05T16:23:22Z'
pulp:action:sync repo_id:7160dfcf-9025-4d95-ac1f-14b97829c5d3 started:'2020-06-05T16:22:40Z' completed:'2020-06-05T16:22:46Z'
pulp:action:publish repo_id:7160dfcf-9025-4d95-ac1f-14b97829c5d3 started:'2020-06-05T16:23:22Z' completed:'2020-06-05T16:23:22Z'
pulp:action:sync repo_id:42ffcaa8-4979-4d07-9538-221604ad8522 started:'2020-06-05T16:22:55Z' completed:'2020-06-05T16:23:01Z'
pulp:action:publish repo_id:42ffcaa8-4979-4d07-9538-221604ad8522 started:'2020-06-05T16:23:23Z' completed:'2020-06-05T16:23:23Z'
pulp:action:sync repo_id:4bf21816-6c63-4444-8e65-ad56a53201ff started:'2020-06-05T16:23:02Z' completed:'2020-06-05T16:23:20Z'
pulp:action:publish repo_id:4bf21816-6c63-4444-8e65-ad56a53201ff started:'2020-06-05T16:23:23Z' completed:'2020-06-05T16:23:23Z'
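When I sort it with sort -nrk3 (since -n cannot parse the quoted timestamps, this effectively groups the syncs ahead of the publishes rather than ordering strictly by start time):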
pulp:action:sync repo_id:a69cea2d-7ba4-4cff-85e4-d96ad30ca5ca started:'2020-06-05T16:22:06Z' completed:'2020-06-05T16:22:14Z'
pulp:action:sync repo_id:8c1fb341-6b7b-43a1-bee8-28c1fc58ad8e started:'2020-06-05T16:21:57Z' completed:'2020-06-05T16:22:06Z'
pulp:action:sync repo_id:8c1e1c97-ecf2-4885-b7f8-8b508be0b837 started:'2020-06-05T16:22:21Z' completed:'2020-06-05T16:22:27Z'
pulp:action:sync repo_id:7160dfcf-9025-4d95-ac1f-14b97829c5d3 started:'2020-06-05T16:22:40Z' completed:'2020-06-05T16:22:46Z'
pulp:action:sync repo_id:5eb348a7-5e0c-443b-a46f-1a15d685af5f started:'2020-06-05T16:22:34Z' completed:'2020-06-05T16:22:40Z'
pulp:action:sync repo_id:574bb92b-fc6a-4d49-afb8-b4d903155bc3 started:'2020-06-05T16:22:27Z' completed:'2020-06-05T16:22:33Z'
pulp:action:sync repo_id:4e20a22d-81a5-4b78-ad4d-b8f9ff3df953 started:'2020-06-05T16:22:15Z' completed:'2020-06-05T16:22:21Z'
pulp:action:sync repo_id:4bf21816-6c63-4444-8e65-ad56a53201ff started:'2020-06-05T16:23:02Z' completed:'2020-06-05T16:23:20Z'
pulp:action:sync repo_id:42ffcaa8-4979-4d07-9538-221604ad8522 started:'2020-06-05T16:22:55Z' completed:'2020-06-05T16:23:01Z'
pulp:action:sync repo_id:19ab5ba9-b941-41d0-bb82-9370e4f3f7a5 started:'2020-06-05T16:22:47Z' completed:'2020-06-05T16:22:55Z'
pulp:action:sync repo_id:0ef874dc-bf93-4e8f-a605-b2582d7a4929 started:'2020-06-05T16:21:50Z' completed:'2020-06-05T16:21:56Z'
pulp:action:publish repo_id:a69cea2d-7ba4-4cff-85e4-d96ad30ca5ca started:'2020-06-05T16:23:21Z' completed:'2020-06-05T16:23:21Z'
pulp:action:publish repo_id:8c1fb341-6b7b-43a1-bee8-28c1fc58ad8e started:'2020-06-05T16:23:21Z' completed:'2020-06-05T16:23:21Z'
pulp:action:publish repo_id:8c1e1c97-ecf2-4885-b7f8-8b508be0b837 started:'2020-06-05T16:23:22Z' completed:'2020-06-05T16:23:22Z'
pulp:action:publish repo_id:7160dfcf-9025-4d95-ac1f-14b97829c5d3 started:'2020-06-05T16:23:22Z' completed:'2020-06-05T16:23:22Z'
pulp:action:publish repo_id:5eb348a7-5e0c-443b-a46f-1a15d685af5f started:'2020-06-05T16:23:22Z' completed:'2020-06-05T16:23:22Z'
pulp:action:publish repo_id:574bb92b-fc6a-4d49-afb8-b4d903155bc3 started:'2020-06-05T16:23:22Z' completed:'2020-06-05T16:23:22Z'
pulp:action:publish repo_id:4e20a22d-81a5-4b78-ad4d-b8f9ff3df953 started:'2020-06-05T16:23:21Z' completed:'2020-06-05T16:23:21Z'
pulp:action:publish repo_id:4bf21816-6c63-4444-8e65-ad56a53201ff started:'2020-06-05T16:23:23Z' completed:'2020-06-05T16:23:23Z'
pulp:action:publish repo_id:42ffcaa8-4979-4d07-9538-221604ad8522 started:'2020-06-05T16:23:23Z' completed:'2020-06-05T16:23:23Z'
pulp:action:publish repo_id:19ab5ba9-b941-41d0-bb82-9370e4f3f7a5 started:'2020-06-05T16:23:23Z' completed:'2020-06-05T16:23:23Z'
pulp:action:publish repo_id:0ef874dc-bf93-4e8f-a605-b2582d7a4929 started:'2020-06-05T16:23:20Z' completed:'2020-06-05T16:23:20Z'
Note that all repos were synced first, and only then was the very first synced repo published. In the meantime, consumers were still served the old content.
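To quantify the gap, the listing can be parsed and the delay between the end of each sync and the start of the matching publish computed. The following is a small sketch; the helper names `parse_ts` and `publish_delays` are made up for illustration, and the three sample lines per action are copied from the output above.

```python
import re
from datetime import datetime

LINE_RE = re.compile(
    r"pulp:action:(\w+) repo_id:(\S+) started:'([^']+)' completed:'([^']+)'"
)

SAMPLE = """\
pulp:action:sync repo_id:8c1fb341-6b7b-43a1-bee8-28c1fc58ad8e started:'2020-06-05T16:21:57Z' completed:'2020-06-05T16:22:06Z'
pulp:action:publish repo_id:8c1fb341-6b7b-43a1-bee8-28c1fc58ad8e started:'2020-06-05T16:23:21Z' completed:'2020-06-05T16:23:21Z'
pulp:action:sync repo_id:0ef874dc-bf93-4e8f-a605-b2582d7a4929 started:'2020-06-05T16:21:50Z' completed:'2020-06-05T16:21:56Z'
pulp:action:publish repo_id:0ef874dc-bf93-4e8f-a605-b2582d7a4929 started:'2020-06-05T16:23:20Z' completed:'2020-06-05T16:23:20Z'
pulp:action:sync repo_id:4bf21816-6c63-4444-8e65-ad56a53201ff started:'2020-06-05T16:23:02Z' completed:'2020-06-05T16:23:20Z'
pulp:action:publish repo_id:4bf21816-6c63-4444-8e65-ad56a53201ff started:'2020-06-05T16:23:23Z' completed:'2020-06-05T16:23:23Z'
""".splitlines()


def parse_ts(text):
    """Parse timestamps of the form 2020-06-05T16:21:57Z."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")


def publish_delays(lines):
    """Return [(repo_id, seconds from sync completion to publish start)],
    ordered by the time each sync started."""
    syncs, publishes = {}, {}
    for line in lines:
        match = LINE_RE.search(line)
        if not match:
            continue
        action, repo, started, completed = match.groups()
        target = syncs if action == "sync" else publishes
        target[repo] = (parse_ts(started), parse_ts(completed))
    ordered = sorted(syncs, key=lambda repo: syncs[repo][0])
    return [
        (repo, (publishes[repo][0] - syncs[repo][1]).total_seconds())
        for repo in ordered
        if repo in publishes
    ]


for repo, delay in publish_delays(SAMPLE):
    print(f"{repo}: publish started {delay:.0f}s after sync completed")
```

For these three repos the delays are 84 s, 75 s and 3 s: the earlier a repo was synced, the longer its publish waited.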
Expected results:
The publish of a repo follows immediately after the sync of the same repo.
Additional info: