Satellite / SAT-38727

[RFE] Dedicated worker for each of the queues


      Problem Statement

      The customer reports that webhook execution is delayed after host creation. Specifically, the event was triggered at 9:59 and the webhook was executed only at 10:13:47.

      ENV: 

      ====================
      satellite-6.15.3.1-2.el8sat.noarch                          Thu Sep  5 08:49:05 2024
      satellite-cli-6.15.3.1-2.el8sat.noarch                      Thu Sep  5 08:43:48 2024
      satellite-common-6.15.3.1-2.el8sat.noarch                   Thu Sep  5 08:49:05 2024
      satellite-installer-6.15.0.2-1.el8sat.noarch                Thu Sep  5 08:46:13 2024
      satellite-lifecycle-6.15.0-1.noarch                         Thu Sep  5 08:49:04 2024
      satellite-maintain-0.0.2-1.el8sat.noarch                    Sun Feb 25 08:54:30 2024
      ====================

       

       Experience & Workflow

      Below is the log of one host creation request and the ForemanWebhooks job it triggered:

      CREATEHOST REQUEST

      ==================
      2024-11-22T08:16:46 [I|app|9003be3d] Started POST "/api/v2/hosts" for x.x.x.x at 2024-11-22 08:16:46 +0100
      1283993:2024-11-22T08:16:46 [I|app|9003be3d] Processing by Api::V2::HostsController#create as JSON
      ==================

      cat var/log/foreman/production.log | grep 9003be3d | grep DeliverWebhookJob

      =============
      2024-11-22T08:16:48 [I|app|9003be3d] Enqueued ForemanWebhooks::DeliverWebhookJob (Job ID: d332879f-81a7-4687-8f45-5e5b730aefe9) to Dynflow(default) with arguments: {:event_name=>"build_entered.event.foreman", :payload=>"", :headers=>"{ "X-Shellhook-Arg-1": "x.x.x.x" }", :url=>"https://x.x.x.x:9090/shellhook/bootdisk_create", :webhook_id=>1}
      2024-11-22T08:34:36 [I|app|9003be3d] Performing ForemanWebhooks::DeliverWebhookJob (Job ID: d332879f-81a7-4687-8f45-5e5b730aefe9) from Dynflow(default) enqueued at 2024-11-22T07:16:48Z with arguments: {:event_name=>"build_entered.event.foreman", :payload=>"", :headers=>"{ "X-Shellhook-Arg-1": "x.x.x.x" }", :url=>"https://x.x.x.x:9090/shellhook/bootdisk_create", :webhook_id=>1}
      2024-11-22T08:34:37 [I|app|9003be3d] Performed ForemanWebhooks::DeliverWebhookJob (Job ID: d332879f-81a7-4687-8f45-5e5b730aefe9) from Dynflow(default) in 555.88ms
      =============

      As the log shows, the host creation request arrived at 08:16:46, and two seconds later, at 2024-11-22T08:16:48, ForemanWebhooks::DeliverWebhookJob was enqueued.

      The job was then picked up only at 08:34:36, about 17 minutes and 48 seconds after it was enqueued. Once started, ForemanWebhooks::DeliverWebhookJob finished within one second (555.88ms).

      I see no "gap" on the enqueue side: for each host provisioning request, the first call to ForemanWebhooks::DeliverWebhookJob was made a few seconds after the initial POST.
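
The time a webhook job spends waiting in the queue can be read off the log by pairing each "Enqueued" line with the "Performing" line that carries the same Job ID. The following is a minimal sketch in Python, not part of Satellite: the function name `queue_waits` and the regular expression are assumptions made for illustration, and the sample lines are the two log lines quoted above, shortened to the part the script needs.

```python
import re
from datetime import datetime, timedelta

# Timestamp, job state and Job ID of a DeliverWebhookJob log line.
# "Performed" lines are deliberately not matched.
LINE_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) \[[^\]]*\] "
    r"(?P<state>Enqueued|Performing) ForemanWebhooks::DeliverWebhookJob "
    r"\(Job ID: (?P<job_id>[0-9a-f-]+)\)"
)


def queue_waits(lines):
    """Return {job_id: time between the Enqueued and Performing lines}."""
    enqueued = {}
    waits = {}
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        ts = datetime.strptime(match["ts"], "%Y-%m-%dT%H:%M:%S")
        job_id = match["job_id"]
        if match["state"] == "Enqueued":
            enqueued[job_id] = ts
        elif job_id in enqueued:
            waits[job_id] = ts - enqueued[job_id]
    return waits


# The two log lines from this ticket, truncated after the queue name.
SAMPLE_LINES = [
    "2024-11-22T08:16:48 [I|app|9003be3d] Enqueued ForemanWebhooks::DeliverWebhookJob "
    "(Job ID: d332879f-81a7-4687-8f45-5e5b730aefe9) to Dynflow(default)",
    "2024-11-22T08:34:36 [I|app|9003be3d] Performing ForemanWebhooks::DeliverWebhookJob "
    "(Job ID: d332879f-81a7-4687-8f45-5e5b730aefe9) from Dynflow(default)",
]

if __name__ == "__main__":
    for job_id, wait in queue_waits(SAMPLE_LINES).items():
        print(job_id, wait)  # d332879f-81a7-4687-8f45-5e5b730aefe9 0:17:48
```

Feeding the script the full production.log instead of the two sample lines would list the wait of every webhook job, which should make it easier to tell whether the delays coincide with large remote execution runs.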

       

      Requirements

      It seems that webhook tasks are handled together with remote jobs.
      If you go to the URL https://<satellite host>/foreman_tasks/sidekiq/busy
      there is a worker listed with: Queues: default, remote_execution
      This one is used for remote jobs and for webhooks too, so the problem was that the queue had filled up with remote jobs and the webhook task was queued, waiting for execution.
      From the customer's point of view this is a bad approach: webhooks should not be placed in the same queue as remote jobs. They should be handled separately, as is done for "Queues: hosts_queue".
      Webhooks are related to actions taken on hosts (created, built, destroyed, etc.), and I really expect them to be executed quickly, since they may be needed for correct server provisioning.
      Remote jobs are not as urgent; it is understandable if they stay in the queue longer.

      At the moment the customer has increased foreman-dynflow-worker-instances to 5, but I would prefer to put webhook tasks on a separate queue; this would be the best solution by far.

      Is it possible not to place webhooks in the same queue as remote jobs? They should be handled separately, as is done for "Queues: hosts_queue".
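
The effect described in this section can be illustrated with a toy model. The sketch below is written in Python and does not reproduce Dynflow's or Sidekiq's actual scheduling: it assumes a plain first-in-first-out queue served by identical workers, and the backlog of 100 remote jobs at 60 seconds each and the 0.5 second webhook are invented numbers. The function name `start_time_of_last_job` is likewise hypothetical.

```python
import heapq


def start_time_of_last_job(durations, workers):
    """Simulate a FIFO queue served by `workers` identical workers.

    All jobs are assumed to be queued at time 0, in list order.
    Returns the time (in seconds) at which the last job starts.
    """
    free_at = [0.0] * workers  # min-heap: when each worker becomes free
    heapq.heapify(free_at)
    start = 0.0
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        heapq.heappush(free_at, start + duration)
    return start


# Invented workload: a backlog of remote jobs queued ahead of one webhook.
REMOTE_JOBS = [60.0] * 100
WEBHOOK = 0.5

# Shared queue, one worker: the webhook waits for the whole backlog.
shared_1 = start_time_of_last_job(REMOTE_JOBS + [WEBHOOK], workers=1)
# Shared queue, five workers: the wait shrinks but does not disappear.
shared_5 = start_time_of_last_job(REMOTE_JOBS + [WEBHOOK], workers=5)
# Dedicated queue and worker: the webhook starts immediately.
dedicated = start_time_of_last_job([WEBHOOK], workers=1)

if __name__ == "__main__":
    print(shared_1, shared_5, dedicated)  # 6000.0 1200.0 0.0
```

In this model, adding workers only divides the wait by the number of workers, while a dedicated queue removes the dependency on the remote job backlog entirely, which is the behaviour this RFE asks for.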

      Business Impact

      Slowness of processes

              Unassigned Unassigned
              rhn-support-gformisa Giovanni Formisano