Satellite / SAT-18330

Improve performance of SCAP report processing


    • Important
    • No Coverage
    • None

      Description of problem:
      Processing an uploaded OpenSCAP report is very inefficient. On a medium-sized Satellite (5k hosts) with reports regularly deleted, a single report upload takes minutes, for example:

      2020-01-14T06:30:03 [I|app|f6a9267e] Started POST "/api/v2/compliance/arf_reports/cf16fd84-bd71-a007-b8d0-b2a2ed5da52d/11/1579040998" for 1.2.3.4 at 2020-01-14 06:30:03 +0200
      2020-01-14T06:30:04 [I|app|f6a9267e] Processing by Api::V2::Compliance::ArfReportsController#create as JSON
      2020-01-14T06:30:04 [I|app|f6a9267e] Parameters: {"logs"=>"[FILTERED]", "digest"=>"65733bb1acf16fd84cb9fea719891f29c5cecf16fd8436149eb4987663d2b5bf", "metrics"=>{"passed"=>159, "failed"=>168, "othered"=>3}, ..}
      2020-01-14T06:35:51 [I|app|f6a9267e] Completed 200 OK in 347313ms (Views: 0.2ms | ActiveRecord: 339023.1ms)
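To make the numbers in the "Completed" line concrete, here is a minimal Python sketch that extracts the total and ActiveRecord times and computes the share of the request spent in the database. The regular expression is an assumption based only on the log line quoted above, and `db_share` is a hypothetical helper name, not part of Foreman or Satellite.

```python
import re

# Final line of the request log quoted above.
line = ('2020-01-14T06:35:51 [I|app|f6a9267e] Completed 200 OK in 347313ms '
        '(Views: 0.2ms | ActiveRecord: 339023.1ms)')

# Pattern for the "Completed" summary, assumed from the excerpt in this report.
COMPLETED = re.compile(
    r'Completed (?P<status>\d{3}) .*? in (?P<total>[\d.]+)ms '
    r'\(Views: (?P<views>[\d.]+)ms \| ActiveRecord: (?P<db>[\d.]+)ms\)'
)

def db_share(log_line):
    """Return (total_ms, db_ms, db_fraction), or None if the line does not match."""
    m = COMPLETED.search(log_line)
    if m is None:
        return None
    total = float(m.group('total'))
    db = float(m.group('db'))
    return total, db, db / total

total_ms, db_ms, fraction = db_share(line)
print(f"total {total_ms / 1000:.1f}s, database {db_ms / 1000:.1f}s ({fraction:.1%})")
```

For the request above this gives roughly 347 s in total, of which about 339 s (close to 98%) is spent in ActiveRecord, so nearly all of the time goes to database work.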

      The report upload above ran while 5 other reports were being uploaded in parallel (not a heavy load), with the relevant tables holding these numbers of records:

      messages 855
      hosts 4623
      foreman_openscap_asset_policies 39950
      reports 51545
      foreman_openscap_policy_arf_reports 165421
      logs 1995115

      PostgreSQL was slow on statements like:

      2020-01-14 06:30:22 CST LOG: duration: 879.003 ms execute <unnamed>: SELECT "logs".* FROM "logs" WHERE "logs"."source_id" = $1 ORDER BY logs.id, "logs"."id" DESC LIMIT $2

      2020-01-14 06:34:39 CST LOG: duration: 268178.182 ms execute <unnamed>: UPDATE "messages" SET "description" = $1, "rationale" = $2, "scap_references" = $3 WHERE "messages"."id" = $4

      The cause is evident: ARF reports are split into lines and stored in the "messages" table. Manipulating the "messages" table is cumbersome even when it is small and regularly cleaned.

      lzap wrote a preliminary version of a patch to boost the performance in https://community.theforeman.org/t/rfc-optimized-reports-storage/15573 .

      Version-Release number of selected component (if applicable):
      Sat 6.6.1

      How reproducible:
      100% on a mid-scale environment

      Steps to Reproduce:
      1. ??? Have hundreds of OpenSCAP clients / hosts with a few different SCAP profiles, so that the SCAP logs vary a bit
      2. On a client, check time of SCAP client execution:

      time /usr/bin/foreman_scap_client 1

      3. Optionally, check the processing time in production.log for ArfReportsController#create (see above)
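Step 3 can be scripted. The following Python sketch pairs each "Processing by ...ArfReportsController#create" line with the "Completed" line that carries the same request ID and reports the duration. The line layout is assumed from the log excerpt above, and `arf_create_durations` is a hypothetical helper name.

```python
import re

# Line shapes assumed from the log excerpt in this report:
# "<timestamp> [<level>|<logger>|<request id>] <message>"
PREFIX = r'^\s*(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) \[\w\|\w+\|(?P<rid>\w+)\] '
PROCESSING = re.compile(
    PREFIX + r'Processing by Api::V2::Compliance::ArfReportsController#create')
COMPLETED = re.compile(
    PREFIX + r'Completed (?P<status>\d{3}) .*? in (?P<ms>[\d.]+)ms')

def arf_create_durations(lines):
    """Yield (timestamp, request_id, status, seconds) for each ARF report upload."""
    pending = set()  # request IDs seen in a matching "Processing by" line
    for line in lines:
        m = PROCESSING.match(line)
        if m:
            pending.add(m.group('rid'))
            continue
        m = COMPLETED.match(line)
        if m and m.group('rid') in pending:
            pending.discard(m.group('rid'))
            yield (m.group('ts'), m.group('rid'),
                   int(m.group('status')), float(m.group('ms')) / 1000)

# Sample input: two lines copied from the excerpt above.
sample = [
    '2020-01-14T06:30:04 [I|app|f6a9267e] Processing by '
    'Api::V2::Compliance::ArfReportsController#create as JSON',
    '2020-01-14T06:35:51 [I|app|f6a9267e] Completed 200 OK in 347313ms '
    '(Views: 0.2ms | ActiveRecord: 339023.1ms)',
]
results = list(arf_create_durations(sample))
for ts, rid, status, seconds in results:
    print(f"{ts} request {rid}: HTTP {status} after {seconds:.1f}s")
```

To run it against a real log, pass an open file object for production.log instead of `sample`; the function only iterates over lines.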

      Actual results:
      foreman_scap_client can even time out during upload (cf. bz1703951); with an increased timeout, the upload takes up to several minutes

      Expected results:
      An upload takes seconds, at most a few tens of seconds.

      Additional info:

              jira-bugzilla-migration RH Bugzilla Integration
              rhn-support-pmoravec Pavel Moravec
              RH Bugzilla Integration RH Bugzilla Integration