Subscription Watch / SWATCH-3648

Spike: Investigate the accuracy of the PAYG metrics for alerting

    • Type: Task
    • Resolution: Unresolved
    • Priority: Normal
    • subs-swatch-lightning

      As part of SWATCH-3571, we verified that the accuracy of the following metrics is around 90%:

      • Metered: swatch_metrics_ingested_usage_total
      • Tallied: swatch_tally_tallied_usage_total
      • Remitted Success: swatch_producer_metered_total

      In contrast, the following metrics can't be used for alerting, because usage is "retried" when a previous remittance failed to be submitted:

      • Billing Pending: swatch_billable_usage_total
      • Remitted Failures: swatch_producer_metered_total

      I filed SWATCH-3633 to investigate whether these two metrics can be fixed.

      However, we want to better understand why the accuracy of the first three metrics is only around 90%, and work out how best to write the alerts.

      Acceptance Criteria

      • Update the verification steps from https://docs.google.com/document/d/1liKSpUL1WIRO_MhUKA7OEmNCx4n8foBRmLQEvDpDz6I/edit?usp=sharing to group by date instead of by month, and check whether the accuracy of the data improves
      • Analyse how to write the alert, taking into account that we'll be using only the Grafana metrics. For example, if the metered metric has the value 100 and the tallied metric has the value 20, the difference is 80 (tallied is only 20% of metered), which falls well short of the 90% threshold, so an alert should fire to prompt further investigation.
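
      The two criteria above can be sketched roughly as follows. This is only an illustration, not the actual alert definition. It assumes the metric values have already been read as plain numbers (how they are pulled from Grafana is not shown), that "accuracy" means tallied divided by metered, and that the alert fires when that ratio drops below 90%. The names accuracy, should_alert, daily_accuracy and ACCURACY_THRESHOLD are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Assumption: alert when tallied usage covers less than 90% of metered usage.
ACCURACY_THRESHOLD = 0.90


def accuracy(metered: float, tallied: float) -> float:
    """Return tallied / metered as a ratio.

    Assumption: when nothing was metered there is nothing to compare,
    so the ratio is reported as 1.0 (no alert).
    """
    if metered == 0:
        return 1.0
    return tallied / metered


def should_alert(metered: float, tallied: float,
                 threshold: float = ACCURACY_THRESHOLD) -> bool:
    """True when the accuracy ratio is below the threshold."""
    return accuracy(metered, tallied) < threshold


def daily_accuracy(samples):
    """Group (day, metered, tallied) samples by day instead of by month.

    Returns a dict mapping each day to its accuracy ratio.
    """
    totals = defaultdict(lambda: [0.0, 0.0])
    for day, metered, tallied in samples:
        totals[day][0] += metered
        totals[day][1] += tallied
    return {day: accuracy(m, t) for day, (m, t) in totals.items()}


if __name__ == "__main__":
    # The example from the ticket: metered = 100, tallied = 20.
    print(should_alert(100, 20))

    # Made-up sample values, for illustration only.
    samples = [
        (date(2025, 1, 1), 60, 50),
        (date(2025, 1, 1), 40, 45),
        (date(2025, 1, 2), 100, 20),
    ]
    for day, ratio in sorted(daily_accuracy(samples).items()):
        print(day, ratio, "ALERT" if ratio < ACCURACY_THRESHOLD else "ok")
```

      Comparing ratios rather than absolute differences keeps the same threshold meaningful on both low-usage and high-usage days; whether that matches the intended alert semantics is one of the things this spike would need to confirm.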

              Assignee: Unassigned
              Reporter: Jose Carvajal Hilario (jcarvaja@redhat.com)