OCP Technical Release Team / TRT-1104

Partition the TestRuns table in BigQuery


    • Type: Story
    • Resolution: Done
    • Priority: Major

      We have a scheduled query running every hour to populate the table that is eventually used by test run aggregation to check historical pass rates.

      This query essentially scans TestRuns and JobRuns. TestRuns contains 204 GB, dating back to May 2021. Run hourly, the query scans about 3.5 TB per day in total, costing us just under $20 a day, or about $600/mo.
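
      As a rough check on the figures above, the arithmetic can be sketched in Python. The constants are the numbers quoted in this description; the function names and the linear-scaling assumption in projected_daily_cost are illustrative only, not something the team has measured.

```python
# Figures quoted in the issue description.
DAILY_SCAN_TB = 3.5     # total scanned per day across all hourly runs
DAILY_COST_USD = 20.0   # "just under $20 a day", treated here as an upper bound
RUNS_PER_DAY = 24       # the scheduled query runs every hour


def monthly_cost(daily_cost_usd: float, days: int = 30) -> float:
    """Scale a daily cost to a month of the given length."""
    return daily_cost_usd * days


def implied_price_per_tb(daily_cost_usd: float, daily_scan_tb: float) -> float:
    """Effective price per TB scanned that the quoted figures imply."""
    return daily_cost_usd / daily_scan_tb


def projected_daily_cost(daily_cost_usd: float, retained_fraction: float) -> float:
    """Daily cost if each run scanned only `retained_fraction` of today's bytes.

    Assumption (hypothetical): cost scales linearly with bytes scanned.
    """
    if not 0.0 <= retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be between 0 and 1")
    return daily_cost_usd * retained_fraction


if __name__ == "__main__":
    print(f"monthly cost:   ${monthly_cost(DAILY_COST_USD):.0f}")
    print(f"scan per run:   {DAILY_SCAN_TB / RUNS_PER_DAY * 1000:.0f} GB")
    print(f"implied $/TB:   {implied_price_per_tb(DAILY_COST_USD, DAILY_SCAN_TB):.2f}")
    # 0.25 is an arbitrary example fraction, not a measured value.
    print(f"at 25% of data: ${projected_daily_cost(DAILY_COST_USD, 0.25):.2f}/day")
```

      The retained fraction is left as a parameter because the description does not say how the 204 GB is distributed over time.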

      In theory, this data could be used to check a test's historical pass rate on the GA date of a past release. We've never done this as far as I know, but it could be useful.

      Propose we: (a) snapshot the TestRuns and JobRuns tables, (b) delete everything older than 6 months, (c) repeat this process periodically as a team task.

      We will pay for snapshot storage, but this is not really a concern. We could delete older snapshots or have them auto-expire on some date in the future. Snapshots can also be queried.
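
      The snapshot-then-delete steps proposed above could be sketched as a small Python helper that only builds the BigQuery SQL statements, so they can be reviewed before anyone runs them. The dataset name, the timestamp column name, the snapshot naming scheme, the 365-day snapshot expiry, and the 180-day retention window are all assumptions made for illustration; the real schema and retention policy may differ.

```python
from datetime import datetime, timedelta, timezone

TS_FORMAT = "%Y-%m-%d %H:%M:%S+00"  # UTC timestamp literal accepted by BigQuery


def snapshot_sql(dataset: str, table: str, now: datetime, keep_days: int = 365) -> str:
    """Build a CREATE SNAPSHOT TABLE statement whose snapshot auto-expires."""
    now = now.astimezone(timezone.utc)  # `now` should be timezone-aware
    stamp = now.strftime("%Y%m%d")
    expires = (now + timedelta(days=keep_days)).strftime(TS_FORMAT)
    return (
        f"CREATE SNAPSHOT TABLE `{dataset}.{table}_snapshot_{stamp}`\n"
        f"CLONE `{dataset}.{table}`\n"
        f"OPTIONS (expiration_timestamp = TIMESTAMP '{expires}')"
    )


def delete_sql(dataset: str, table: str, ts_column: str, now: datetime,
               retain_days: int = 180) -> str:
    """Build a DELETE statement that removes rows older than the retention window."""
    now = now.astimezone(timezone.utc)
    cutoff = (now - timedelta(days=retain_days)).strftime(TS_FORMAT)
    return (
        f"DELETE FROM `{dataset}.{table}`\n"
        f"WHERE {ts_column} < TIMESTAMP '{cutoff}'"
    )


def cleanup_plan(dataset: str, tables: dict, now: datetime) -> list:
    """Return all snapshot statements first, then all delete statements.

    `tables` maps table name -> timestamp column name (hypothetical columns).
    """
    snapshots = [snapshot_sql(dataset, t, now) for t in tables]
    deletes = [delete_sql(dataset, t, col, now) for t, col in tables.items()]
    return snapshots + deletes


if __name__ == "__main__":
    # "ci_data" and "start_time" are placeholder names, not the real schema.
    plan = cleanup_plan(
        "ci_data",
        {"TestRuns": "start_time", "JobRuns": "start_time"},
        datetime.now(timezone.utc),
    )
    for statement in plan:
        print(statement + ";\n")
```

      Ordering the statements so that every snapshot is created before any delete runs keeps the old rows recoverable if a delete turns out to be too aggressive.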

      Full details in: https://redhat-internal.slack.com/archives/C02K89U2EV8/p1687522102415719

              rhn-engineering-dgoodwin Devan Goodwin