[backport] Update compaction strategies for data tables


    • Type: Enhancement
    • Status: Closed
    • Priority: Critical
    • Resolution: Done
    • Affects Version/s: 0.22.0
    • Fix Version/s: 0.21.8
    • Component/s: Core
    • Labels:


      In early versions of Hawkular Metrics, well before we had the data_compressed table, we used size tiered compaction (STCS) for the data table. STCS works well for write-heavy workloads, but it can be problematic with expiring data, which is common in time series data models like ours. In the 0.19.0 release, we switched over to date tiered compaction (DTCS), which was designed specifically for time series data models that use a global TTL. That is exactly what we have had in OpenShift.

      Things changed, though, when we introduced compression. When compressed data is written out to the data_compressed table, we explicitly delete the columns in the data table that have been compressed. Because of those explicit deletes, DTCS may no longer be the best strategy for the data table.

      The data_compressed table is append-only and uses TTL for expiring data, so STCS is probably not the best option for it. DTCS is now deprecated in favor of time window compaction (TWCS), so TWCS is probably what we want for this table.
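
      As a rough illustration of what switching data_compressed to TWCS could look like, the sketch below builds the CQL statement as a string. The keyspace name hawkular_metrics, the one-day window, and the helper name twcs_alter_statement are assumptions made for the example, not settings decided in this ticket. The compaction option names are the ones documented for Cassandra's TimeWindowCompactionStrategy.

```python
VALID_WINDOW_UNITS = ("MINUTES", "HOURS", "DAYS")


def twcs_alter_statement(keyspace, table, window_unit="DAYS", window_size=1):
    """Build a CQL ALTER TABLE statement that switches a table to TWCS.

    The defaults (1-day windows) are illustrative only; a real window size
    should be chosen from the table's TTL and write volume.
    """
    if window_unit not in VALID_WINDOW_UNITS:
        raise ValueError(f"window_unit must be one of {VALID_WINDOW_UNITS}")
    if window_size < 1:
        raise ValueError("window_size must be a positive integer")

    options = {
        "class": "TimeWindowCompactionStrategy",
        "compaction_window_unit": window_unit,
        "compaction_window_size": str(window_size),
    }
    # CQL map literals use single-quoted keys and values.
    rendered = ", ".join(f"'{key}': '{value}'" for key, value in options.items())
    return f"ALTER TABLE {keyspace}.{table} WITH compaction = {{{rendered}}};"


# Hypothetical keyspace name; data_compressed is the table discussed above.
print(twcs_alter_statement("hawkular_metrics", "data_compressed"))
```

      The generated statement would still need to be reviewed and run through cqlsh or a driver. Choosing the window so that the TTL spans a modest number of windows is a common rule of thumb, but the right value here is left open.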

      I have marked this as critical because disk usage and management have been a hot topic for us. Choosing an appropriate and well-tuned compaction strategy can make a huge difference. For an example of how much of a difference compaction can make, read this summary from a user who switched from DTCS to TWCS. The numbers are staggering.

                • Assignee:
                  john.sanda John Sanda
                • Votes:
                  0
                • Watchers:
                  1


                  • Created: