Current Setup
We have UI partitioned tables for each provider type by service or account. These tables hold the cost data by one or more keys (usage_start + account/service) at a highly granular level meaning that there are potentially many records in each UI table partition.
Problem Domain
At a large scale (thousands or accounts, thousands of records per service) we are running into query inefficiencies that are causing timeouts of the call to the endpoint.
We need an efficient way to get an ordered list of accounts | sub-guids | projects. etc ordered by total cost (or other defined order-by condition) for a specified time window.
This list generation query needs to be able to return within enough time for the api to post-process the data and return to the caller.
Acceptance Criteria
Queries over 10s or 100s of thousands of record keys within a time window responding well within the API timeout.
Defined, documented, maintainable process by which we can perform these queries and/or data processing.
Zero or minimal UI change for the graph on the cost explorer.