-
Task
-
Resolution: Unresolved
-
Major
-
None
-
None
-
8
-
False
-
-
False
-
subs-swatch-thunder
-
-
-
Swatch Thunder Sprint 2
This is required for SWATCH-3607. The idea is to improve the performance of this API which sometimes causes 504 Gateway timeout.
The problem is that to get all the billing_account_ids, we're using the host_tally_buckets (because it's the only table that contains the product_id for the instances). And to get this data, the SQL query is very expensive:
select distinct htb1_0.product_id,htb1_0.billing_provider,htb1_0.billing_account_id from host_tally_buckets htb1_0 join hosts h1_0 on h1_0.id=htb1_0.host_id where h1_0.org_id=? and htb1_0.billing_provider<>'_ANY' and htb1_0.billing_account_id<>'_ANY' and h1_0.last_seen>=? and htb1_0.product_id='rosa' order by htb1_0.billing_provider,htb1_0.billing_account_id
And the cost of running this query having a data set of around 1M of rows in hosts and host_tally_buckets is:
Unique (cost=11912.15..11926.81 rows=118 width=28) -> Gather Merge (cost=11912.15..11926.22 rows=118 width=28) Workers Planned: 2 -> Unique (cost=10912.13..10912.57 rows=59 width=28) -> Sort (cost=10912.13..10912.28 rows=59 width=28) Sort Key: htb1_0.billing_provider, htb1_0.billing_account_id -> Nested Loop (cost=374.31..10910.39 rows=59 width=28) -> Parallel Bitmap Heap Scan on host_tally_buckets htb1_0 (cost=373.89..8351.96 rows=597 width=44) Recheck Cond: ((product_id)::text = 'rosa'::text) Filter: (((billing_provider)::text <> '_ANY'::text) AND ((billing_account_id)::text <> '_ANY'::text)) -> Bitmap Index Scan on idx_buckets_pid (cost=0.00..373.53 rows=33747 width=0) Index Cond: ((product_id)::text = 'rosa'::text) -> Index Only Scan using idx_hosts_view on hosts h1_0 (cost=0.42..4.29 rows=1 width=16) Index Cond: ((org_id = '6340056'::text) AND (id = htb1_0.host_id) AND (last_seen >= '2024-06-28 00:00:00-04'::timestamp with time zone))
We tried to change an existing index as part of SWATCH-3961, but when trying it using a large data set, the cost was exactly the same (the index was not used at all - see more information about this in the JIRA ticket).
Therefore, this task is about to investigate how to improve the performance of this API. I would avoid using the host_tally_buckets and keep the product_id data by host in a new table.
Note that query is using the billing_account_ids from the buckets table, however the host table also has the billing_account_ids, is this duplicated data?
Acceptance Criteria
- Design document about the required changes to improve this API
- Clarify if we can use the billing_account_ids field from the host table
- is related to
-
SWATCH-3961 Improve performance of the "/v1/instances/billing_account_ids" API
-
- Closed
-