Type: Story
Resolution: Done
Priority: Undefined
Fix Version/s: None
Affects Version/s: None
Component/s: insights-on-prem
Labels:
- cost-onprem-0.2

Epic Link:
CoP - Cost Management Trino replacement
Blocked:
False
Blocked Reason:
None
Ready:
False

Explore Python, by way of a PoC, as an alternative to the current SQL + Postgres approach, with the goal of fully replacing Trino and Hive with a single solution for both on-prem and SaaS. The PoC should follow this flow:
CSV -> Parquet -> Python aggregation -> Postgres DB inserts
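
The aggregation and insert stages of the flow above could be sketched roughly as follows. This is a stdlib-only illustration, not the PoC itself. The column names (`namespace`, `cpu_usage`), the `daily_summary` table, and the helper functions are all hypothetical. The Parquet stage is left out because it needs a third-party library such as pyarrow, and `sqlite3` stands in for Postgres so the example runs anywhere (a Postgres driver would typically use `%s` placeholders rather than `?`).

```python
import csv
import io
import sqlite3
from collections import defaultdict

# Hypothetical sample input; real reports would be generated with nise.
SAMPLE_CSV = """namespace,cpu_usage
app-a,1.5
app-b,2.0
app-a,0.5
"""


def aggregate(csv_text):
    """Sum cpu_usage per namespace (a stand-in for the real aggregation)."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["namespace"]] += float(row["cpu_usage"])
    return dict(totals)


def insert_summary(conn, totals):
    """Write the aggregated rows with a parameterized bulk insert."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_summary (namespace TEXT, cpu_usage REAL)"
    )
    conn.executemany(
        "INSERT INTO daily_summary (namespace, cpu_usage) VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # sqlite3 is only a stand-in for Postgres
insert_summary(conn, aggregate(SAMPLE_CSV))
rows = conn.execute(
    "SELECT namespace, cpu_usage FROM daily_summary ORDER BY namespace"
).fetchall()
```

Keeping aggregation and insertion as separate functions means each stage can be timed and tested on its own, which may help when comparing against the existing SQL path.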

This solution should have 1:1 parity with Trino's current aggregation functionality for OCP and OCP on AWS.
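
One way parity could be checked is by comparing the aggregated values from both paths key by key. The sketch below assumes each result set has already been loaded into a dict keyed by the grouping columns. The `parity_diff` helper, the key shape, and the sample values are hypothetical and only illustrate the idea.

```python
import math


def parity_diff(expected, actual, rel_tol=1e-9):
    """Return keys whose aggregated values differ between two result sets."""
    mismatches = {}
    for key in expected.keys() | actual.keys():
        exp, act = expected.get(key), actual.get(key)
        # A key missing on either side counts as a mismatch.
        if exp is None or act is None or not math.isclose(exp, act, rel_tol=rel_tol):
            mismatches[key] = (exp, act)
    return mismatches


# Hypothetical aggregates keyed by (date, namespace).
trino_result = {("2025-11-01", "app-a"): 2.0, ("2025-11-01", "app-b"): 2.0}
python_result = {("2025-11-01", "app-a"): 2.0, ("2025-11-01", "app-b"): 2.5}

diff = parity_diff(trino_result, python_result)
```

A relative tolerance is used here because floating-point sums can differ slightly depending on evaluation order. Whether exact equality is required instead is a decision for the PoC.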

The resulting PoC should include a benchmark report detailing the results for different payload sizes (1k, 10k, 100k, 500k, and 1M rows) generated with nise, evaluating memory usage and aggregation time for both OCP and OCP on AWS.
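
A benchmark harness along these lines might look like the sketch below. It is an assumption-laden illustration: the rows are synthetic rather than nise-generated, `make_rows`, `aggregate`, and `benchmark` are hypothetical helpers, and only the two smallest sizes are run so the example finishes quickly. Note that `tracemalloc` only tracks allocations made by Python itself and adds overhead to the timing, so a real report may prefer to measure time and memory in separate runs.

```python
import time
import tracemalloc
from collections import defaultdict


def make_rows(n):
    """Synthetic rows; the real benchmark would use nise-generated data."""
    return [(f"ns-{i % 10}", 1.0) for i in range(n)]


def aggregate(rows):
    """Sum usage per namespace (a stand-in for the real aggregation)."""
    totals = defaultdict(float)
    for namespace, usage in rows:
        totals[namespace] += usage
    return totals


def benchmark(n):
    """Return (seconds, peak_bytes) for aggregating n rows."""
    rows = make_rows(n)  # built before tracing so only aggregation is measured
    tracemalloc.start()
    start = time.perf_counter()
    aggregate(rows)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak


# The ticket lists sizes up to 1M rows; extend this tuple for a full run.
results = {n: benchmark(n) for n in (1_000, 10_000)}
```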

Deliverables include:

- GitHub repository with the PoC.
- Technical documentation, including the benefits and potential risks of adopting this solution.
- Benchmark results, as described above.

Assignee: Jordi Gil

Reporter: Jordi Gil

QA Contact: Chad Crum

Votes: 0

Watchers: 1

Created: 2025/11/22 4:10 PM

Updated: 2025/12/17 2:45 PM

Resolved: 2025/12/17 2:45 PM
