- Type: Epic
- Resolution: Done
- Priority: Major
User Story
As a user, I want data ingestion, the UI, and the APIs to run in a timely manner, so that I always have access to my data.
As dev/ops, I want to ensure we can scale our application to handle reports from many customers and that data is returned efficiently via the APIs.
Prioritization / Business Case
- We need to tackle scaling large amounts of data as the number of customers onboarding to cost management grows.
- We need to improve our ability to handle summarized data by partitioning it by time period, which will let us handle and present more data to users over time (a partitioning sketch follows this list).
- Keeping the raw data and daily data in the DB over time will lead to large costs $$$ to run cost management; moving this data to S3 is more affordable, is a strategy for data export, and potentially aligns with use in big data engines like Presto.
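A minimal sketch of the time-period partitioning idea above, assuming a PostgreSQL backend; the table name, columns, and connection string are illustrative placeholders, not the actual cost management schema:

```python
# Sketch only: a month-partitioned daily summary table in PostgreSQL.
# Table/column names and the DSN are hypothetical placeholders.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS reporting_daily_summary (
    usage_start date NOT NULL,
    cluster_id  text,
    cost        numeric(24, 6)
) PARTITION BY RANGE (usage_start);

-- One partition per month; new partitions get added as new periods arrive.
CREATE TABLE IF NOT EXISTS reporting_daily_summary_2020_01
    PARTITION OF reporting_daily_summary
    FOR VALUES FROM ('2020-01-01') TO ('2020-02-01');
"""

def create_partitioned_table(dsn: str) -> None:
    """Apply the DDL; queries filtered on usage_start prune to a single partition."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(DDL)

if __name__ == "__main__":
    create_partitioned_table("dbname=koku user=postgres host=localhost")
```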
General Idea
- Have our report processors download/stream files directly when processing, and switch the worker StatefulSet to a DeploymentConfig (a rough sketch of the streaming + Parquet flow follows this list)
- Table partitioning within existing schemas
- Big data processing setup (S3 bucket with Parquet [and CSV for export]) during ingestion
- Spike on running Presto (using what metering has done as a basis)
- After the spike, generate a plan for moving forward with the new architecture
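A rough sketch of the streaming + Parquet flow described above, assuming boto3 and pyarrow; the bucket names, object keys, and summarization step are placeholders rather than the project's actual processing code:

```python
# Sketch only: stream a report CSV straight from S3, convert it to Parquet,
# and write it back to a (hypothetical) big-data bucket for Presto / export.
import io

import boto3
import pyarrow.csv as pv
import pyarrow.parquet as pq

s3 = boto3.client("s3")

def process_report(source_bucket: str, key: str, parquet_bucket: str) -> None:
    # Stream the raw report directly instead of relying on local pod storage,
    # which is what lets the worker drop its StatefulSet requirement.
    body = s3.get_object(Bucket=source_bucket, Key=key)["Body"]
    table = pv.read_csv(io.BytesIO(body.read()))

    # ... daily summarization would happen here ...

    # Persist a Parquet copy alongside (or instead of) the raw CSV.
    buf = io.BytesIO()
    pq.write_table(table, buf)
    s3.put_object(
        Bucket=parquet_bucket,
        Key=key.replace(".csv", ".parquet"),
        Body=buf.getvalue(),
    )

if __name__ == "__main__":
    process_report("cost-usage-reports", "2020/01/report.csv", "cost-parquet-data")
```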
Impacts
- Data Backend
- Docs (if we enable data export)
Related Stories
https://issues.redhat.com/projects/COST/issues/COST-8
External Dependencies
- May be simpler to have the platform own the S3 buckets for cost purposes (we can create these with app-interface)
- May need more quota to run Presto successfully in OpenShift (CPU/memory per pod); see the query sketch after this list
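For the Presto spike, a minimal query sketch, assuming the presto-python-client package and a Hive catalog over the Parquet data; the host, catalog, schema, and table names are placeholders:

```python
# Sketch only: query the Parquet-backed cost data through Presto.
# Host, catalog, schema, and table names are hypothetical.
import prestodb

def monthly_cost_by_cluster() -> list:
    conn = prestodb.dbapi.connect(
        host="presto-coordinator",  # in-cluster service name (placeholder)
        port=8080,
        user="koku",
        catalog="hive",
        schema="cost",
    )
    cur = conn.cursor()
    cur.execute(
        """
        SELECT cluster_id,
               date_trunc('month', usage_start) AS month,
               sum(cost) AS total_cost
        FROM daily_summary
        GROUP BY cluster_id, date_trunc('month', usage_start)
        """
    )
    return cur.fetchall()

if __name__ == "__main__":
    for row in monthly_cost_by_cluster():
        print(row)
```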
UX Requirements
- Is completion of a design/mock a prerequisite to working this epic, or can portions be done concurrently?
- If we enable data export, application-level settings will need to be updated (this is likely just a switch).
UI Requirements
- Does the UI require an API contract with the backend so that the UI could be developed prior to completing the API work? No
Documentation Requirements
- What documentation is required to complete this epic? We will need docs only if we deliver data export.
Backend Requirements
- Are there any prerequisites required before working this epic? No
QE Requirements
- Does QE need specific data or tooling to successfully test this epic? No
Release Criteria
- Can this epic be released as individual issues/tasks are completed? Yes
- Can the backend be released without the frontend? Yes
- Has QE approved this epic?
- Is related to: COST-8 Partition Daily Tables for Long-Term Performance (Closed)