
    • Type: Epic
    • Resolution: Done
    • Priority: Major
    • Epic Name: Data Scaling
    • Status: Done
    • COST-10 - New data architecture that includes Data Hub as big data pipeline

      User Story

      As a user, I want data ingestion, the UI, and APIs to run in a timely manner so that I always have access to my data.

      As dev/ops, I want to ensure that we can scale our application to handle reports from many customers and that data is returned efficiently via APIs.

      Prioritization / Business Case

      • We need to tackle scaling large amounts of data as the number of customers on-boarding to cost management grows.
      • We need to improve our ability to handle summarized data by partitioning it by time period, which will enable us to handle and present more data to users over time.
      • Keeping the raw data and daily data in the database over time will lead to large costs ($$$) to run cost management; moving this data to S3 is more affordable, supports a data export strategy, and potentially aligns with use in big data engines like Presto.

      General Idea

      1. Have our report processors download/stream files directly during processing and switch the worker StatefulSet to a DeploymentConfig (see the streaming sketch below)
      2. Partition tables by time period within the existing schemas (see the partitioning sketch below)
      3. Set up big data processing (S3 bucket with Parquet, plus CSV for export) during ingestion (see the Parquet sketch below)
      4. Spike on running Presto, using what metering has done as a basis (see the Presto sketch below)
      5. After the spike, generate a plan for moving forward with the new architecture
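
      A minimal sketch of item 1: streaming a report file straight from S3 during processing so workers never need local disk, which is what lets the worker StatefulSet become a DeploymentConfig. Bucket and key names are hypothetical; assumes boto3.

        import csv

        import boto3

        s3 = boto3.client("s3")

        def process_report(bucket: str, key: str) -> int:
            """Stream a CSV report line by line and count its rows."""
            obj = s3.get_object(Bucket=bucket, Key=key)
            # iter_lines() yields bytes without buffering the whole file in memory.
            lines = (line.decode("utf-8") for line in obj["Body"].iter_lines())
            return sum(1 for _ in csv.DictReader(lines))

        # Usage with hypothetical names:
        # rows = process_report("cost-mgmt-reports", "org123/2020-06/report.csv")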
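
      A minimal sketch of the time-period partitioning in item 2, assuming PostgreSQL 10+ declarative range partitioning; the table, columns, and connection string are hypothetical placeholders.

        import psycopg2

        DDL = """
        CREATE TABLE daily_summary (
            usage_start date NOT NULL,
            cost        numeric
        ) PARTITION BY RANGE (usage_start);

        -- One partition per month; old months can later be detached and archived.
        CREATE TABLE daily_summary_2020_06 PARTITION OF daily_summary
            FOR VALUES FROM ('2020-06-01') TO ('2020-07-01');
        """

        # Placeholder DSN; real settings would come from the app config.
        with psycopg2.connect("dbname=cost_mgmt") as conn:
            with conn.cursor() as cur:
                cur.execute(DDL)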
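
      A minimal sketch of item 3: landing ingested data in S3 as Parquet (for big data engines) with a CSV copy alongside (for export). Paths are hypothetical; assumes pandas with pyarrow and s3fs installed.

        import pandas as pd

        df = pd.DataFrame({"usage_start": ["2020-06-01"], "cost": [12.34]})

        # Parquet is the columnar format engines like Presto read efficiently.
        df.to_parquet("s3://cost-mgmt-data/parquet/2020-06/data.parquet")

        # CSV copy kept alongside for the data export use case.
        df.to_csv("s3://cost-mgmt-data/export/2020-06/data.csv", index=False)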
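
      A minimal sketch of the Presto spike in item 4, querying the summarized data through the presto-python-client DBAPI. Host, catalog, schema, and table names are hypothetical.

        import prestodb

        conn = prestodb.dbapi.connect(
            host="presto-coordinator.example.svc",  # hypothetical service host
            port=8080,
            user="cost-mgmt",
            catalog="hive",
            schema="default",
        )
        cur = conn.cursor()
        cur.execute(
            "SELECT usage_start, sum(cost) FROM daily_summary GROUP BY usage_start"
        )
        for row in cur.fetchall():
            print(row)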

      Impacts

      • Data Backend
      • Docs (if we enable data export)

      Related Stories

      https://issues.redhat.com/projects/COST/issues/COST-8

      External Dependencies

      • It may be simpler to have the platform own the S3 buckets for cost purposes (we can create these with app-interface)
      • We may need more quota (CPU/memory per pod) to run Presto successfully in OpenShift

      UX Requirements

      • Is completion of a design/mock a prerequisite to working this epic, or can portions be done concurrently?
        If we enable data export, application-level settings will need to be updated (this is likely just a switch).

      UI Requirements

      • Does the UI require an API contract with the backend so that the UI could be developed prior to completing the API work? None

      Documentation Requirements

      • What documentation is required to complete this epic?
        We will need documentation only if we deliver data export.

      Backend Requirements

      • Are there any prerequisites required before working this epic? No

      QE Requirements

      • Does QE need specific data or tooling to successfully test this epic? No

      Release Criteria

      • Can this epic be released as individual issues/tasks are completed? Yes
      • Can the backend be released without the frontend? Yes
      • Has QE approved this epic?

            Assignee: Andrew Berglund (aberglun@redhat.com)
            Reporter: Andrew Berglund (aberglun@redhat.com)