Type: Task
Resolution: Done
Priority: Critical
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- outlier

Blocked:
False
Blocked Reason: None
Ready:
True
Intelligence Requested:
Market:

SFDC Cases Links:
SFDC Cases Open:
SFDC Cases Counter:

Context:
In some scenarios, our system fails to transmit billable usage data to marketplaces. These failures may arise from mismatches between the values sent during remittance and the corresponding product information in the marketplace, or from restrictions imposed by individual marketplaces.

The goal is to design and implement enhancements to the remittance process so that it handles these transmission failures robustly. The proposed solution is a mechanism that automatically resends the billable usage data until the transmission succeeds, so that usage is exchanged accurately between our system and the respective marketplaces.
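
The resend mechanism described above could be sketched roughly as follows. This is a minimal illustration, not the design from the linked doc: `TransientSendError`, `send_until_success`, and the attempt cap are hypothetical, and the ticket itself does not say whether retries are bounded or how they are spaced.

```python
import time
from typing import Callable


class TransientSendError(Exception):
    """Hypothetical error type for a failed marketplace transmission."""


def send_until_success(
    send: Callable[[], None],
    max_attempts: int = 5,
    base_delay_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it succeeds; return the number of attempts used.

    Re-raises the last TransientSendError once max_attempts is exhausted
    (the cap is an assumption so the sketch cannot loop forever).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send()
            return attempt
        except TransientSendError:
            if attempt == max_attempts:
                raise
            # Exponential backoff between attempts: 1s, 2s, 4s, ...
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```

A usage that still fails after the last attempt would be a natural candidate for a dead-letter queue, which is one of the open questions listed in this ticket.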

Design doc: https://docs.google.com/document/d/1liKSpUL1WIRO_MhUKA7OEmNCx4n8foBRmLQEvDpDz6I/edit#heading=h.11kdiw50yuu7
KStreams POC: https://github.com/RedHatInsights/rhsm-subscriptions/pull/3012

Done:

Summarized Remittance
- Ensure a single remittance per hour for each marketplace, customer, and product metric.
- Design should clearly outline changes in each affected service.

DLQ per marketplace or a single DLQ for all:
- We can use Azure DLQ as a baseline example.

Failure Identification and Recovery
- What type of alerting and dashboards should be created, and where (Splunk, Grafana, etc.)?
- In case of failure, how do we quickly identify what we remitted and what we didn't?
- API to reset the remittance pending value (existing API uses time range) - this design should consider how/if changes are needed in this API.
- Types of failures we need to recover from:
  - Contract ingestion
  - Reading from Prometheus
  - Processing tallies
  - Marketplace sending
  - ...
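
One way to answer "what did we remit and what didn't we" is to record the stage at which each usage failed. The sketch below is only an assumption about how such tracking might look: `Stage`, `UsageRecord`, and `split_by_outcome` are hypothetical names, and the stages simply mirror the failure types listed above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List, Optional, Tuple


class Stage(Enum):
    """Hypothetical failure stages, mirroring the list in this ticket."""
    CONTRACT_INGESTION = "contract_ingestion"
    PROMETHEUS_READ = "prometheus_read"
    TALLY_PROCESSING = "tally_processing"
    MARKETPLACE_SEND = "marketplace_send"


@dataclass(frozen=True)
class UsageRecord:
    usage_id: str
    failed_stage: Optional[Stage] = None  # None means remitted successfully


def split_by_outcome(
    records: Iterable[UsageRecord],
) -> Tuple[List[str], Dict[Stage, List[str]]]:
    """Return (remitted_ids, failed_ids_grouped_by_stage)."""
    remitted: List[str] = []
    failures: Dict[Stage, List[str]] = {}
    for record in records:
        if record.failed_stage is None:
            remitted.append(record.usage_id)
        else:
            failures.setdefault(record.failed_stage, []).append(record.usage_id)
    return remitted, failures
```

Grouping by stage in this way would also give a natural dimension for dashboards and alerts, whichever tool ends up hosting them.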

Recalculate Remittance:
- Since re-tally won't be possible in the future, this design doc should consider an API for "recalculate remittance".
- Explore alternative approaches:
  - Evaluate the feasibility of resending events.
  - Consider making necessary adjustments in related tables.
  - Explore the use of a flag to determine remittance failure and restart the process from that point.
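
To illustrate the flag-based option, the sketch below recomputes the pending value from a measured total and a list of remittance attempts, counting only those flagged as successful. It is an assumption about what "recalculate remittance" could mean, not the existing API: `recalculate_pending` and the `(value, succeeded)` shape are hypothetical.

```python
from typing import Iterable, Tuple


def recalculate_pending(
    measured_total: float,
    remittances: Iterable[Tuple[float, bool]],
) -> float:
    """Return the usage still owed to the marketplace.

    Each remittance is a (value, succeeded) pair; only successful ones
    count against the measured total. The result is clamped at zero.
    """
    remitted = sum(value for value, succeeded in remittances if succeeded)
    return max(measured_total - remitted, 0.0)
```

For example, with a measured total of 10.0 and attempts of 4.0 (succeeded), 3.0 (failed), and 2.0 (succeeded), the pending value would be 4.0.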

Aggregate Usages Per Hour
- AWS and Azure marketplaces allow only one usage record to be sent per resource per hour. Currently, retrying a billable usage fails because the current hour has already been billed.
- Design a way to extract the aggregation logic from swatch-producer-azure and reuse it for all marketplace producers.
- https://miro.com/app/board/uXjVNzq0xR4=/
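
A marketplace-agnostic hourly aggregation might look like the sketch below. It is not the logic in swatch-producer-azure; the `Usage` fields and the grouping key (marketplace, customer, product tag, metric, hour) are assumptions based on the "single remittance per hour" requirement in this ticket.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, Iterable, NamedTuple, Tuple


class Usage(NamedTuple):
    """Hypothetical billable usage record."""
    marketplace: str
    customer_id: str
    product_tag: str
    metric: str
    timestamp: datetime
    value: float


Key = Tuple[str, str, str, str, datetime]


def aggregate_per_hour(usages: Iterable[Usage]) -> Dict[Key, float]:
    """Sum usage values into one total per key and clock hour."""
    totals: Dict[Key, float] = defaultdict(float)
    for u in usages:
        # Truncate the timestamp to the start of its hour.
        hour = u.timestamp.replace(minute=0, second=0, microsecond=0)
        totals[(u.marketplace, u.customer_id, u.product_tag, u.metric, hour)] += u.value
    return dict(totals)
```

With totals keyed this way, a retry within the same hour can contribute to the existing hourly total rather than produce a second submission for an hour that has already been billed.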

Diagrams (miro/mermaid/plantuml):
- A miro board detailing the process from tally to marketplace.
- This needs to include failure cases as well as happy path so that we can analyze what happens when failures occur in each service.
- Please include a diagram that can live in the documentation/source code under https://github.com/RedHatInsights/rhsm-subscriptions/tree/main/docs in a form that doesn't need SVG generation every time. The actual check-in to the source repository will happen after approval of the design.

is related to
- SWATCH-2307 Fine Grained PAYG Errors (Backlog)
- SWATCH-2293 PAYG Monitoring & Alerting Improvements (In Progress)
- SWATCH-2284 Billable Usage Retries & Status Tracking Improvements (Closed)
- SWATCH-1964 Remove use of product name as an identifier. We should use product tag as an identifier instead. (Closed)

relates to
- SWATCH-2183 Design Billable Usage Aggregation Process (Closed)

Assignee: Kevin Howell
Reporter: Kartik Shah
Votes: 0
Watchers: 2
Created: 2024/02/02 1:01 AM
Updated: 2024/03/28 1:17 PM
Resolved: 2024/03/26 8:31 PM
