-
Epic
-
Resolution: Unresolved
-
subs-swatch-lightning
-
100% To Do, 0% In Progress, 0% Done
We need to support per-SLA and per-Usage over-usage detection and include these dimensions in the notification events sent to customers.
Right now, swatch-utilization only processes utilization summaries with _ANY SLA and _ANY Usage. This was an intentional short-term decision to avoid false positives caused by a capacity lookup bug (now fixed). However, the end goal is to detect over-usage at a more granular level – for example, telling a customer "your Premium/Production RHEL usage exceeds your subscribed capacity" rather than just "your RHEL usage exceeds capacity".
To make this work, two things need to happen:
1. Add sla and usage to the notification event context. The notification action built by CustomerOverUsageService.buildContext() currently includes only product_id and metric_id. We need to pass sla and usage as additional properties so the email template can reference them. For those properties to carry anything other than _ANY, we also need to remove the _ANY-only filter in UtilizationSummaryPayloadValidator and process all SLA/Usage combinations. Finally, we need to decide how to handle deduplication: a customer might get separate alerts for _ANY, Premium/Production, and Standard/Development-Test for the same product and metric, which could be noisy.
2. Update the email template to display SLA and Usage. The email template (managed by the notifications/docs team) needs to render the new sla and usage fields in a user-friendly way: for example, showing "Service Level: Premium" and "Usage: Production" when those values are not _ANY, and omitting them when the alert is for the aggregate _ANY view. This work involves the docs team since they own the email template content.
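The two items above could be sketched roughly as follows. This is a minimal illustration and not the actual swatch-utilization code: the class OverUsageContextSketch, the method signatures, the use of a plain Map for the context, and the displayLines helper are all assumptions made for the example. Only the property names (product_id, metric_id, sla, usage), the _ANY value, and the display labels come from this ticket.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OverUsageContextSketch {

    // Aggregate marker as written in this ticket.
    static final String ANY = "_ANY";

    // Hypothetical stand-in for CustomerOverUsageService.buildContext():
    // the existing product_id and metric_id plus the new sla and usage properties.
    static Map<String, String> buildContext(
            String productId, String metricId, String sla, String usage) {
        Map<String, String> context = new LinkedHashMap<>();
        context.put("product_id", productId);
        context.put("metric_id", metricId);
        context.put("sla", sla);
        context.put("usage", usage);
        return context;
    }

    // Hypothetical view of what the template would show: each dimension is
    // rendered only when it is set to something other than _ANY.
    static List<String> displayLines(Map<String, String> context) {
        List<String> lines = new ArrayList<>();
        String sla = context.get("sla");
        String usage = context.get("usage");
        if (sla != null && !ANY.equals(sla)) {
            lines.add("Service Level: " + sla);
        }
        if (usage != null && !ANY.equals(usage)) {
            lines.add("Usage: " + usage);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Product and metric values here are illustrative only.
        Map<String, String> specific = buildContext("RHEL", "Sockets", "Premium", "Production");
        Map<String, String> aggregate = buildContext("RHEL", "Sockets", ANY, ANY);
        System.out.println(displayLines(specific));
        System.out.println(displayLines(aggregate));
    }
}
```

Passing _ANY through unchanged, rather than dropping the keys, would let the template decide what to hide, but that is a design choice to confirm with the notifications/docs team. The sketch does not address deduplication, which is still an open question.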