Story
Resolution: Done
Major
Product / Portfolio Work
8
The Notifications service currently offers generic deduplication logic based on the identifier (UUID) of incoming events (received through Kafka or the gateway). The event UUIDs are stored for 24 hours in the "kafka_message" DB table. If the identifier of an incoming event is already known in the DB, the event is ignored entirely (neither stored in the DB nor processed) by the Notifications engine. The event UUIDs are automatically purged from the "kafka_message" table by a cronjob after a 24-hour delay.
We want to change this and make the deduplication logic customizable for each service (tenant) integrated with Notifications.
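As a point of reference, the current behavior can be sketched as a simple "seen UUID" check. This is a minimal in-memory sketch with hypothetical names; the real check runs against the "kafka_message" table and is bounded by the 24-hour retention, not an unbounded set.

```java
import java.util.HashSet;
import java.util.Set;

// Minimal sketch of the current UUID-based deduplication.
// Hypothetical class and method names, for illustration only.
public class UuidDeduplicator {

    private final Set<String> seenEventIds = new HashSet<>();

    /**
     * Returns true if the event should be processed, false if it is a
     * duplicate. Mirrors the current behavior: a known UUID means the
     * event is ignored entirely (not stored, not processed).
     */
    public boolean accept(String eventId) {
        // Set.add returns false when the id was already present.
        return seenEventIds.add(eventId);
    }
}
```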
Acceptance criteria:
- DO NOT make any changes to the "events" DB table (schema, retention policy, purge...).
- Make sure there's no cascade deletion from the "events" table to the "kafka_message" table, as the latter may have a longer retention delay.
- The current DB cleaner cronjob is enhanced or replaced to allow customizing the data retention policy in the "kafka_message" table. The current 24-hour policy should still be applied by default (no customization). Each tenant should be able to specify their own retention delay. We also need to allow a tenant to request a purge on a specific day of the month rather than the current rolling window.
- A new feature, similar to the aggregation logic, is introduced in notifications-engine to allow each tenant to provide us with programmatic logic that will determine whether two events should be considered equal from a deduplication perspective. This might be based on complex conditions involving multiple event fields and possibly time conditions. There's a significant chance we'll write the condition as a SQL query.
- Regardless of how exactly the custom deduplication conditions are written (Java, SQL...), they are stored in the notifications-backend repository similarly to the daily digest aggregation logic.
- The current "kafka_message" table is modified to allow storing a minimal JSON data structure which will contain all the data required to deduplicate events. This needs to be backward compatible with the current deduplication logic. By default, the JSON data structure will contain the event identifier that is currently stored as a String. From a schema perspective, any JSON data should be allowed. The table should also include a field that will help identify applications while their data is purged from the DB.
- SQL queries executed against the modified "kafka_message" table should leverage Postgres' ability to query JSON data if possible.
- Performance is critical, as this could slow down the whole event-processing chain in Notifications. Measure execution times and expose them as metrics. Make sure they are minimal. Create a new Grafana graph to monitor the execution times. Modify the indexing strategy on the "kafka_message" table if needed.
- This ticket will likely increase the amount of data stored in the DB, so we need to make sure it will not run out of space. Consider an increase of the DB storage if relevant.
- If Redis allows the exact same level of deduplication customization, it will likely replace the DB implementation eventually. We will start with the DB. We will migrate to Redis later, if possible. Keep that in mind while implementing the first version based on the DB.
- TO BE DETERMINED: What should we do when an event is identified as a duplicate of an existing one based on the new deduplication logic? Ignore it entirely (not saved in the DB), or save it in the DB but not process it? We could also make this decision customizable: keep the current behavior by default (ignore the event, do not save it) and allow tenants to request that duplicates are saved in the DB (which can help auditing through the Event Log).
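The customizable retention criterion could be modeled along these lines. This is a hedged sketch with hypothetical names: a per-tenant policy defaulting to the current 24-hour rolling window, with an optional day-of-month purge as an alternative.

```java
import java.time.Duration;
import java.time.LocalDate;

// Hypothetical per-tenant retention policy for the "kafka_message" table.
// The default mirrors the current 24-hour rolling window.
public class RetentionPolicy {

    private final Duration rollingWindow;   // null when purge is day-of-month based
    private final Integer purgeDayOfMonth;  // null when rolling-window based

    public static RetentionPolicy defaultPolicy() {
        return new RetentionPolicy(Duration.ofHours(24), null);
    }

    public static RetentionPolicy rolling(Duration window) {
        return new RetentionPolicy(window, null);
    }

    public static RetentionPolicy monthly(int dayOfMonth) {
        return new RetentionPolicy(null, dayOfMonth);
    }

    private RetentionPolicy(Duration rollingWindow, Integer purgeDayOfMonth) {
        this.rollingWindow = rollingWindow;
        this.purgeDayOfMonth = purgeDayOfMonth;
    }

    /** True if the cleaner should run a purge for this tenant today. */
    public boolean purgeDueOn(LocalDate today) {
        if (purgeDayOfMonth != null) {
            return today.getDayOfMonth() == purgeDayOfMonth;
        }
        // Rolling window: the cronjob purges expired rows on every run.
        return true;
    }

    public Duration window() {
        return rollingWindow;
    }
}
```

The enhanced cleaner cronjob would then look up each tenant's policy and fall back to `defaultPolicy()` when no customization exists.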
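If the tenant-provided condition ends up written in Java rather than SQL, one possible shape is a small functional interface over the JSON dedup payload. This is a sketch under assumptions: the payload is modeled as a `Map` standing in for the JSON structure stored in "kafka_message", and the field names (`event_id`, `host`) are purely illustrative.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical shape of a tenant-provided deduplication condition.
public interface DeduplicationCondition {

    /** Returns true if the two events are duplicates of each other. */
    boolean duplicates(Map<String, Object> a, Map<String, Object> b);

    // Default condition: backward compatible with the current logic,
    // comparing only the event identifier.
    DeduplicationCondition BY_EVENT_ID =
            (a, b) -> Objects.equals(a.get("event_id"), b.get("event_id"));

    // Example of a custom multi-field condition a tenant could provide.
    static DeduplicationCondition byFields(String... fields) {
        return (a, b) -> {
            for (String f : fields) {
                if (!Objects.equals(a.get(f), b.get(f))) {
                    return false;
                }
            }
            return true;
        };
    }
}
```

Time-based conditions would add a timestamp comparison on top of the field checks; a SQL-based condition would express the same predicate as a query against the JSON column instead.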
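For the performance criterion, the dedup check could be wrapped in a timer so execution times feed into metrics. This sketch only records raw nanosecond samples in a list; the real implementation would publish to the project's metrics registry (and from there to the Grafana graph) rather than keep samples in memory. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical timing wrapper around the deduplication check.
public class DedupTimer {

    private final List<Long> samplesNanos = new ArrayList<>();

    /** Runs the check, records its duration, and returns its result. */
    public <T> T timed(Supplier<T> check) {
        long start = System.nanoTime();
        try {
            return check.get();
        } finally {
            samplesNanos.add(System.nanoTime() - start);
        }
    }

    public int sampleCount() {
        return samplesNanos.size();
    }
}
```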
This ticket could be split into 3 subtasks implemented sequentially:
- Allow customizing the DB cleaner logic.
- Change the data structure in kafka_message (DB) while keeping backward compatibility.
- Allow customizing the deduplication logic.
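Given the possible later migration to Redis, it may help to put the dedup storage behind a small abstraction from the start, so the engine never depends on the Postgres implementation directly. This is a sketch with hypothetical names; the in-memory variant exists only for illustration, and the first real implementation would be backed by the "kafka_message" table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical storage abstraction so the DB-backed first version can
// later be swapped for Redis without touching the engine code.
public interface DeduplicationStore {

    /** Returns the previously stored dedup payload for this key, or null. */
    Map<String, Object> find(String tenant, String key);

    void save(String tenant, String key, Map<String, Object> payload);

    // In-memory stand-in, for illustration only.
    static DeduplicationStore inMemory() {
        Map<String, Map<String, Object>> data = new HashMap<>();
        return new DeduplicationStore() {
            public Map<String, Object> find(String tenant, String key) {
                return data.get(tenant + "/" + key);
            }

            public void save(String tenant, String key, Map<String, Object> payload) {
                data.put(tenant + "/" + key, payload);
            }
        };
    }
}
```

Whatever is decided for the open question above (ignore duplicates entirely vs. save but not process), that behavior would live in the engine on top of this interface, not inside the store itself.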
is depended on by: RHCLOUD-42459 [RFE] Condition-Based Notification Suppression (Closed)
is related to: RHCLOUD-43399 [IQE] Test the tenant-specific events deduplication (New)
mentioned on