Details
- Type: Task
- Resolution: Unresolved
- Priority: Major
Description
A while ago, we showed on the blog how to create an audit log based on change data events.
That example did not cover, however, how those enriched change events are propagated to a sink database or data warehouse (DWH), allowing one to analyse the full change history of, for instance, a given customer or purchase order. In this demo, we should:
- take those enriched change events and apply the flattening SMT to extract only the current row state ("after")
- use the SMT's ability to add additional attributes from the record to the emitted flat row, so that the user name etc. can be propagated
- always insert the records on the sink side instead of updating existing rows
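The flattening step could look roughly like the following in the source connector configuration. This is a sketch assuming Debezium's ExtractNewRecordState SMT; the concrete `add.fields` values are illustrative, not prescribed by this issue:

```properties
# Flatten Debezium change events to the current row state ("after")
transforms=unwrap
transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
# Propagate selected metadata attributes into the emitted flat row,
# e.g. the operation type and uniquely identifying source/transaction fields
transforms.unwrap.add.fields=op,source.ts_ms,transaction.id
```

The fields added this way are what would later form part of the sink-side primary key.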
On the sink side, the Apache Camel Kafka Connect JDBC sink connector should be used, or perhaps one of the sink connectors for Google BigQuery, Snowflake, or another DWH. The sink primary key would be derived from the actual table PK plus uniquely identifying attributes from the source block of the change events (which would have to be propagated to the emitted row using the flattening SMT), i.e. each event would have its own unique PK value. While each event should only ever be inserted once into the sink database, a second delivery can happen after a connector crash ("at least once" semantics), so upsert semantics should be applied to effectively ignore these duplicates.
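The dedup idea above can be sketched in isolation: each event carries a unique composite key (table PK plus transaction attributes propagated by the flattening SMT), and a re-delivered event is simply a no-op on insert. This minimal sketch uses SQLite's `INSERT OR IGNORE`; the table and column names are hypothetical, not taken from the demo:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer_audit (
        customer_id INTEGER,
        tx_id       INTEGER,   -- uniquely identifies the source change event
        user_name   TEXT,      -- enriched attribute propagated via the SMT
        email       TEXT,
        PRIMARY KEY (customer_id, tx_id)  -- table PK + event-identifying attribute
    )
""")

def apply_event(event):
    # Upsert-style insert that ignores duplicates: a second delivery of the
    # same event (same composite PK) is a no-op rather than a constraint error.
    conn.execute(
        "INSERT OR IGNORE INTO customer_audit VALUES (?, ?, ?, ?)",
        (event["customer_id"], event["tx_id"], event["user_name"], event["email"]),
    )

event = {"customer_id": 1, "tx_id": 100, "user_name": "alice", "email": "a@example.com"}
apply_event(event)
apply_event(event)  # duplicate delivery, e.g. after a connector crash

count = conn.execute("SELECT COUNT(*) FROM customer_audit").fetchone()[0]
print(count)  # 1: the duplicate was ignored; the history stays insert-only
```

A real sink connector would achieve the same effect with the upsert mode of the JDBC (or DWH) connector keyed on that composite PK.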