  Debezium / DBZ-2101

Demo: Keep *all* change events in a downstream database


Details

    • Type: Task
    • Resolution: Unresolved
    • Priority: Major
    • None
    • None
    • Labels: examples

    Description

      A while ago, we showed on the blog how to create an audit log based on change data events.

      That example didn't cover, though, how those enriched change events are propagated to a sink database or data warehouse (DWH), allowing one to analyse the full change history of a given customer or purchase order, for instance. In this demo, we should:

      • take those enriched change events and apply the flattening SMT to extract only the current row state (the "after" block); see the configuration sketch after this list
      • use that SMT's ability to add additional attributes from the record to the emitted flat row, so that the user name etc. can be propagated
      • always insert the records on the sink side instead of doing upserts
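
      As a rough sketch, the flattening step could look like the following source connector configuration fragment, assuming Debezium's ExtractNewRecordState SMT is used as the flattening SMT; the concrete fields to copy (operation type, source timestamp, transaction id, the propagated user name) are only illustrative, since their exact names depend on how the audit metadata was modelled in the blog post:

          transforms=unwrap
          transforms.unwrap.type=io.debezium.transforms.ExtractNewRecordState
          # Keep only the "after" row state and copy selected envelope/source
          # attributes into the flattened row; the field names below are
          # examples, not the definitive set used by the audit-log demo.
          transforms.unwrap.add.fields=op,source.ts_ms,source.txId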

      The Apache Camel Kafka Connect JDBC sink connector should be used on the sink side, or perhaps the sink connectors for Google BigQuery, Snowflake or another DWH. The sink primary key would be derived from the actual table PK plus uniquely identifying attributes from the source block of the change events (which would have to be propagated to the emitted row using the flattening SMT), i.e. each event would have its own unique PK value. While each event should only ever be inserted once into the sink database, it could arrive a second time after a connector crash ("at least once" semantics), so upsert semantics should be applied on that per-event key to effectively ignore these duplicates, as sketched below.
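
      To make the per-event key and the duplicate-ignoring upsert concrete, a sink configuration could look roughly like the sketch below. It uses the option names of the widely known Confluent JDBC sink connector purely for illustration (the Apache Camel Kafka Connect JDBC sink or a DWH-specific connector, as suggested above, would be configured differently); topic, connection and column names are made-up assumptions:

          connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
          topics=dbserver1.inventory.purchaseorders
          connection.url=jdbc:postgresql://sinkdb:5432/reporting
          # Upsert keyed on the original table PK plus the event-identifying
          # attributes added by the flattening SMT (which prefixes added fields
          # with "__" by default); a redelivered event after a crash simply
          # overwrites the identical row instead of creating a duplicate.
          insert.mode=upsert
          pk.mode=record_value
          pk.fields=id,__source_ts_ms,__source_txId
          auto.create=true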


          People

            Assignee: Unassigned
            Reporter: Gunnar Morling (gunnar.morling)
            Votes: 0
            Watchers: 1
