Debezium / DBZ-9078

Column post processor for TOAST backfill based on a local state store


    • Type: Feature Request
    • Resolution: Unresolved
    • Priority: Major
    • Backlog
    • core-library

      The existing re-select post-processor can help with backfilling missing TOAST column values, but it has some critical shortcomings:

      • It is prone to data races: if a TOAST column is updated between the time a change event without the TOAST value is created and the time the re-select query is issued, the wrong value is added to the enriched event.
      • It can put a high load on the source database when there are many updates to rows with unchanged TOAST columns, in particular because the SELECT isn't batched; every re-select is a single-key lookup.

      It would therefore be interesting to explore an alternative post-processor which manages the values to be backfilled in a local state store, for instance using RocksDB or SlateDB. This would ensure that the correct value is always backfilled (the previous value of that column for the given row), and it would avoid any performance impact on the database. The trade-off is that this implementation has its own persistent state, which requires management. A remotely backed store like SlateDB would avoid keeping that state locally.
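
      To make the idea concrete, here is a minimal sketch of the backfill logic, not a proposed implementation. It assumes events are plain maps rather than Debezium's Struct values, uses an in-memory HashMap as a stand-in for RocksDB or SlateDB, and assumes the connector's default placeholder string for unavailable TOAST values. The class name ToastBackfillSketch and the methods apply and forget are hypothetical and are not part of any Debezium API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class ToastBackfillSketch {

    // Assumption: the connector's default placeholder for an unchanged TOAST column
    // (configurable via "unavailable.value.placeholder").
    static final String PLACEHOLDER = "__debezium_unavailable_value";

    // Columns to track; assumed to be configured by the user.
    private final Set<String> toastColumns;

    // Hypothetical stand-in for RocksDB/SlateDB: primary key -> (column -> last seen value).
    private final Map<String, Map<String, Object>> store = new HashMap<>();

    public ToastBackfillSketch(Set<String> toastColumns) {
        this.toastColumns = toastColumns;
    }

    /**
     * Returns a copy of the row's "after" state in which placeholders are replaced
     * by the last value seen for that row and column. Real values update the store.
     */
    public Map<String, Object> apply(String primaryKey, Map<String, Object> after) {
        Map<String, Object> enriched = new HashMap<>(after);
        Map<String, Object> known = store.computeIfAbsent(primaryKey, k -> new HashMap<>());
        for (String column : toastColumns) {
            if (!after.containsKey(column)) {
                continue;
            }
            Object value = after.get(column);
            if (PLACEHOLDER.equals(value)) {
                // Backfill only if a previous value is known; on a cold start
                // the placeholder is left untouched.
                if (known.containsKey(column)) {
                    enriched.put(column, known.get(column));
                }
            } else {
                // The event carries the real value: remember it for later events.
                known.put(column, value);
            }
        }
        return enriched;
    }

    /** Drops all state for a row, e.g. when a delete event is seen. */
    public void forget(String primaryKey) {
        store.remove(primaryKey);
    }

    public static void main(String[] args) {
        ToastBackfillSketch processor = new ToastBackfillSketch(Set.of("body"));
        // Insert: the TOAST column is present, so its value is recorded.
        processor.apply("42", Map.of("id", 42, "title", "v1", "body", "large text"));
        // Update of another column: the TOAST column arrives as a placeholder.
        Map<String, Object> out =
                processor.apply("42", Map.of("id", 42, "title", "v2", "body", PLACEHOLDER));
        System.out.println(out.get("body")); // prints "large text"
    }
}
```

      Because the store is only updated from events in stream order, the backfilled value is always the one that preceded the current event, which is what avoids the race of the re-select approach. A real implementation would also need to persist the store consistently with the connector's offsets and decide how to handle rows whose value was never observed.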

              Assignee: Unassigned
              Reporter: Gunnar Morling (gunnar.morling)
              Votes: 0
              Watchers: 3