Debezium / DBZ-2525

Generify exclusion of columns from snapshotting


    • Type: Enhancement
    • Resolution: Done
    • Priority: Major
    • 1.7.0.CR1
    • None
    • core-library
    • None
    • False
    • False
    • Undefined

      When starting up, connectors can optionally take an initial snapshot to capture the current state of the database before transitioning to log-reading mode. Such a snapshot is a plain SELECT * FROM ... for each of the captured tables.

      Most connectors currently select all columns, even when some columns of a table are excluded by the column include/exclude filter. The values of those columns are then discarded right away when the Kafka record for the snapshot change event is created. This causes unnecessary overhead (load on the database, network traffic), for instance when large BLOB columns are filtered out. Ideally, filtered columns wouldn't be part of the snapshot SELECT in the first place.

      The only connector that currently applies this optimization is the Debezium SQL Server connector, which excludes any columns filtered out via column.exclude.list. This logic should be pulled up into the relational base classes so that all relational connectors benefit from it.
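
To make the proposal concrete, here is a minimal sketch of building the snapshot SELECT from only the columns that pass the filter. It is not Debezium's actual implementation: the class SnapshotSelectSketch, the method buildSnapshotSelect, and the use of a plain Predicate over fully-qualified column names are all assumptions made for illustration. Identifier quoting and dialect-specific syntax are left out.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Illustrative only: names and structure are hypothetical, not Debezium's API.
class SnapshotSelectSketch {

    /**
     * Builds a snapshot query that lists only the columns accepted by the filter,
     * instead of issuing SELECT * and discarding values afterwards.
     *
     * @param table      table identifier, e.g. "inventory.customers"
     * @param columns    all column names of the table
     * @param isIncluded filter evaluated on "<table>.<column>" (assumed naming scheme)
     */
    public static String buildSnapshotSelect(String table,
                                             List<String> columns,
                                             Predicate<String> isIncluded) {
        List<String> selected = columns.stream()
                .filter(column -> isIncluded.test(table + "." + column))
                .collect(Collectors.toList());

        if (selected.isEmpty()) {
            // A SELECT with no columns is not valid SQL; fail early in this sketch.
            throw new IllegalArgumentException("All columns of " + table + " are filtered out");
        }
        return "SELECT " + String.join(", ", selected) + " FROM " + table;
    }

    public static void main(String[] args) {
        // Stand-in for a column exclude filter that drops a large BLOB column.
        Predicate<String> filter = name -> !name.equals("inventory.customers.photo");

        String sql = buildSnapshotSelect(
                "inventory.customers", List.of("id", "name", "photo"), filter);

        System.out.println(sql); // prints "SELECT id, name FROM inventory.customers"
    }
}
```

Placing a helper like this in a shared base class would mean each connector only has to supply its table metadata and column filter, and the excluded BLOB column is never read from the database.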

              anmohant Anisha Mohanty
              gunnar.morling Gunnar Morling
