Debezium / DBZ-2456

Allow specifying a subset of captured tables to be snapshotted




      The available PostgreSQL snapshot mechanisms, albeit powerful, do not cover the following two scenarios:

      • A Postgres connector has three whitelisted tables: "tableA, tableB, tableC". If, after creating the connector, we need to snapshot only tableA and then continue streaming, we have to spin up a new connector.
      • A Postgres connector has two whitelisted tables: "tableA, tableB". There is no provision to add a new table so that only that table is snapshotted, and there is no way to merge connectors, so adding another table to the setup means spinning up another connector.

      In my case, the database I am listening to has more than 200 tables. I have already deployed more than 10 connectors, and maintaining all of them is a hassle. I am proposing a snapshot mechanism that is selective in nature. The tables to be snapshotted are specified in the config with the key:

      "snapshot.selective.tables": "public.tableA,public.tableB"

      and snapshot.mode is set to selective. Effectively, only the subset of tables specified in the above configuration is considered for the snapshot and locked, after which streaming resumes. This would make the connector easier to maintain and provide more flexibility.
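To make the proposal concrete, here is a minimal Python sketch of how a connector configuration with the proposed settings might look and how the snapshot set could be derived from it. The `snapshot.selective.tables` key and the `selective` mode are the proposal from this ticket, not existing Debezium options. The helper `selective_snapshot_tables` is a hypothetical name used for illustration only, and rejecting tables that are not whitelisted is an assumption based on the ticket title ("subset of captured tables").

```python
def _split_tables(value):
    """Split a comma-separated table list into trimmed, non-empty names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def selective_snapshot_tables(config):
    """Return the tables the proposed 'selective' snapshot would cover.

    Hypothetical helper: it only illustrates the semantics proposed in
    this ticket and is not part of Debezium.
    """
    captured = _split_tables(config.get("table.whitelist", ""))
    selected = _split_tables(config.get("snapshot.selective.tables", ""))

    # Assumption: the selective list must be a subset of the captured tables.
    unknown = [name for name in selected if name not in captured]
    if unknown:
        raise ValueError(f"tables not captured by this connector: {unknown}")
    return selected


# Example configuration. The last two entries are the proposed settings;
# the first two are existing Debezium connector properties.
connector_config = {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "table.whitelist": "public.tableA,public.tableB,public.tableC",
    "snapshot.mode": "selective",
    "snapshot.selective.tables": "public.tableA,public.tableB",
}

print(selective_snapshot_tables(connector_config))
```

In this sketch all three tables stay whitelisted for streaming, while only `public.tableA` and `public.tableB` would be locked and snapshotted.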




            Unassigned
            Kaushik Iyer (kaushikiyer, Inactive)