Debezium / DBZ-5727

Columns are not excluded when doing incremental snapshots


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version: 2.1.0.Alpha1
    • Component: core-library

      What Debezium connector do you use and what version?

      1.9.6.FINAL

      What is the connector configuration?

       

      {
          "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
          "incremental.snapshot.chunk.size": "5120",
          "slot.name": "slot_name",
          "tasks.max": "1",
          "publication.name": "publication_name",
          "value.converter.enhanced.avro.schema.support": "true",
          "value.converter.basic.auth.credentials.source": "USER_INFO",
          "signal.data.collection": "public.debezium_signals",
          "value.converter": "io.confluent.connect.avro.AvroConverter",
          "key.converter": "io.confluent.connect.avro.AvroConverter",
          "publication.autocreate.mode": "disabled",
          "database.dbname": "dbname",
          "database.user": "username",
          "database.server.name": "servername",
          "key.converter.enhanced.avro.schema.support": "true",
          "database.port": "5432",
          "plugin.name": "pgoutput",
          "key.converter.basic.auth.user.info": "auth user info",
          "value.converter.schema.registry.url": "registry url",
          "column.exclude.list": "public.table.column_a,public.table.column_b",
          "value.converter.basic.auth.user.info": "auth info",
          "database.hostname": "ip",
          "database.password": "db_password",
          "name": "connector-name",
          "table.include.list": "public.table,public.debezium_signals",
          "key.converter.schema.registry.url": "url",
          "key.converter.basic.auth.credentials.source": "USER_INFO",
          "snapshot.mode": "never"
      }
      

       

      What is the captured database version and mode of deployment?

      PostgreSQL 13 managed by GCP

      What behaviour do you expect?

      I expect the columns specified in `column.exclude.list` to be excluded from the SELECT query that the incremental (ad-hoc) snapshotter runs. Those columns are huge, and it is very costly to fetch that data when it is not needed (it won't be included in the message written to Kafka anyway).

      As a workaround, I would like to be able to provide a custom query for the incremental snapshotter to run.

      What behaviour do you see?

      The property `column.exclude.list` is ignored by the incremental snapshotter. Instead, it selects all columns with `SELECT *`.

      There is no method to provide a custom query as far as I can see in the code (https://github.com/debezium/debezium/blob/e5463b81fa0c8151821db2b62d3e9826575cb3e0/debezium-core/src/main/java/io/debezium/jdbc/JdbcConnection.java#L1482-L1499)
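
      To illustrate the difference between the observed `SELECT *` and the expected projection, here is a minimal sketch in Java. It is not Debezium's implementation: the class `ChunkQuerySketch` and its methods are hypothetical names made up for this example, and it assumes each `column.exclude.list` entry is a regular expression matched against the whole `schema.table.column` name. A real chunk query would also carry key-range conditions, ordering and a limit, which are left out here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only; not part of Debezium.
public class ChunkQuerySketch {

    /** Compiles a comma-separated exclude list into one pattern per entry. */
    static List<Pattern> compileExcludeList(String excludeList) {
        List<Pattern> patterns = new ArrayList<>();
        for (String entry : excludeList.split(",")) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                patterns.add(Pattern.compile(trimmed, Pattern.CASE_INSENSITIVE));
            }
        }
        return patterns;
    }

    /** Wraps an identifier in double quotes, doubling any embedded quote. */
    static String quote(String identifier) {
        return "\"" + identifier.replace("\"", "\"\"") + "\"";
    }

    /**
     * Builds a SELECT that lists only the columns not matched by the exclude
     * patterns. Falls back to "*" if every column would be excluded, since an
     * empty projection is not valid SQL (a design choice of this sketch).
     */
    static String buildSelect(String schema, String table,
                              List<String> columns, List<Pattern> excludes) {
        List<String> kept = columns.stream()
                .filter(col -> {
                    String qualified = schema + "." + table + "." + col;
                    return excludes.stream()
                            .noneMatch(p -> p.matcher(qualified).matches());
                })
                .map(ChunkQuerySketch::quote)
                .collect(Collectors.toList());
        String projection = kept.isEmpty() ? "*" : String.join(", ", kept);
        return "SELECT " + projection + " FROM " + quote(schema) + "." + quote(table);
    }

    public static void main(String[] args) {
        List<Pattern> excludes =
                compileExcludeList("public.table.column_a,public.table.column_b");
        // The column names "id" and "name" are made up for this example.
        System.out.println(buildSelect("public", "table",
                List.of("id", "name", "column_a", "column_b"), excludes));
    }
}
```

      With the `column.exclude.list` value from the configuration above, the sketch leaves `column_a` and `column_b` out of the projection, so the large values would never be read from the database.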

      Do you see the same behaviour using the latest released Debezium version?

      Unfortunately I haven't tested it, but I don't see any changes to the code in that area.

      Do you have the connector logs, ideally from start till finish?

      I don't think logs would help in this case

              Assignee: Unassigned
              Reporter: Enzo Cappa (enzo.cappa, inactive)