  1. Debezium
  2. DBZ-5727

Columns are not excluded when doing incremental snapshots

Details

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version: 2.1.0.Alpha1
    • Component: core-library

    Description

      What Debezium connector do you use and what version?

      1.9.6.Final

      What is the connector configuration?

      {
          "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
          "incremental.snapshot.chunk.size": "5120",
          "slot.name": "slot_name",
          "tasks.max": "1",
          "publication.name": "publication_name",
          "value.converter.enhanced.avro.schema.support": "true",
          "value.converter.basic.auth.credentials.source": "USER_INFO",
          "signal.data.collection": "public.debezium_signals",
          "value.converter": "io.confluent.connect.avro.AvroConverter",
          "key.converter": "io.confluent.connect.avro.AvroConverter",
          "publication.autocreate.mode": "disabled",
          "database.dbname": "dbname",
          "database.user": "username",
          "database.server.name": "servername",
          "key.converter.enhanced.avro.schema.support": "true",
          "database.port": "5432",
          "plugin.name": "pgoutput",
          "key.converter.basic.auth.user.info": "auth user info",
          "value.converter.schema.registry.url": "registry url",
          "column.exclude.list": "public.table.column_a,public.table.column_b",
          "value.converter.basic.auth.user.info": "auth info",
          "database.hostname": "ip",
          "database.password": "db_password",
          "name": "connector-name",
          "table.include.list": "public.table,public.debezium_signals",
          "key.converter.schema.registry.url": "url",
          "key.converter.basic.auth.credentials.source": "USER_INFO",
          "snapshot.mode": "never"
      }

      What is the captured database version and mode of deployment?

      PostgreSQL 13 managed by GCP

      What behaviour do you expect?

      I expect the columns specified in `column.exclude.list` to be excluded from the select query that the incremental (ad-hoc) snapshotter runs. Those columns are huge, and it is very costly to fetch that data when it's not really needed (it won't be included in the message written to Kafka anyway).
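
      To illustrate the expected behaviour, here is a minimal sketch of how the projection could be derived from `column.exclude.list`. It is not Debezium's code: the function names, the sample column list (`id`, `name`) and the identifier quoting are assumptions made for the example. Each list entry is treated as a regular expression that must match the whole fully-qualified column name.

```python
import re


def quote_ident(name):
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def projected_columns(table, columns, exclude_list):
    """Return the columns of `table` that no exclude pattern matches.

    `table` is a schema-qualified name such as "public.table";
    `exclude_list` is the raw comma-separated `column.exclude.list` value.
    """
    patterns = [re.compile(p.strip()) for p in exclude_list.split(",") if p.strip()]
    return [
        c for c in columns
        if not any(p.fullmatch(f"{table}.{c}") for p in patterns)
    ]


def select_statement(table, columns, exclude_list):
    """Build a SELECT that names only the columns that are not excluded."""
    kept = projected_columns(table, columns, exclude_list)
    if not kept:
        raise ValueError("every column of the table is excluded")
    schema, name = table.split(".", 1)
    projection = ", ".join(quote_ident(c) for c in kept)
    return f"SELECT {projection} FROM {quote_ident(schema)}.{quote_ident(name)}"


# Hypothetical table layout; only column_a and column_b come from the config above.
print(select_statement(
    "public.table",
    ["id", "column_a", "column_b", "name"],
    "public.table.column_a,public.table.column_b",
))
```

      With the exclude list from the configuration above, the generated statement names only `id` and `name`, so the large columns are never read from the database.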

      As a workaround, I would like to be able to provide a custom query for the incremental snapshotter to run.

      What behaviour do you see?

      The property `column.exclude.list` is ignored by the incremental snapshotter. It instead selects all columns with `SELECT *`.

      There is no method to provide a custom query as far as I can see in the code (https://github.com/debezium/debezium/blob/e5463b81fa0c8151821db2b62d3e9826575cb3e0/debezium-core/src/main/java/io/debezium/jdbc/JdbcConnection.java#L1482-L1499)
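
      The kind of hook I have in mind could look roughly like the sketch below. It is purely illustrative and does not mirror the linked Java method: `chunk_query` and its parameters are hypothetical, a single-column key is assumed, and identifiers are expected to be quoted by the caller already. The point is only that the projection becomes an argument instead of a hard-coded `*`.

```python
def chunk_query(table, key_column, chunk_size, projection="*", after_key=False):
    """Build the SELECT for one snapshot chunk, ordered by the key column.

    `projection` defaults to "*" (the behaviour described in this report);
    a caller could pass an explicit column list instead. When `after_key`
    is true, a `key > ?` condition is added so that the chunk starts after
    the last key that was already read.
    """
    sql = f"SELECT {projection} FROM {table}"
    if after_key:
        sql += f" WHERE {key_column} > ?"
    sql += f" ORDER BY {key_column} LIMIT {int(chunk_size)}"
    return sql


# First chunk with every column, then a follow-up chunk with a narrowed projection.
print(chunk_query("public.t", "id", 5120))
print(chunk_query("public.t", "id", 5120, projection="id, name", after_key=True))
```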

      Do you see the same behaviour using the latest released Debezium version?

      I have not tried it, unfortunately, but I don't see any changes to the code in that area.

      Do you have the connector logs, ideally from start till finish?

      I don't think logs would help in this case

    People

      Assignee: Unassigned
      Reporter: Enzo Cappa (enzo.cappa)
      Votes: 0
      Watchers: 4