Debezium / DBZ-7778

Unexpected table switch during incremental snapshot

    Description

      What Debezium connector do you use and what version?

      Postgres Connector provided with debezium/connect:2.6.0.Final

      What is the connector configuration?

      {
          "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
          "database.user": "debezium",
          "database.dbname": "app",
          "slot.name": "debezium_slot",
          "publication.name": "cdc_publication",
          "database.server.name": "<server>",
          "plugin.name": "pgoutput",
          "database.port": "6432",
          "topic.prefix": "<server>",
          "database.hostname": "<master fqdn>",
          "database.password": "<password>",
          "name": "<server>",
          "incremental.snapshot.chunk.size": "65000",
          "batch.size": "16777216",
          "max.batch.size": "15728640",
          "max.queue.size": "1073741824",
          "max.queue.size.in.bytes": "4294967296",
          "linger.ms": "5000",
          "buffer.memory": "2147483648",
          "producer.override.max.request.size": "20971520",
          "snapshot.mode": "never",
          "signal.data.collection": "cdc.debezium_signal",
          "signal.enabled.channels": "source,kafka",
          "signal.kafka.bootstrap.servers": "cdc-debezium-kafka-brokers.infra-cdc-debezium:9092",
          "signal.kafka.topic": "debezium.signals",
          "signal.consumer.sasl.jaas.config": "<sasl_jaas_conf>",
          "signal.consumer.sasl.mechanism": "SCRAM-SHA-512",
          "signal.consumer.security.protocol": "SASL_PLAINTEXT",
          "heartbeat.interval.ms": "60000",
          "topic.creation.default.partitions": "6",
          "topic.creation.default.replication.factor": "3",
          "heartbeat.action.query": "update cdc.debezium_heartbeat set last_heartbeat_ts = NOW() where 1=1;",
          "database.initial.statements":"update cdc.debezium_heartbeat set last_heartbeat_ts = NOW() where 1=1;",
          "transforms": "unwrap",
          "transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
          "transforms.unwrap.delete.handling.mode": "rewrite",
          "transforms.unwrap.add.fields": "op,table,source.ts_ms",
          "event.processing.failure.handling.mode": "warn"
      } 

       

       

      What is the captured database version and mode of deployment?

      Yandex Cloud PostgreSQL 14

      What behaviour do you expect?

      1. I start an incremental snapshot on 2 or more tables with a signal
      2. Debezium picks one of the tables, X
      3. Debezium snapshots table X
      4. The snapshot of table X completes
      5. The next table is chosen; go to step 2
      6. Finish
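Step 1 above can be sketched as follows for the source signal channel. This is a minimal illustration, not the exact statement used in this report: it assumes the signal table has the `id`, `type`, `data` columns described in the Debezium signalling documentation, and `public.other_table` is a hypothetical second table name.

```python
import json
import uuid


def build_incremental_snapshot_signal(tables):
    """Return (id, type, data) for one Debezium execute-snapshot signal row."""
    if not tables:
        raise ValueError("at least one table is required")
    data = {"data-collections": list(tables), "type": "incremental"}
    return str(uuid.uuid4()), "execute-snapshot", json.dumps(data)


# Parameterized INSERT for the table named in signal.data.collection.
# Column names are an assumption taken from the Debezium documentation.
SIGNAL_INSERT_SQL = (
    "INSERT INTO cdc.debezium_signal (id, type, data) VALUES (%s, %s, %s)"
)

signal_id, signal_type, signal_data = build_incremental_snapshot_signal(
    ["public.changelogs", "public.other_table"]  # second name is hypothetical
)
```

The three returned values would be bound to `SIGNAL_INSERT_SQL` with whatever PostgreSQL driver is in use; the sketch deliberately does not open a connection.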

      What behaviour do you see?

      Option 1 (seen on both PostgreSQL and MySQL, on large databases):

      1. I start an incremental snapshot on 2 or more tables with a signal
      2. Infinite loop:
        1. Debezium picks one of the tables, X
        2. Debezium snapshots table X
        3. Another table is chosen without the snapshot of X completing; go to step 2

      Option 2 (seen only on PostgreSQL):

      1. I start an incremental snapshot on 2 or more tables with a signal
      2. Debezium picks one of the tables, X
      3. Debezium snapshots table X
      4. The snapshot of table X does NOT complete
      5. The next table is chosen, but the offsets are not cleared
      6. Infinite loop:
        1. Failure because of an incorrect type in the SELECT prepared statement

          (The screenshot is from a different connector; the only differences are the buffer sizes and the incremental snapshot size.)
      Caused by: io.debezium.DebeziumException: Database error while executing incremental snapshot for table 'DataCollection{id=public.changelogs, additionalCondition=, surrogateKey=}'  

      If you get stuck in option 2, the only thing you can do is re-create the connector under another name: when you modify the connector offsets in Kafka, the connector does not take the change into account and crashes right after start.

      My current workaround: snapshot no more than one table at a time.
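The one-table-at-a-time approach can be sketched as below. It is only an illustration under assumptions: each payload is the `data` value of a separate `execute-snapshot` signal, `public.other_table` is a hypothetical table name, and how the operator confirms that the previous table has finished is left open.

```python
import json


def one_signal_per_table(tables):
    """Yield one execute-snapshot payload per table instead of one for all."""
    for table in tables:
        yield json.dumps({"data-collections": [table], "type": "incremental"})


# Each payload lists exactly one table, so Debezium never has a second
# table queued to switch to while the first is still being snapshotted.
payloads = list(
    one_signal_per_table(["public.changelogs", "public.other_table"])
)
```

Each payload would be sent only after the snapshot of the previous table is known to have completed.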

      Do you see the same behaviour using the latest released Debezium version?

      Yes. Unfortunately it is not easy to reproduce, so we mostly tested on 2.6.0.Final; 2.5.3.Final seems to be affected too. The bug likely appeared after 2.4.X or 2.5.X.

      I did not encounter this problem on versions up to 2.3.0.Final.

      Do you have the connector logs, ideally from start till finish?

      (You might be asked later to provide DEBUG/TRACE level log)

      Unfortunately no, I have limited view-only access to the logs due to company policies. We tried to set the DEBUG level for the connector through the API, but it had no effect; maybe we did something wrong.

      In the INFO-level log I can't find anything unusual or strange before "io.debezium.DebeziumException: Database error while executing incremental snapshot for table"
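For reference, raising the log level through the Kafka Connect REST API might look like the sketch below. It assumes the worker exposes the `PUT /admin/loggers/{name}` endpoint; the worker URL is hypothetical, and the request is only built, not sent. On the Kafka Connect versions I am aware of, this change applies to the single worker that receives the request, which could be one reason an attempt appears to have no effect.

```python
import json
import urllib.request


def build_log_level_request(connect_url, logger="io.debezium", level="DEBUG"):
    """Build (but do not send) a PUT /admin/loggers/<logger> request."""
    body = json.dumps({"level": level}).encode("utf-8")
    return urllib.request.Request(
        url=f"{connect_url.rstrip('/')}/admin/loggers/{logger}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical worker address; replace with the real Connect REST endpoint.
request = build_log_level_request("http://localhost:8083")
# urllib.request.urlopen(request) would send it; repeat for every worker.
```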

      How to reproduce the issue using our tutorial deployment?

      Unfortunately, I can't reproduce it on synthetic data. Option 1 was reproduced on a large database with 4.8 billion rows. Option 2 reproduces randomly on some of our databases (109 Debezium-connected PostgreSQL databases in total).

          People

            Unassigned Unassigned
            samplec0de Andrei Moskalev