  Debezium / DBZ-2405

Snapshot misses trailing records if connector is deleted immediately after streaming begins


Details

    • Type: Bug
    • Resolution: Unresolved
    • Priority: Major
    • Fix Version/s: None
    • Affects Version/s: None
    • Component/s: postgresql-connector
    • Labels: None

    Description

      I have automated the process of doing targeted snapshots from Postgres into Kafka, and I immediately delete the snapshot connector once the slot begins streaming. This is the best workaround I know of, given that Debezium does not support ad-hoc snapshots of existing tables on an existing connector.
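
      Roughly, the automation does the following (a simplified sketch, not my exact script; the connector name, hosts, and credentials are placeholders):

{code:python}
import time
import requests

CONNECT_URL = "http://localhost:8083"   # Kafka Connect REST endpoint (placeholder)
CONNECTOR = "foo-snapshot"              # placeholder connector name

# 1. Register a one-off snapshot connector for the target table.
requests.post(f"{CONNECT_URL}/connectors", json={
    "name": CONNECTOR,
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres01",      # placeholder
        "database.port": "5432",
        "database.user": "debezium",
        "database.password": "********",
        "database.dbname": "foo_prod",
        "database.server.name": "foo_prod",
        "table.include.list": "public.foo",     # "table.whitelist" on older versions
        "slot.name": "foo_snapshot_slot",
        "snapshot.mode": "initial",
    },
}).raise_for_status()

# 2. Wait for the task to report RUNNING; in practice I also watch the logs
#    for "Obtained valid replication slot", i.e. the snapshot has finished
#    and the slot has begun streaming.
while True:
    status = requests.get(f"{CONNECT_URL}/connectors/{CONNECTOR}/status").json()
    tasks = status.get("tasks", [])
    if tasks and all(t["state"] == "RUNNING" for t in tasks):
        break
    time.sleep(1)

# 3. Delete the connector as soon as streaming begins -- the step after
#    which trailing snapshot records turn out to be missing.
requests.delete(f"{CONNECT_URL}/connectors/{CONNECTOR}").raise_for_status()
{code}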

      In any case, here is what happens.  The logs show that 7298 records were exported:

      Aug  5 15:17:10 debezium-postgres01 docker-compose[22699]: connect_2  | 2020-08-05 20:17:10,275 INFO   Postgres|foo_prod|postgres-connector-task   Finished exporting 7298 records for table 'foo'; total duration '00:00:02.176'   [io.debezium.relational.RelationalSnapshotChangeEventSource]

      Immediately after this, the snapshot completed:

      Aug  5 15:17:10 debezium-postgres01 docker-compose[22699]: connect_2  | 2020-08-05 20:17:10,287 INFO   Postgres|foo_prod|postgres-connector-task   Snapshot ended with SnapshotResult [status=COMPLETED, offset=PostgresOffsetContext [sourceInfo=source_info[server='foo_prod', db='foo_prod', lsn=EF4E/17DF4580, txId=36207810729, timestamp=2020-08-05T20:17:10.282Z, snapshot=FALSE, schema=logical_ticker, table=tick], partition={server=foo_prod}, lastSnapshotRecord=true]]   [io.debezium.pipeline.ChangeEventSourceCoordinator]

      The slot then started streaming (the log shows "Obtained valid replication slot ReplicationSlot"), and I dropped the connector at 15:17:14.

      However, only about 6144 records were actually written to Kafka. I verified this by running KSQL queries and searching for specific keys: the records that made it to the topic are interspersed with missing ones.
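
      An equivalent count check without KSQL, using the confluent-kafka Python client (a sketch; topic name and brokers are placeholders):

{code:python}
from confluent_kafka import Consumer, TopicPartition

TOPIC = "foo_prod.public.foo"   # placeholder topic name
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # placeholder
    "group.id": "snapshot-count-check",
})

# Sum (high - low) watermark offsets across partitions to count the
# records actually present in the topic.
meta = consumer.list_topics(TOPIC, timeout=10)
total = 0
for partition_id in meta.topics[TOPIC].partitions:
    low, high = consumer.get_watermark_offsets(
        TopicPartition(TOPIC, partition_id), timeout=10)
    total += high - low
consumer.close()

print(f"{TOPIC}: {total} records")   # ~6144 here, vs. the 7298 exported
{code}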

      Please let me know if you would like me to provide additional logs.

      Perhaps this is expected behavior if offsets were not committed prior to the drop?
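
      Note that the Connect worker only flushes source offsets periodically (offset.flush.interval.ms, 60 seconds by default), so a drop a few seconds after snapshot completion could plausibly land before the first flush. One way to test this hypothesis would be to read the worker's offsets topic just before dropping the connector (a sketch; "connect-offsets" is the worker default, configurable via offset.storage.topic, and the internal topics use JSON keys/values by default):

{code:python}
import json
from confluent_kafka import Consumer

OFFSETS_TOPIC = "connect-offsets"   # worker default; see offset.storage.topic

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",   # placeholder
    "group.id": "offset-inspect",
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,
})
consumer.subscribe([OFFSETS_TOPIC])

# Keys are JSON arrays like ["<connector-name>", {"server": "foo_prod"}];
# values carry the last committed source offset (lsn, txId, ...). If no
# entry exists for the snapshot connector, its position was never flushed.
while True:
    msg = consumer.poll(5.0)
    if msg is None:
        break
    if msg.error():
        continue
    key = json.loads(msg.key()) if msg.key() else None
    value = json.loads(msg.value()) if msg.value() else None
    print(key, "->", value)

consumer.close()
{code}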


          People

            Assignee: Unassigned
            Reporter: Jeremy Finzel (jfinzel)
            Votes: 0
            Watchers: 2
