[DBZ-1766] Postgres Connector losing data on restart due to commit() being called before events produced to Kafka

      We've run into an issue where events are lost when replicating from Postgres to Kafka while Kafka Connect restarts. This happens because a `commit()` call is made before the events returned from `poll()` are actually produced to Kafka. When Kafka Connect is terminated before some of those events are produced, on restart it picks up from the committed offset, which leads to data loss.

      The underlying out-of-order callback behavior is tracked as an open issue against Kafka Connect: https://issues.apache.org/jira/browse/KAFKA-5716
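To make the hazard concrete, here is a minimal, self-contained sketch of the ordering problem (all names are hypothetical; this is not the real Kafka Connect or Debezium API). The point is that if `commit()` flushes the LSN of the newest *polled* record rather than the newest *acknowledged* record, a crash between the two loses the in-flight events:

```java
public class LostEventDemo {
    static long confirmedLsn = 0;           // slot position Postgres replays from on restart

    // Buggy strategy: on commit(), flush the LSN of the newest polled record,
    // even though some of those records may not have been produced yet.
    static void buggyCommit(long lastPolledLsn) { confirmedLsn = lastPolledLsn; }

    // Safe strategy: flush only the newest LSN whose record Kafka has acknowledged.
    static void safeCommit(long lastAckedLsn) { confirmedLsn = lastAckedLsn; }

    public static void main(String[] args) {
        long[] polledLsns = {10, 20, 30};   // change events handed to Connect by poll()
        long lastAcked = 10;                // only the first has been produced so far

        buggyCommit(polledLsns[2]);
        // Connect is killed here: events 20 and 30 were never produced,
        // but Postgres will replay only from LSN 30 -> they are lost.
        System.out.println("buggy restart point: " + confirmedLsn);

        safeCommit(lastAcked);
        System.out.println("safe restart point: " + confirmedLsn);
    }
}
```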

      Here's a repo with steps to reproduce the issue: https://github.com/mmarvick-convoy/debezium-restart-issue

      Note that the repro steps use a custom build of Debezium with some additional logging in the commit and poll callbacks. You can edit the Kafka Connect Dockerfile to use the latest stable Debezium 1.0 without the custom build and still reproduce the issue. We've also reproduced it on Debezium 0.9.5, so it's not a recent regression; given the underlying Kafka Connect issue, it has probably existed from the beginning.


            Jiri Pechanec added a comment - Released

            Gunnar Morling added a comment:

            > I think this probably impacts SQL Server and DB2 as well, but I'd prefer one of you add the tag (those ones are out of my area of knowledge)

            Thanks for pointing this out; actually it's not needed there, as for SQL Server and Db2 eventually nothing gets committed/acknowledged with the source database (unlike PG and Oracle). I'll look into removing that superfluous code for these two connectors.

            Michael Marvick (Inactive) added a comment:

            One more thing that we're still investigating: it looks like when Debezium starts up, it first tries to use the offsets that Kafka Connect stored, and then falls back to the LSN flushed to Postgres. We think the Kafka Connect offsets stored in Kafka might also be committed before the records are produced to the topic. We're not 100% sure yet, but if that turns out to be true, then either (a) Kafka Connect might need to be patched, or (b) Debezium might be better off using the offset stored in the Postgres replication slot rather than the one stored by Kafka Connect.

            Michael Marvick (Inactive) added a comment (edited):

            I think this probably impacts SQL Server and DB2 as well, but I'd prefer one of you add the tag (those ones are out of my area of knowledge).

            SQL Server: https://github.com/debezium/debezium/blob/master/debezium-connector-sqlserver/src/main/java/io/debezium/connector/sqlserver/SqlServerConnectorTask.java#L149-L161
            DB2: https://github.com/debezium/debezium-incubator/blob/master/debezium-connector-db2/src/main/java/io/debezium/connector/db2/Db2ConnectorTask.java#L171-L183

            Michael Marvick (Inactive) added a comment (edited):

            Our strategy has been to set a variable for the last processed LSN in `commitRecord`, but to not actually flush the LSN until `commit`. We made a fix against our internal fork of 0.9.5; it passed our repro steps and is working well in prod so far. We'd be happy to put up a PR against master for Postgres with that approach (EDIT: it's here). Here's our fork's PR for reference: https://github.com/convoyinc/debezium/pull/17

            In 0.9.5, the source record doesn't keep track of the last fully processed LSN, just the LSN for the record itself, so we have to do a little more work: track both the previous and next LSN, and only commit the previous one when the next one differs (indicating a whole batch was processed). In the latest code, the source record includes the last fully processed LSN, so this should be easier to patch in master.
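The strategy described above can be sketched roughly as follows (hypothetical names, not Debezium's actual classes): `commitRecord` only remembers the LSN of each record Kafka has acknowledged, and `commit` is what actually flushes that LSN to the replication slot, so the slot position can never move past an unproduced event:

```java
import java.util.concurrent.atomic.AtomicLong;

public class LsnFlushingTask {
    private final AtomicLong lastAckedLsn = new AtomicLong(-1);
    long flushedLsn = -1;   // stands in for the replication slot position

    // Called by Connect after each record is acknowledged by the producer:
    // remember the LSN, but do not flush yet.
    public void commitRecord(long recordLsn) {
        lastAckedLsn.set(recordLsn);
    }

    // Called periodically by Connect; a safe point to flush, because every
    // LSN recorded so far belongs to a record already produced to Kafka.
    public void commit() {
        long lsn = lastAckedLsn.get();
        if (lsn >= 0) {
            flushedLsn = lsn;   // e.g. flush this LSN to the replication slot
        }
    }
}
```

If the task crashes between `commitRecord` and `commit`, the worst case is re-delivering already-produced events, not losing unproduced ones.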

            Jiri Pechanec added a comment:

            Reading the KIP, it seems that the easiest fix is to use `commitRecord` instead of `commit`. The question is the performance implication of flushing for every record, so it might make sense to flush only once per a configurable number of records.
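One way to read this suggestion (a sketch under assumptions, not Debezium's code) is to flush from `commitRecord` itself, but only once every N acknowledged records to bound the per-record flush cost:

```java
public class BatchedFlushTask {
    private final int flushEvery;       // configurable batch size
    private int ackedSinceFlush = 0;
    long flushedLsn = -1;               // stands in for the replication slot position

    public BatchedFlushTask(int flushEvery) {
        this.flushEvery = flushEvery;
    }

    // Called after each record is acknowledged; flushes only every N records.
    public void commitRecord(long recordLsn) {
        if (++ackedSinceFlush >= flushEvery) {
            flushedLsn = recordLsn;     // e.g. flush this LSN to the slot here
            ackedSinceFlush = 0;
        }
    }
}
```

Since `commitRecord` only fires for acknowledged records, every flushed LSN is safe; the trade-off is that up to N-1 acknowledged records may be re-delivered after a crash.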

            Gunnar Morling added a comment - This also affects the Oracle connector in all likelihood.

              Assignee: Unassigned
              Reporter: mmarvick-convoy (Michael Marvick)
              Votes: 0
              Watchers: 5