Debezium / DBZ-8594

Data loss when primary key update is last operation in a transaction


      What behavior do you expect?

      We should not lose/drop records in this scenario.

      What behavior do you see?

      There is a race condition in vitess-connector's (and possibly other connectors') implementation of `StreamingChangeEventSource`. It occurs only when the last operation in a transaction is a primary key update, which results in multiple events: delete, tombstone, create. For it to trigger, the last operation of transaction `N` must be a primary key update, and the connector must set its offset context to transaction `N` before publishing the events associated with that update. The connector is then shut down precisely after it has published the delete event and committed its offset (for transaction `N`, since this is the last operation in the transaction) but before it has sent the create event. When the connector resumes, it uses its last stored offset (for transaction `N`), which source databases often treat as an exclusive start (e.g., Vitess, MySQL), so the first event it receives is for `N+1`. Transaction `N` is therefore not retransmitted, and the create event is lost.
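      To make the sequence concrete, here is a minimal, self-contained simulation of the scenario. It is only a sketch: `PkUpdateOffsetRace`, `Op`, `stream`, and `crashAndResume` are hypothetical names, not Debezium or Vitess APIs. The "safe" mode shows one possible mitigation (advance the offset to `N` only on the final record of the transaction's last operation) and is an assumption, not necessarily what the actual fix does.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simulation of the race; none of these names are Debezium APIs.
final class PkUpdateOffsetRace {

    /** One row operation in the source log, tagged with its transaction number. */
    static final class Op {
        final int txn;
        final String kind; // "INSERT" or "PK_UPDATE"
        Op(int txn, String kind) { this.txn = txn; this.kind = kind; }
    }

    /** A primary key update fans out into three records; an insert into one. */
    static List<String> expand(Op op) {
        return op.kind.equals("PK_UPDATE")
                ? List.of("delete", "tombstone", "create")
                : List.of("create");
    }

    /**
     * Streams every transaction strictly after startOffset (exclusive start),
     * stops once crashAfter records have been published, and returns the
     * offset committed at that point.
     * safe = false: every record of a transaction's last operation carries offset N.
     * safe = true : only the final record of that operation carries offset N.
     */
    static int stream(List<Op> log, int startOffset, int crashAfter,
                      boolean safe, List<String> sink) {
        int committed = startOffset;
        int published = 0;
        for (int i = 0; i < log.size(); i++) {
            Op op = log.get(i);
            if (op.txn <= startOffset) continue; // already covered by the stored offset
            boolean lastOpInTxn = i == log.size() - 1 || log.get(i + 1).txn != op.txn;
            List<String> records = expand(op);
            for (int r = 0; r < records.size(); r++) {
                if (published == crashAfter) return committed; // simulated shutdown
                boolean lastRecord = r == records.size() - 1;
                sink.add("txn" + op.txn + ":" + records.get(r));
                published++;
                if (lastOpInTxn && (lastRecord || !safe)) committed = op.txn;
            }
        }
        return committed;
    }

    /** Crashes right after the delete of txn 2 is published, then resumes. */
    static List<String> crashAndResume(boolean safe) {
        List<Op> log = List.of(
                new Op(1, "INSERT"), new Op(2, "PK_UPDATE"), new Op(3, "INSERT"));
        List<String> sink = new ArrayList<>();
        int stored = stream(log, 0, 2, safe, sink);         // txn1:create, txn2:delete, crash
        stream(log, stored, Integer.MAX_VALUE, safe, sink); // restart from the stored offset
        return sink;
    }

    public static void main(String[] args) {
        System.out.println("racy: " + crashAndResume(false));
        System.out.println("safe: " + crashAndResume(true));
    }
}
```

      In the racy run the stored offset is already `2` when the connector stops, so the restart begins at transaction 3 and `txn2:create` never reaches the sink. In the safe run the stored offset is still `1`, so transaction 2 is replayed in full; the delete is published twice, which is consistent with at-least-once delivery.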

      I will put out a PR for the fix in vitess-connector.

              Assignee: Unassigned
              Reporter: Thomas Thornton (tthorn)