-
Bug
-
Resolution: Done
-
Major
-
1.7.1.Final
Debezium may miss some changes to a table if all of the following hold:
- the change is made in a TX that was opened before the connector is started
- the change itself is made before the connector is started
- the TX is committed only after the connector has started its initial consistent snapshot phase
What goes wrong:
Debezium/LogMiner only mines SCNs >= the SCN at which the initial consistent snapshot phase operates. Let's call that SCN "Ts".
If you open a transaction before the connector is started, and commit it after the snapshot phase has begun, the transaction will contain a series of SCNs that look something like this:
- Ts - 10 (INSERT prior to start of connector)
- Ts - 9 (UPDATE prior to start of connector)
- Ts + 5 (INSERT after the connector has started snapshotting)
- Ts + 10 (TX COMMIT)
Because the connector in streaming mode disregards all SCNs < Ts, its view of the transaction consists only of the SCNs Ts + 5 and Ts + 10. The changes made at Ts - 10 and Ts - 9 take effect only with the COMMIT, so they were still uncommitted at Ts and are not part of the snapshot either. They are never captured by the connector, and hence never emitted to Kafka.
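The effect of the cutoff can be reproduced with a small simulation. This is only an illustrative sketch, not Debezium code: `RedoEvent`, `mined_events` and the value `TS = 100` are hypothetical names and numbers chosen to mirror the example SCNs above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedoEvent:
    scn: int
    operation: str


def mined_events(events, snapshot_scn):
    """Keep only events at or after the snapshot SCN (the cutoff described above)."""
    return [e for e in events if e.scn >= snapshot_scn]


TS = 100  # hypothetical snapshot SCN

# One transaction that straddles the snapshot SCN.
transaction = [
    RedoEvent(TS - 10, "INSERT"),
    RedoEvent(TS - 9, "UPDATE"),
    RedoEvent(TS + 5, "INSERT"),
    RedoEvent(TS + 10, "COMMIT"),
]

seen = mined_events(transaction, TS)
missed = [e for e in transaction if e not in seen]

for e in missed:
    print(f"lost: SCN {e.scn} {e.operation}")
```

Here `seen` holds only the events at Ts + 5 and Ts + 10, while `missed` holds the INSERT and UPDATE made before the connector started.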
For transactions that straddle the snapshot->streaming switchover, the connector should mine back to the beginning of the TX; the snapshot SCN is not the right cutoff point there.
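One way to express the suggested behaviour is sketched below. It is a sketch under assumptions, not the actual fix: `mining_start_scn` and `should_emit` are hypothetical helpers, and how the start SCNs of the transactions open at snapshot time are obtained is left out. The sketch also assumes that a TX committed at or before Ts is already fully reflected in the snapshot, so only TXs committed after Ts are emitted, which avoids duplicates when mining starts earlier than Ts.

```python
def mining_start_scn(snapshot_scn, open_tx_start_scns):
    """Earliest SCN to mine from.

    This is the snapshot SCN, or the start SCN of the oldest transaction
    that was still open when the snapshot was taken, whichever is lower.
    """
    return min([snapshot_scn, *open_tx_start_scns])


def should_emit(commit_scn, snapshot_scn):
    """Emit a transaction only if it committed after the snapshot SCN.

    Assumption: transactions committed at or before the snapshot SCN are
    already contained in the snapshot and must not be emitted again.
    """
    return commit_scn > snapshot_scn


TS = 100  # hypothetical snapshot SCN

# The straddling TX from the example started at Ts - 10 and committed at Ts + 10.
start = mining_start_scn(TS, [TS - 10])
emit = should_emit(TS + 10, TS)
print(f"mine from SCN {start}, emit straddling TX: {emit}")
```

With this rule the cutoff is applied per transaction at its commit SCN rather than per change, so the early INSERT and UPDATE of a straddling TX are mined along with the rest of it.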