- Bug
- Resolution: Done
- Blocker
- 0.2
- None
The MySQL connector records, in the offset and source fields of each change event, several pieces of data about the position in the binlog where the corresponding binlog event appeared. This includes the binlog filename and position and, when the server is using GTIDs, the GTID (in the form of the updated GTID set) of the transaction in which the corresponding binlog event occurred. As the connector works, it periodically records these offsets so that upon startup the last offset can be used to position the connector's binlog client at the correct location.
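For illustration, an offset of this shape might be built up roughly like the following sketch. The key names ("file", "pos", "gtids") are just illustrative placeholders, not necessarily the connector's actual keys:
{code:java}
import java.util.HashMap;
import java.util.Map;

public class OffsetSketch {
    // Builds the kind of offset map described above; key names are illustrative.
    public static Map<String, Object> offsetFor(String binlogFilename, long binlogPosition, String gtidSet) {
        Map<String, Object> offset = new HashMap<>();
        offset.put("file", binlogFilename); // binlog filename where the event appeared
        offset.put("pos", binlogPosition);  // position of the event within that file
        if (gtidSet != null) {
            // Only present when the server uses GTIDs: the updated GTID set for
            // the transaction containing this event.
            offset.put("gtids", gtidSet);
        }
        return offset;
    }
}
{code}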
When the connector is not using GTIDs, upon startup the last offset's binlog filename and position specify the location of the next binlog event the connector should start with. However, when the connector is using GTIDs, the last offset's GTID set is sent to the MySQL server as if the connector had already seen all of those GTIDs.
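Roughly speaking, the two startup modes position the binlog client differently. The following is only a sketch, assuming the kind of API exposed by the binlog client library the connector uses (BinaryLogClient); it is not the connector's actual startup code:
{code:java}
import com.github.shyiko.mysql.binlog.BinaryLogClient;

public class StartupPositionSketch {
    // Positions the client from the last recorded offset.
    public static void position(BinaryLogClient client, String file, long pos, String gtidSet) {
        if (gtidSet != null) {
            // GTID mode: the server skips every transaction whose GTID is in this
            // set, i.e. the whole set is treated as fully processed.
            client.setGtidSet(gtidSet);
        } else {
            // Non-GTID mode: resume at an explicit binlog filename and position.
            client.setBinlogFilename(file);
            client.setBinlogPosition(pos);
        }
    }
}
{code}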
The problem is that each GTID identifies a transaction that may have multiple binlog events, and currently all of the resulting change events have an offset that includes this transaction's GTID. That means that if Kafka Connect were to record as its last offset the offset from any of these change events except the last one in the transaction, and the connector were to crash before processing the remaining events in that transaction, then when the connector restarts, the GTID of this incompletely processed transaction would be included in the GTID set as if it had been fully processed. In other words, the connector could miss some binlog events upon startup.
So, the connector instead needs to record a GTID in the offset's GTID set only after all of the binlog events for that transaction have been processed. And, because each GTID transaction can involve multiple binlog events, we probably also need to record the binlog event number in the offset (only when using GTIDs) so that we know which binlog event each change event corresponds to.
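A minimal sketch of that idea, with illustrative names rather than the connector's actual offset-handling code: the transaction's GTID is merged into the recorded GTID set only when the transaction completes, and each event's offset carries its index within the transaction:
{code:java}
import java.util.HashMap;
import java.util.Map;

public class TransactionAwareOffsets {
    private String completedGtidSet = "";  // GTIDs of fully processed transactions only
    private String currentTxGtid;          // GTID of the transaction currently being read
    private int eventIndex;                // index of the next event within that transaction

    public void beginTransaction(String gtid) {
        currentTxGtid = gtid;
        eventIndex = 0;
    }

    // Offset recorded with each change event: note that the in-progress
    // transaction's GTID is NOT yet part of the GTID set.
    public Map<String, Object> offsetForNextEvent(String file, long pos) {
        Map<String, Object> offset = new HashMap<>();
        offset.put("file", file);
        offset.put("pos", pos);
        offset.put("gtids", completedGtidSet);
        offset.put("event", eventIndex++);
        return offset;
    }

    // Called only after the last binlog event of the transaction has been processed.
    public void completeTransaction() {
        completedGtidSet = completedGtidSet.isEmpty()
                ? currentTxGtid
                : completedGtidSet + "," + currentTxGtid;
        currentTxGtid = null;
    }
}
{code}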
UPDATE: This is actually not just a problem when using GTIDs. The connector can run into problems when restarting using only the binlog filename and position to identify the starting location, because the connector (including the binlog client library) can't simply restart with a binlog event that is in the middle of a transaction. When it does, the connector skips over the TABLE_MAP event for the affected table, so the binlog client library doesn't know the structure of the table and thus cannot deserialize the rows in insert, update, or delete binlog events.
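One possible way to combine this with the event-number idea above, sketched here with hypothetical names against the binlog client's event types: resume from the start of the interrupted transaction (so the TABLE_MAP events are re-read) and simply drop the first N events that already produced change events before the crash:
{code:java}
import com.github.shyiko.mysql.binlog.event.Event;
import com.github.shyiko.mysql.binlog.event.EventType;

public class RestartSkipSketch {
    private long eventsToSkip; // taken from the hypothetical "event" field of the offset

    public RestartSkipSketch(long eventsToSkip) {
        this.eventsToSkip = eventsToSkip;
    }

    // Decide whether a binlog event read after restart should produce a change event.
    public boolean shouldEmit(Event event) {
        EventType type = event.getHeader().getEventType();
        if (type == EventType.TABLE_MAP) {
            // Never emitted as a change event, but it must still be read so the
            // client knows the table structure and can deserialize the row events.
            return false;
        }
        if (eventsToSkip > 0) {
            eventsToSkip--; // this event already produced a change event before the crash
            return false;
        }
        return true;
    }
}
{code}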