- Bug
- Resolution: Done
- Critical
- 0.4.1
- None
We are using the Debezium MySQL connector as described here:
https://wecode.wepay.com/posts/streaming-databases-in-realtime-with-mysql-debezium-kafka
We've hit an interesting edge case. When following the steps in the "Adding new databases" section, we discovered that one of our connectors would not come up against the MySQL server that had the new database loaded on it.
The problem appears to be that one of our connectors follows a very low-volume DB (it gets updates only every day or two). This is an issue because offset (GTID) commits in Kafka Connect (0.10.2.0) only appear to be triggered when the connector receives new messages. If the connector sits idle for two days with no new messages, no new GTID commits occur.
When we execute the RESET MASTER and SET GLOBAL GTID_PURGED commands (shown in the blog post), the binlog on the new MySQL server is truncated. When we then try to move the low-volume connector over, it fails, because it tries to connect with a GTID from two days ago, which of course isn't on the new MySQL server because its binlog has just been purged. This problem doesn't exist for higher-volume connectors because they commit frequently (every minute or so), so if we wait a few minutes before moving them to the new MySQL machine, the latest committed GTID is in the new machine's binlog.
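The failure condition above can be checked before moving a connector. The following is a minimal sketch, assuming the usual MySQL GTID replication rule: a client can resume only if every transaction the server has purged is already contained in the client's GTID set. The function names and the sample UUID are hypothetical, and the GTID strings would have to be read from the connector's committed offset and from the new server's gtid_purged variable.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-5:7,uuid2:3-4' into {uuid: set of transaction numbers}.

    Sketch only: expanding ranges into Python sets is fine for small
    examples but wasteful for very large transaction ranges.
    """
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        uuid, *intervals = part.split(":")
        txns = result.setdefault(uuid.lower(), set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            txns.update(range(int(start), int(end or start) + 1))
    return result


def is_subset(inner, outer):
    """True if every transaction in `inner` also appears in `outer`."""
    return all(txns <= outer.get(uuid, set()) for uuid, txns in inner.items())


def can_resume(connector_gtids, server_purged):
    """True if the server has purged nothing the connector has not seen."""
    return is_subset(parse_gtid_set(server_purged),
                     parse_gtid_set(connector_gtids))


# Hypothetical values: the low-volume connector last committed two days ago.
UUID = "3e11fa47-71ca-11e1-9e33-c80aa9429562"
stale_offset = f"{UUID}:1-100"
fresh_offset = f"{UUID}:1-250"
purged_on_new_server = f"{UUID}:1-250"

print(can_resume(stale_offset, purged_on_new_server))
print(can_resume(fresh_offset, purged_on_new_server))
```

With these sample values the stale offset fails the check and the fresh one passes, which matches the difference we see between the low-volume and high-volume connectors. MySQL's built-in GTID_SUBSET() function performs the same comparison on the server side.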
An ideal solution would be to have Debezium (DBZ) or Kafka Connect (KC) periodically commit the most recent GTID that the connector has seen, whether or not any new messages were produced. I briefly looked at the latest KC docs, and I don't see any such config.
I'm wondering if this issue is happening because we're using the GTID server UUID whitelist and table blacklists. Perhaps the interaction between Debezium and Kafka Connect is getting messed up, such that offsets aren't getting committed. Debezium is receiving new GTID updates, just not from anything that passes the server UUID and table filters.
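To make the hypothesis concrete, here is a toy model of the suspected behavior. It is not Debezium's or Kafka Connect's actual code: the class, its fields, and the commit_on_idle flag are all hypothetical, and a single GTID string stands in for the full GTID set. It assumes that an offset can only be committed when it rides on an emitted record, so events dropped by the filters never advance the committed offset.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class ToyConnector:
    """Hypothetical model of a filtering connector with periodic commits."""
    allowed_uuids: Set[str]
    commit_on_idle: bool = False      # the proposed fix, modeled as a flag
    last_seen: Optional[str] = None   # newest GTID read from the binlog
    pending: Optional[str] = None     # offset carried by the last emitted record
    committed: Optional[str] = None   # offset stored by the commit tick

    def on_event(self, server_uuid: str, gtid: str) -> None:
        self.last_seen = gtid
        if server_uuid in self.allowed_uuids:
            # Only events that pass the filter produce a record with an offset.
            self.pending = gtid

    def on_commit_tick(self) -> None:
        if self.commit_on_idle:
            # Proposed behavior: always commit the newest GTID seen.
            self.committed = self.last_seen
        elif self.pending is not None:
            # Suspected current behavior: commit only what records carried.
            self.committed = self.pending


events = [("aaa", "aaa:1"), ("bbb", "bbb:1"), ("bbb", "bbb:2")]

current = ToyConnector(allowed_uuids={"aaa"})
proposed = ToyConnector(allowed_uuids={"aaa"}, commit_on_idle=True)
for connector in (current, proposed):
    for server_uuid, gtid in events:
        connector.on_event(server_uuid, gtid)
    connector.on_commit_tick()

print(current.committed, proposed.committed)
```

In this model the connector that commits only emitted records stays at the last matching GTID while the binlog moves on, whereas the variant that commits on every tick keeps up with what it has actually read. Whether the real connector behaves like the first variant is exactly what we are unsure about.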
Lastly, I can't be certain, but I don't believe we saw this issue with Kafka Connect 0.10.0.1, before we upgraded to 0.10.2.0. I'm not 100% sure, but perhaps something changed in Kafka Connect's behavior?