-
Bug
-
Resolution: Done
-
Major
-
None
-
None
-
False
-
None
-
False
Bug report
We got the following in one of our SQLServer connectors:
java.lang.IllegalStateException: Detected changed end offset of database history topic (previous: 8238, current: 8245). Make sure that the same history topic isn't shared by multiple connector instances. at io.debezium.relational.history.KafkaDatabaseHistory.getEndOffsetOfDbHistoryTopic(KafkaDatabaseHistory.java:369) at io.debezium.relational.history.KafkaDatabaseHistory.recoverRecords(KafkaDatabaseHistory.java:312) at io.debezium.relational.history.AbstractDatabaseHistory.recover(AbstractDatabaseHistory.java:112) at io.debezium.relational.history.DatabaseHistory.recover(DatabaseHistory.java:163) at io.debezium.relational.HistorizedRelationalDatabaseSchema.recover(HistorizedRelationalDatabaseSchema.java:62) at io.debezium.connector.sqlserver.SqlServerConnectorTask.start(SqlServerConnectorTask.java:87) at io.debezium.connector.common.BaseSourceTask.start(BaseSourceTask.java:130) at io.debezium.connector.common.BaseSourceTask.startIfNeededAndPossible(BaseSourceTask.java:207) at io.debezium.connector.common.BaseSourceTask.poll(BaseSourceTask.java:148) at org.apache.kafka.connect.runtime.WorkerSourceTask.poll(WorkerSourceTask.java:291) at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:248) at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:186) at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:241) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829)
We use unique names for all our dbhistory topics. But in this case the connector has several tasks and we can restart one of them while other tasks are running.
What Debezium connector do you use and what version?
SQLServer, 1.9.5
What is the connector configuration?
class: io.debezium.connector.sqlserver.SqlServerConnector config: database.history.kafka.recovery.attempts: 8 database.history.kafka.recovery.poll.interval.ms: 10000 database.history.kafka.topic: ***.dbhistory database.names: *long list of databases* include.schema.changes: "false" max.iteration.transactions: 10000 schema.name.adjustment.mode: none ... pause: false tasksMax: 3
What behaviour do you expect?
Recover procedure should finish correctly without any issue.
What behaviour do you see?
Exception is thrown with confusing message.
Do you see the same behaviour using the latest relesead Debezium version?
The code is still there.
How to reproduce the issue using our tutorial deployment?
Recover procedure should be rather slow (maybe with some pauses or big enough number of messages to process). At the same time the other tasks should produce message to dbhistory topic.
Implementation ideas (optional)
As the first step I suggest to get rid of this check:
// The end offset should never change during recovery; doing this check here just as - a rather weak - attempt // to spot other connectors that share the same history topic accidentally if (previousEndOffset != null && !previousEndOffset.equals(endOffset)) { throw new IllegalStateException("Detected changed end offset of database history topic (previous: " + previousEndOffset + ", current: " + endOffset + "). Make sure that the same history topic isn't shared by multiple connector instances."); }
According to the comment the check wasn't perfect at the beginning. But now when we have connectors which can work in multi task mode it leads to exceptions in valid scenarios.
After that we may start thinking about alternative solution or in the worst case a fixed version of the same check which takes in account multi task mode.