Type: Bug
Resolution: Done
Priority: Major
Fix Version: 1.9.5.Final
Labels: None
In order to make your issue reports as actionable as possible, please provide the following information, depending on the issue type.
Bug report
For bug reports, provide this information, please:
What Debezium connector do you use and what version?
Vitess
What is the connector configuration?
Using vitess.offset.storage.per.task=true
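For context, this setting would typically appear in a standard Kafka Connect registration payload. A minimal sketch follows; the connector name and tasks.max value are illustrative placeholders, and only vitess.offset.storage.per.task is taken from this report:

```json
{
  "name": "vitess-connector",
  "config": {
    "connector.class": "io.debezium.connector.vitess.VitessConnector",
    "tasks.max": "2",
    "vitess.offset.storage.per.task": "true"
  }
}
```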
What is the captured database version and mode of deployment?
Vitess V13, AWS
What behaviour do you expect?
If the Debezium connector goes offline for some time and a Vitess shard splits during that window (e.g. s1 splits into s10 and s11), the current shard list from v$session will contain the latest shards (s10, s11), while the shard list persisted in the Kafka offset topic still only contains the old shard s1 (pointing at an old position gt1).
When the Vitess connector restarts, the current logic in VitessConnector.taskConfigs() uses the latest shard list (s10, s11) for task assignment and starts s10/s11 from "current" (i.e. the tail of the binlog queue). The correct behavior would be to use the old shard (s1) and old position (e.g. gt1) from the persisted Kafka offset storage, so that the connector subscribes from the exact point where it previously stopped.

As vtgate continues replaying binlog events from vttablet, it will eventually encounter the shard-split binlog event, at which point the tablet stream from the s1 vttablet is closed and replaced with tablet streams from s10 and s11. All of this happens transparently to the Vitess connector, which seamlessly receives the new events for the new shards from vtgate (as if it had been connected the whole time).
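To make the mismatch concrete: the connector persists its position as a VGTID, a JSON list of per-shard GTIDs. In the scenario above, the offset topic would still hold something like the sketch below while vtgate already reports s10 and s11 (keyspace and GTID values are placeholders, and the exact field layout is recalled from the connector's Vgtid serialization, so treat it as illustrative):

```json
[
  { "keyspace": "ks1", "shard": "s1", "gtid": "gt1" }
]
```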
What behaviour do you see?
Currently, when the connector detects that the current shard list from v$session differs from the shards in the persisted offset storage, it aborts.
Do you see the same behaviour using the latest released Debezium version?
Yes
Do you have the connector logs, ideally from start till finish?
(You might be asked later to provide DEBUG/TRACE level log)
<Your answer>
How to reproduce the issue using our tutorial deployment?
Stop the Vitess connector, perform a shard split, then restart the connector.
Feature request or enhancement
For feature requests or enhancements, provide this information, please:
Which use case/requirement will be addressed by the proposed feature?
<Your answer>
Implementation ideas (optional)
Modify VitessConnector.taskConfigs() to favor the shards from the persisted offset storage over the current shard list, as sketched below.
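A minimal sketch of that idea, assuming the persisted per-shard GTIDs can be read back from offset storage. This is not the actual VitessConnector code; the method name and stubbed inputs are hypothetical, and it only illustrates the proposed precedence (persisted shards win over the live shard list so the stream resumes at the old position):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch of the proposed task-assignment change; names and
 * structure are hypothetical, not Debezium's actual VitessConnector API.
 */
public class ShardAssignmentSketch {

    /** Prefer shards recovered from persisted offsets over the live shard list. */
    static List<String> shardsForTaskAssignment(List<String> currentShards,
                                                Map<String, String> persistedShardGtids) {
        if (!persistedShardGtids.isEmpty()) {
            // Resume from the old shards (e.g. s1 at gt1); vtgate will close the
            // s1 tablet stream and switch to s10/s11 when it replays the split event.
            return new ArrayList<>(persistedShardGtids.keySet());
        }
        // First start: no persisted offsets yet, so use the current shard list.
        return currentShards;
    }

    public static void main(String[] args) {
        List<String> current = List.of("s10", "s11");        // live list from vtgate
        Map<String, String> persisted = Map.of("s1", "gt1"); // from the Kafka offset topic
        System.out.println(shardsForTaskAssignment(current, persisted)); // prints [s1]
    }
}
```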