- Feature Request
- Resolution: Unresolved
- Major
Feature request or enhancement
Which use case/requirement will be addressed by the proposed feature?
Currently, I'm told, the Debezium JDBC connector can only start from the beginning of a topic. The Confluent JDBC connector allows setting `consumer.auto.offset.reset` to `latest`, which starts the connector at the latest offset. This would be valuable in a disaster recovery situation where you want to start a sink connector on a backup source topic, picking up where it left off at the time of the failure.
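For illustration, the `latest` behaviour can be sketched as a per-connector consumer override in Kafka Connect. This is only a sketch under assumptions: the connector name and topic are hypothetical, the worker is assumed to permit client overrides (`connector.client.config.override.policy=All`), and whether the Debezium JDBC sink honors the override is not confirmed here. Note also that `auto.offset.reset` only takes effect when the connector's consumer group has no committed offsets.

```python
import json


def sink_config_with_latest(name: str, connector_class: str, topic: str) -> dict:
    """Build a Kafka Connect sink connector config that asks the consumer
    to start at the latest offset when no committed offset exists."""
    return {
        "name": name,
        "config": {
            "connector.class": connector_class,
            "topics": topic,
            # Per-connector consumer override; requires the worker to allow it.
            "consumer.override.auto.offset.reset": "latest",
        },
    }


if __name__ == "__main__":
    # Hypothetical connector name and topic, for illustration only.
    config = sink_config_with_latest(
        "orders-sink",
        "io.confluent.connect.jdbc.JdbcSinkConnector",
        "dbserver.inventory.orders",
    )
    print(json.dumps(config, indent=2))
```

The resulting JSON is the shape of request body that the Kafka Connect REST API accepts when creating a connector; connection settings for the target database are omitted for brevity.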
There is another use case, more important in my case: starting a sink connector from an arbitrary, given offset. Say a Debezium source connector is streaming events from a 50 GB MySQL table, and at the other end a sink connector is writing that table into an RDS instance. Starting this sink connector from the beginning of a 50 GB table will take weeks to get everything into the target table because of the IOPS limits on most RDS/EC2 instances. This is necessary at least once, when the target table doesn't exist yet, but ideally it would only ever be necessary once. We're currently migrating from MSK to Strimzi, and being able to choose an offset closer to the end of the source topic would let us avoid re-sinking from the beginning. Even `latest` wouldn't guarantee in this case that the sink connector missed nothing in between, since the original source topic's offsets differ from the new source topic's offsets. This would also avoid re-sinking the entire table if the source connector loses its place in the binary logs.
Implementation ideas (optional)
Unfortunately, I don't know the inner workings of this connector. I'm told that a separate feature of this connector allows sinking multiple topics, and that an arbitrary offset setting would conflict with it, but the two settings could be made mutually exclusive. In my case, I'm less interested in sinking multiple topics than in having more control over the one topic I'm sinking.
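As a rough sketch of the mutual-exclusion idea, a config check might look like the following. The property name `start.offset` is hypothetical and not an existing Debezium setting; `topics` and `topics.regex` are the standard Kafka Connect sink properties. A single offset also glosses over multi-partition topics, which a real design would need to address.

```python
def validate_sink_settings(config: dict) -> None:
    """Reject configs that combine a start offset with more than one topic.

    `start.offset` is a hypothetical property used only for this sketch.
    """
    start_offset = config.get("start.offset")
    if start_offset is None:
        return  # Nothing to check: multi-topic sinking stays allowed.

    if "topics.regex" in config:
        raise ValueError("start.offset cannot be combined with topics.regex")

    # `topics` is a comma-separated list in Kafka Connect.
    topics = [t.strip() for t in config.get("topics", "").split(",") if t.strip()]
    if len(topics) != 1:
        raise ValueError("start.offset requires exactly one topic in `topics`")

    if int(start_offset) < 0:
        raise ValueError("start.offset must be non-negative")


if __name__ == "__main__":
    validate_sink_settings({"topics": "orders", "start.offset": "1200000"})
    try:
        validate_sink_settings({"topics": "orders,customers", "start.offset": "5"})
    except ValueError as err:
        print(f"rejected: {err}")
```

Keeping the check in config validation would mean a conflicting setup fails when the connector is created, before any records are consumed.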