- Bug
- Resolution: Done
- Major
- 1.7.0.Final
If two capture instances exist for a table, the connector decides the order in which to use them by comparing their start LSNs. If the start LSNs are equal, the choice of which instance is current and which is future is undefined (see SqlServerStreamingChangeEventSource#getCdcTablesToQuery). The connector may then use the schema of one capture instance while reading data from the other, which results in a task failure.
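The ambiguity can be illustrated with a minimal sketch. This is not the connector's code: `StartLsnOrderingDemo`, `CaptureInstance`, and `currentByStartLsn` are hypothetical names, and LSNs are simplified to `long` values. It only shows that when ordering relies on the start LSN alone, a tie leaves the outcome dependent on the order in which the instances happen to be listed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class StartLsnOrderingDemo {

    // Hypothetical stand-in for a capture instance; not the connector's class.
    static final class CaptureInstance {
        final String name;
        final long startLsn; // simplified: real LSNs are 10-byte binary values

        CaptureInstance(String name, long startLsn) {
            this.name = name;
            this.startLsn = startLsn;
        }
    }

    // Picks the "current" instance as the one with the lowest start LSN.
    // List.sort is stable, so on a tie the winner is whichever came first in the input.
    static CaptureInstance currentByStartLsn(List<CaptureInstance> instances) {
        List<CaptureInstance> sorted = new ArrayList<>(instances);
        sorted.sort(Comparator.comparingLong(c -> c.startLsn));
        return sorted.get(0);
    }

    public static void main(String[] args) {
        CaptureInstance v1 = new CaptureInstance("dbo_orders_v1", 100L);
        CaptureInstance v2 = new CaptureInstance("dbo_orders_v2", 100L);
        // Same start LSN: the result flips with the input order.
        System.out.println(currentByStartLsn(Arrays.asList(v1, v2)).name);
        System.out.println(currentByStartLsn(Arrays.asList(v2, v1)).name);
    }
}
```

With equal start LSNs, the two calls in `main` select different instances, so nothing in the comparison itself identifies which capture instance is the older one.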
Potential design issues
According to the documentation, a capture instance's start LSN is a:
Log sequence number (LSN) representing the low endpoint for querying the change table.
The start LSN doesn't determine the order in which multiple capture instances were created or should be used, yet the connector uses it to order capture instances. Moreover, a CDC cleanup job may update the start LSN of multiple coexisting capture instances.
Additionally (not strictly related but worth mentioning) the end LSN is an:
LSN representing the high endpoint for querying the change table. In SQL Server 2012 (11.x), this column is always NULL.
It looks like the connector shouldn't use the end LSN, but it does.
Technical details
Given the actual semantics of the start LSN, after a CDC cleanup job completes it is very likely that both capture instances of a table will have the same start LSN, which triggers the issue described above.
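How a cleanup could produce identical start LSNs can be sketched as follows. The model is an assumption made for illustration, not SQL Server's actual implementation: it supposes that cleanup raises every start LSN below the new low watermark up to that watermark. `CleanupCollisionDemo` and `afterCleanup` are hypothetical names, and LSNs are again simplified to `long` values.

```java
class CleanupCollisionDemo {

    // Assumed model: any start LSN below the new low watermark is raised to it;
    // start LSNs at or above the watermark are left unchanged.
    static long[] afterCleanup(long[] startLsns, long lowWatermark) {
        long[] result = new long[startLsns.length];
        for (int i = 0; i < startLsns.length; i++) {
            result[i] = Math.max(startLsns[i], lowWatermark);
        }
        return result;
    }

    public static void main(String[] args) {
        // Two capture instances of one table, created at different times.
        long[] before = {100L, 250L};
        long[] after = afterCleanup(before, 300L);
        // Under the assumed model both instances now share one start LSN.
        System.out.println(after[0] == after[1]);
    }
}
```

Under this assumption, any cleanup whose watermark passes the newer instance's start LSN makes the two values equal, at which point the start LSN no longer distinguishes the instances.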