Details
-
Enhancement
-
Resolution: Unresolved
-
Minor
-
1.1.0.Final
-
None
Description
Hello!
When CDC is enabled in SQL Server, query functions are automatically created to query the CDC tables. There are two kinds of function:
- One queries for "all changes": if a given key underwent several changes within a given interval, the function returns all of those states.
- The other queries for "net changes": if a given key underwent several changes within a given interval, the function returns only the most recent state and not the intermediate states.
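To make the difference concrete, here is a minimal Java sketch of how the two query strings could be built. It assumes the standard SQL Server naming, cdc.fn_cdc_get_all_changes_<capture_instance> and cdc.fn_cdc_get_net_changes_<capture_instance>. The class, enum and method names are hypothetical and are not taken from Debezium. Note that, as far as I know, the "net changes" function only exists when the table was enabled for CDC with @supports_net_changes = 1.

```java
// Hypothetical sketch: none of these names exist in Debezium.
final class CdcQuerySketch {

    /** Which generated CDC query function to call. */
    enum ChangeMode {
        ALL("fn_cdc_get_all_changes_"),
        NET("fn_cdc_get_net_changes_");

        final String functionPrefix;

        ChangeMode(String functionPrefix) {
            this.functionPrefix = functionPrefix;
        }
    }

    /**
     * Builds a parameterized SELECT against the chosen CDC function.
     * The two '?' placeholders are the lower and upper LSN boundaries;
     * the third argument is the row filter option, inlined as a literal.
     */
    static String buildQuery(ChangeMode mode, String captureInstance, String rowFilterOption) {
        // Escape ']' for the bracket-quoted identifier and "'" for the string literal.
        String function = (mode.functionPrefix + captureInstance).replace("]", "]]");
        String filter = rowFilterOption.replace("'", "''");
        return "SELECT * FROM cdc.[" + function + "](?, ?, N'" + filter + "')";
    }
}
```

For a capture instance named dbo_orders (an invented name), `buildQuery(ChangeMode.NET, "dbo_orders", "all")` would target cdc.[fn_cdc_get_net_changes_dbo_orders], while `ChangeMode.ALL` would target the function Debezium queries today.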
Currently Debezium uses only the "all changes" function (see SqlServerConnection.java, line 57, the definition of GET_ALL_CHANGES_FOR_TABLE).
This enhancement request is to let the user choose whether Debezium queries SQL Server with the "all changes" or the "net changes" function. Could this be done through a configuration parameter?
A first step would be simply to add support for "net changes". A second step would be to add a second configuration parameter that lets the user choose the "row filter option". The SQL Server CDC functions take three parameters: two for the lower and upper log sequence number (LSN) boundaries, and a third to control the metadata returned for each row.
For example, Debezium currently calls the "all changes" function with the row filter option 'all update old', but other filter options exist.
Please note that 'all update old' does not exist for "net changes", so a different default would be needed for "net changes".
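The constraint above could be enforced with a small validation step. The sketch below is only an illustration. The option lists follow the SQL Server documentation as I understand it ('all' and 'all update old' for "all changes"; 'all', 'all with mask' and 'all with merge' for "net changes"). The mode strings, the class name and the choice of 'all' as the "net changes" default are assumptions, not existing Debezium settings.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch: the mode names and defaults are assumptions, not Debezium configuration.
final class RowFilterSketch {

    // Row filter options accepted by each kind of CDC query function.
    static final Map<String, List<String>> ALLOWED = Map.of(
            "all", List.of("all", "all update old"),
            "net", List.of("all", "all with mask", "all with merge"));

    /**
     * Returns the row filter option to use for the given mode.
     * A null request falls back to a per-mode default; an option that the
     * chosen function does not support is rejected.
     */
    static String resolve(String mode, String requested) {
        List<String> allowed = ALLOWED.get(mode);
        if (allowed == null) {
            throw new IllegalArgumentException("Unknown change mode: " + mode);
        }
        if (requested == null) {
            // "all update old" matches what Debezium passes today;
            // "all" for net changes is an assumed default.
            return mode.equals("all") ? "all update old" : "all";
        }
        if (!allowed.contains(requested)) {
            throw new IllegalArgumentException(
                    "Row filter option '" + requested + "' is not valid for mode '" + mode + "'");
        }
        return requested;
    }
}
```

With this shape, asking for 'all update old' together with the "net" mode fails at configuration time instead of surfacing as a SQL Server error at query time.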
Here is a discussion with gunnar.morling that helps explain this Jira issue, plus a short post on the impact of adding this to Debezium: https://github.com/debezium/debezium/pull/1354