Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: 0.10.0.Beta1
Affects Version/s: 0.9.1.Final
Component/s: sqlserver-connector
Labels:
None

Git Pull Request:
https://github.com/debezium/debezium/pull/915

If the primary key is updated, the capture instance contains two corresponding entries, a delete and an insert, both with exactly the same __$start_lsn and __$seqval.

create table users (name varchar(50) NOT NULL PRIMARY KEY, age int NOT NULL);

exec sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'users', @role_name = NULL, @supports_net_changes = 0;

insert into users values ('bar', 123);

-- start connector.

update users set name = 'newbar' where name = 'bar';

__$start_lsn           __$end_lsn __$seqval              __$operation __$update_mask name   age __$command_id
---------------------- ---------- ---------------------- ------------ -------------- ------ --- -------------
0x00000026000008F00006 NULL       0x00000026000008F00002            1 0x03           bar    111             1
0x00000026000008F00006 NULL       0x00000026000008F00002            2 0x03           newbar 111             2

(2 rows affected)

The SQL Server connector keeps track of (commitLsn, changeLsn) as its offset.
If the connector crashes after only the first record has been written to Kafka and its offset has been stored, the second record is skipped after a restart, because both records share the same (commitLsn, changeLsn) pair.
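The failure mode can be illustrated with a minimal Python sketch. It is not the connector's actual code: `ChangeRecord` and `records_after_restart` are hypothetical names, and the restart filter is assumed to keep only records whose (commitLsn, changeLsn) pair is strictly greater than the saved offset.

```python
from typing import NamedTuple


class ChangeRecord(NamedTuple):
    commit_lsn: int  # __$start_lsn
    change_lsn: int  # __$seqval
    operation: int   # 1 = delete, 2 = insert
    name: str


def records_after_restart(records, saved_offset):
    """Assumed restart filter: keep records strictly after the saved offset."""
    return [r for r in records if (r.commit_lsn, r.change_lsn) > saved_offset]


# The two capture-instance rows produced by the primary key update.
records = [
    ChangeRecord(0x00000026000008F00006, 0x00000026000008F00002, 1, "bar"),
    ChangeRecord(0x00000026000008F00006, 0x00000026000008F00002, 2, "newbar"),
]

# Crash after only the delete was emitted; its position is the saved offset.
saved_offset = (records[0].commit_lsn, records[0].change_lsn)

resumed = records_after_restart(records, saved_offset)
print(resumed)
```

Because the insert carries the same pair as the already-processed delete, the filter cannot tell the two rows apart and the insert of `newbar` is never delivered.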

EDIT: In fact, more CDC records can be lost. If more than one primary key is modified in a transaction, the CDC records are returned in the following order:

select [__$start_lsn], [__$seqval], [__$operation] from cdc.fn_cdc_get_all_changes_dbo_capture_instance(..., ..., 'all update old');
__$start_lsn           __$seqval              __$operation                 
---------------------- ---------------------- ------------ -------------
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0021            1
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0025            1
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0028            1
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB002B            1
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB002E            1
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0031            1
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0021            2
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0025            2
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0028            2
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB002B            2
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB002E            2
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0031            2

Note that the records are ordered by 1) commitLsn (__$start_lsn), 2) operation, 3) changeLsn (__$seqval).
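As a rough illustration of that ordering, the following Python sketch sorts the twelve rows from the listing by the key (commitLsn, operation, changeLsn). The tuple layout and variable names are assumptions made for the example only.

```python
commit_lsn = 0x0006C5C1009E39CB0125
seqvals = [
    0x0006C5C1009E38DB0021,
    0x0006C5C1009E38DB0025,
    0x0006C5C1009E38DB0028,
    0x0006C5C1009E38DB002B,
    0x0006C5C1009E38DB002E,
    0x0006C5C1009E38DB0031,
]

# One delete (1) and one insert (2) per modified primary key,
# stored as (start_lsn, seqval, operation).
rows = [(commit_lsn, seqval, op) for seqval in seqvals for op in (1, 2)]

# Sort by commitLsn, then operation, then changeLsn.
ordered = sorted(rows, key=lambda r: (r[0], r[2], r[1]))

for start_lsn, seqval, operation in ordered:
    print(f"0x{start_lsn:020X} 0x{seqval:020X} {operation}")
```

All deletes come first and all inserts follow, so the changeLsn values are not monotonic across the result set: they restart at the lowest value when the operation switches from 1 to 2.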

As a result, some of the records with `operation=2` may be lost. Example scenario:

Kafka Connect has processed the first 4 records. The saved offset is `0x0006C5C1009E39CB0125 0x0006C5C1009E38DB002B`.
Kafka Connect is restarted.

Kafka Connect skips all records with an offset lower than or equal to `0x0006C5C1009E39CB0125 0x0006C5C1009E38DB002B`. As a consequence, the following rows are skipped, although they should not be:

0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0021            2
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0025            2
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB0028            2
0x0006C5C1009E39CB0125 0x0006C5C1009E38DB002B            2
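The scenario above can be reproduced with a small Python simulation. This is only a sketch under the assumptions stated in this report (offset = (commitLsn, changeLsn), records at or below the offset are skipped after a restart); the names are hypothetical and do not come from the connector.

```python
COMMIT_LSN = 0x0006C5C1009E39CB0125
SEQVALS = [
    0x0006C5C1009E38DB0021,
    0x0006C5C1009E38DB0025,
    0x0006C5C1009E38DB0028,
    0x0006C5C1009E38DB002B,
    0x0006C5C1009E38DB002E,
    0x0006C5C1009E38DB0031,
]

# Records in the order of the listing: all deletes (1), then all inserts (2).
# Each record is (commit_lsn, change_lsn, operation).
records = [(COMMIT_LSN, s, 1) for s in SEQVALS] + [(COMMIT_LSN, s, 2) for s in SEQVALS]

# Kafka Connect processed the first 4 records, then was restarted.
processed = records[:4]
saved_offset = processed[-1][:2]  # (commit_lsn, change_lsn) of the last record

remaining = records[4:]
# After the restart, everything at or below the saved offset is skipped.
skipped = [r for r in remaining if r[:2] <= saved_offset]
delivered = [r for r in remaining if r[:2] > saved_offset]

for commit_lsn, change_lsn, operation in skipped:
    print(f"lost: 0x{commit_lsn:020X} 0x{change_lsn:020X} {operation}")
```

The four lost records are exactly the inserts whose changeLsn is not greater than the saved one, even though none of them had been processed before the restart.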

is related to

DBZ-1483 Add column_id column to metadata section in messages in Kafka topic

Closed

Assignee: Jiri Pechanec

Reporter: Grzegorz Kołakowski (Inactive)

Votes: 0

Watchers: 3

Created: 2019/02/18 12:45 PM

Updated: 2023/01/24 3:42 PM

Resolved: 2019/06/11 7:17 AM
