Uploaded image for project: 'Debezium'
  1. Debezium
  2. DBZ-4189

Incremental snapshot skips rows for composite PK

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Duplicate
    • Icon: Major Major
    • None
    • 1.7.0.Final
    • None
    • False
    • False
    • Hide

      create table n_n_repro(id1 int, id2 int, constraint pk_n_n_repro primary key (id1, id2));
      insert into n_n_repro values (1,1);
      insert into n_n_repro values (1,2);
      insert into n_n_repro values (2,3);
      insert into n_n_repro values (3,2); --this one is skipped
      insert into n_n_repro values (3,4);
      insert into n_n_repro values (4,5);

       

      start an incremental snapshot with incremental.snapshot.chunk.size=1

      Show
      create table n_n_repro(id1 int, id2 int, constraint pk_n_n_repro primary key (id1, id2)); insert into n_n_repro values (1,1); insert into n_n_repro values (1,2); insert into n_n_repro values (2,3); insert into n_n_repro values (3,2); --this one is skipped insert into n_n_repro values (3,4); insert into n_n_repro values (4,5);   start an incremental snapshot with incremental.snapshot.chunk.size=1

      We have been investigating an issue where we are missing multiple IDs after snapshoting a N-N table in SQL Server and this looks like a mean corner case for the incremental snapshot solution

      This particular N-N table has no IDENTITY column and the composite PK starts with the ID that increases the least.

      One ID that wasn't picked up during snapshoting is 682960468

      image_1 shows immediately preceeding IDs, notice how the preceeding one has a higher request_id but lower notifyresponse_id

      Using SQL Server profiler I got the query that is executed by the snapshotting process and simulated it starting from the preceeding ID. Notice 682960468 is skipped entirely

      image_3 and 4 is me walking through the offsets in Kafka and confirming exactly that, right after 682960467 it skips to 682960474

       

      Tagged this as SQL Server but if a similar query is executed on other connectors it should apply to them as well

        1. image_1.png
          35 kB
          Thiago Dantas
        2. image_2.png
          46 kB
          Thiago Dantas
        3. image_3.png
          72 kB
          Thiago Dantas
        4. image_4.png
          71 kB
          Thiago Dantas

              Unassigned Unassigned
              danthimself Thiago Dantas (Inactive)
              Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

                Created:
                Updated:
                Resolved: