Type: Enhancement
Resolution: Done
Priority: Major
Fix Version/s: 2.6.0.Beta1
Affects Version/s: 1.9.0.Final
Component/s: mysql-connector, postgresql-connector
Labels:
None

Blocked:
False
Blocked Reason:
None
Ready:
False

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

What is the captured database version and mode of deployment?

(E.g. on-premises, with a specific cloud provider, etc.)

Running Debezium Server inside K8S and connected to PG14 database.

What behaviour do you expect?

Incremental snapshot queries should perform better when having multi-column PK on the table.

What behaviour do you see?

The query created by debezium for incremental snapshot perform bad.

Do you see the same behaviour using the latest releasead Debezium version?

(Ideally, also verify with latest Alpha/Beta/CR version)

Using Debezium 1.9.0.Final

Do you have the connector logs, ideally from start till finish?

(You might be asked later to provide DEBUG/TRACE level log)

N/A

How to reproduce the issue using our tutorial deployment?

Have this table DDL

create table "user"
(
    id                   serial
        unique,
    account_id           integer not null
        references account,
    merged_with          integer,
    external_id          text,
    email                text,
    created_datetime     timestamp,
    updated_datetime     timestamp,
    activated_datetime   timestamp,
    deactivated_datetime timestamp,
    deleted_datetime     timestamp,
    primary key (id, account_id),
    foreign key (merged_with, account_id) references "user"
);

Try to run an incremental snapshot when this table has a lot of records, say 80mil or even more.

An example of EXPLAIN ANALYZE from my experience

EXPLAIN ANALYZE SELECT * FROM public.user
WHERE (("id" > 247000000) OR ("id" = 247000000 AND "account_id" > 15000))
AND NOT (("id" > 495463994) OR ("id" = 495463994 AND "account_id" > 33530))
ORDER BY "id", "account_id" LIMIT 10000;
                                                                              QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1015.03..74525.87 rows=10000 width=672) (actual time=114042.986..114605.335 rows=1 loops=1)
   ->  Gather Merge  (cost=1015.03..3043187.62 rows=413840 width=672) (actual time=114042.984..114605.332 rows=1 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Incremental Sort  (cost=15.00..2994420.19 rows=206920 width=672) (actual time=114038.991..114038.992 rows=0 loops=3)
               Sort Key: id, account_id
               Presorted Key: id
               Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
               Worker 0:  Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
               Worker 1:  Full-sort Groups: 1  Sort Method: quicksort  Average Memory: 25kB  Peak Memory: 25kB
               ->  Parallel Index Scan using user_id_key on "user"  (cost=0.57..2985108.79 rows=206920 width=672) (actual time=114038.853..114038.853 rows=0 loops=3)
                     Index Cond: (id <= 495463994)
                     Filter: (((id <> 495463994) OR (account_id <= 33530)) AND ((id > 247000000) OR ((id = 247000000) AND (account_id > 15000))))
                     Rows Removed by Filter: 25854999
 Planning Time: 1.180 ms
 Execution Time: 114605.492 ms

Also, community chat discussion is here.

Feature request or enhancement

For feature requests or enhancements, provide this information, please:

Which use case/requirement will be addressed by the proposed feature?

Improve performance of incremental snapshot when snapshotting tables with multi-column PK.

Implementation ideas (optional)

- - Sort By Name
  - Sort By Date
  - Ascending
  - Descending
  - Thumbnails
  - List
  - Download All

Screenshot 2022-05-10 at 11.53.10.png
443 kB
2022/05/10 9:53 AM
Screenshot 2022-05-10 at 11.53.10-1.png
443 kB
2022/05/10 9:57 AM

account is impacted by

DBZ-6023 Add support for surrogate keys for incremental snapshots

Closed

links to

RHEA-2024:139598 Red Hat build of Debezium 2.5.4 release

Assignee:: Unassigned

Reporter:: Tudor Plugaru (Inactive)

Votes:: 1 Vote for this issue

Watchers:: 7 Start watching this issue

Created:: 2022/04/29 12:50 PM

Updated:: 2024/10/09 12:21 PM

Resolved:: 2024/02/29 12:37 PM

Details

Description

What is the captured database version and mode of deployment?

What behaviour do you expect?

What behaviour do you see?

Do you see the same behaviour using the latest releasead Debezium version?

Do you have the connector logs, ideally from start till finish?

How to reproduce the issue using our tutorial deployment?

Feature request or enhancement

Which use case/requirement will be addressed by the proposed feature?

Implementation ideas (optional)

Attachments

Attachments

Issue Links

Easy Agile Planning Poker

Activity

People

Dates

Hide