-
Enhancement
-
Resolution: Done
-
Major
-
2.4.0.Final
What Debezium connector do you use and what version?
MySQL 2.4.0
What is the connector configuration?
```
...
include.schema.changes: true
incremental.snapshot.chunk.size: 10240
...
max.batch.size: 40960
max.queue.size: 81920
message.key.columns: redacted.my_problem_table:customer_id,field_id
offset.flush.timeout.ms: 60000
offset.flush.interval.ms: 10000
read.only: true
...
signal.consumer.sasl.jaas.config: "redacted"
signal.consumer.sasl.mechanism: SCRAM-SHA-512
signal.consumer.security.protocol: SASL_SSL
signal.data.collection: redacted.debezium_signal
signal.kafka.bootstrap.servers: strimzi-kafka-bootstrap.staging.redacted.com:9096
signal.kafka.topic: debezium.sdlc-redacted-pxc.redacted.debezium_signal
...
topic.creation.groups: my_problem_table
topic.creation.my_problem_table.include: debezium.sdlc-redacted-pxc.redacted.my_problem_table
topic.creation.my_problem_table.delete.retention.ms: 604800000 # 1 week deletion retention on tombstones
topic.creation.my_problem_table.cleanup.policy: compact
topic.creation.my_problem_table.partitions: 36
topic.creation.my_problem_table.unclean.leader.election.enable: false
```
What is the captured database version and mode of depoyment?
MySQL 8.0 Percona XtraDB (PXC) cluster
What behaviour do you expect?
If the `SELECT * ... LIMIT 1` query takes longer than it should, the connector should fail gracefully and report the failure in the logs.
What behaviour do you see?
The connector seems to continue to wait indefinitely. After a couple of minutes the connector will log a Keepalive INFO log that it lost connection. The query hangs and doesn't return.
The `SELECT` statement that comes through shows it has `message.key.columns` out of order:
```
SELECT * FROM redacted.my_problem_table ORDER BY field_id DESC, customer_id DESC LIMIT 1
```
No other signals make it into the topic, making the snapshot impossible to stop. I have TRACE logging enabled and there's nothing in them that might give a clue as to what the connector is trying to do. An indefinite amount of time will pass, and eventually the signals will all come through to the topic, and from then on, the latest signal will log every two hours. This doesn't seem to trigger additional snapshots, but I'm guessing it's also not expected.
Do you see the same behaviour using the latest relesead Debezium version?
I've tried on 2.4.0 and 2.2.0
Do you have the connector logs, ideally from start till finish?
```
{{2024-03-07 22:05:34,378 INFO [debezium.sdlc-redacted-pxc|task-0] Requested 'INCREMENTAL' snapshot of data collections '[redacted.my_problem_table]' with additional conditions '[]' and surrogate key 'PK of table will be used' (io.debezium.pipeline.signal.actions.snapshotting.ExecuteSnapshot) [blc-redacted-pxc-nomi.staging.redacted.com:3306]2024-03-07 22:05:34,379 DEBUG [debezium.sdlc-redacted-pxc|task-0] Starting new chunk with id '98e89c03-363c-487e-bf60-5f10f0e0e084' (io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotContext) [blc-redacted-pxc-nomi.staging.redacted.com:3306]2024-03-07 22:05:34,806 DEBUG [debezium.sdlc-redacted-pxc|task-0] Sending 1 records to topic debezium.sdlc-redacted-pxc.redacted.debezium_signal (io.debezium.connector.common.BaseSourceTask) [task-thread-debezium.sdlc-redacted-pxc-0]2024-03-07 22:05:34,806 INFO [debezium.sdlc-redacted-pxc|task-0] 1 records sent during previous 02:05:14.948, last recorded offset of
{server=debezium.sdlc-redacted-pxc} partition is {transaction_id=null, ts_sec=1709838535, file=mysqld-bin.000019, pos=316, incremental_snapshot_signal_offset=null, gtids=35237b5f-cb86-11ee-bf0c-aee5d9a33803:1-34,bb70456c-c5e1-11ee-b22e-9a39132b5563:1-6, row=1, server_id=317581, event=2} (io.debezium.connector.common.BaseSourceTask) [task-thread-debezium.sdlc-redacted-pxc-0]2024-03-07 22:05:34,848 INFO [debezium.sdlc-redacted-pxc|task-0] The task will send records to topic 'debezium.sdlc-redacted-pxc.redacted.debezium_signal' for the first time. Checking whether topic exists (org.apache.kafka.connect.runtime.AbstractWorkerSourceTask) [task-thread-debezium.sdlc-redacted-pxc-0]2024-03-07 22:05:34,849 INFO [debezium.sdlc-redacted-pxc|task-0] Topic 'debezium.sdlc-redacted-pxc.redacted.debezium_signal' already exists. (org.apache.kafka.connect.runtime.AbstractWorkerSourceTask) [task-thread-debezium.sdlc-redacted-pxc-0]2024-03-07 22:06:19,907 DEBUG [debezium.sdlc-redacted-pxc|task-0|offsets] Committing offset '{server=debezium.sdlc-redacted-pxc}' for partition '
{transaction_id=null, ts_sec=1709838535, file=mysqld-bin.000019, pos=316, incremental_snapshot_signal_offset=null, gtids=35237b5f-cb86-11ee-bf0c-aee5d9a33803:1-34,bb70456c-c5e1-11ee-b22e-9a39132b5563:1-6, row=1, server_id=317581, event=2}' (io.debezium.connector.common.BaseSourceTask) [SourceTaskOffsetCommitter-1]Mar 07, 2024 10:07:34 PM com.github.shyiko.mysql.binlog.BinaryLogClient$5 runINFO: Keepalive: Trying to restore lost connection to redacted-pxc-nomi.staging.redacted.com:3306}}
```
How to reproduce the issue using our tutorial deployment?
There are two issues at play. I'll open a bug issue for the second and link it here. The connector is forming its `SELECT` query putting `message.key.columns` out of order, and effectively bypassing the index the configured key creates. Here's an example schema of the problem table:
```
CREATE TABLE `my_problem_table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`item_id` int(11) unsigned NOT NULL,
`field_id` tinyint(3) unsigned NOT NULL,
`val` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
`customer_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `idx_my_problem_table_coll_id_field_id` (`customer_id`,`field_id`),
KEY `idx_coll_notes_uid_release` (`uid`,`item_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
```
- is caused by
-
DBZ-7617 Incremental snapshot query doesn't honor message.key.columns order
- Closed
- links to
-
RHEA-2024:139598 Red Hat build of Debezium 2.5.4 release