[DBZ-7616] Provide query timeout property to avoid indefinitely hangs during queries - Red Hat Issue Tracker

Type: Enhancement
Resolution: Done
Priority: Major
Fix Version/s: 2.7.0.Alpha1
Affects Version/s: 2.4.0.Final
Component/s: core-library, mysql-connector
Labels:
- add-to-upgrade-guide

Blocked:
False
Blocked Reason:
None
Ready:
False
Git Pull Request:
https://github.com/debezium/debezium/pull/5483

Severity:
Critical

SFDC Cases Counter:
SFDC Cases Open:
SFDC Cases Links:

What Debezium connector do you use and what version?

MySQL 2.4.0

What is the connector configuration?

```

...
include.schema.changes: true
incremental.snapshot.chunk.size: 10240
...
max.batch.size: 40960
max.queue.size: 81920
message.key.columns: redacted.my_problem_table:customer_id,field_id
offset.flush.timeout.ms: 60000
offset.flush.interval.ms: 10000
read.only: true
...
signal.consumer.sasl.jaas.config: "redacted"
signal.consumer.sasl.mechanism: SCRAM-SHA-512
signal.consumer.security.protocol: SASL_SSL
signal.data.collection: redacted.debezium_signal
signal.kafka.bootstrap.servers: strimzi-kafka-bootstrap.staging.redacted.com:9096
signal.kafka.topic: debezium.sdlc-redacted-pxc.redacted.debezium_signal
...
topic.creation.groups: my_problem_table
topic.creation.my_problem_table.include: debezium.sdlc-redacted-pxc.redacted.my_problem_table
topic.creation.my_problem_table.delete.retention.ms: 604800000 # 1 week deletion retention on tombstones
topic.creation.my_problem_table.cleanup.policy: compact
topic.creation.my_problem_table.partitions: 36
topic.creation.my_problem_table.unclean.leader.election.enable: false

```

What is the captured database version and mode of depoyment?

MySQL 8.0 Percona XtraDB (PXC) cluster

What behaviour do you expect?

If the `SELECT * ... LIMIT 1` query takes longer than it should, the connector should fail gracefully and report the failure in the logs.

What behaviour do you see?

The connector seems to continue to wait indefinitely. After a couple of minutes the connector will log a Keepalive INFO log that it lost connection. The query hangs and doesn't return.

The `SELECT` statement that comes through shows it has `message.key.columns` out of order:

```

SELECT * FROM redacted.my_problem_table ORDER BY field_id DESC, customer_id DESC LIMIT 1

```

No other signals make it into the topic, making the snapshot impossible to stop. I have TRACE logging enabled and there's nothing in them that might give a clue as to what the connector is trying to do. An indefinite amount of time will pass, and eventually the signals will all come through to the topic, and from then on, the latest signal will log every two hours. This doesn't seem to trigger additional snapshots, but I'm guessing it's also not expected.

Do you see the same behaviour using the latest relesead Debezium version?

I've tried on 2.4.0 and 2.2.0

Do you have the connector logs, ideally from start till finish?

```

{{2024-03-07 22:05:34,378 INFO [debezium.sdlc-redacted-pxc|task-0] Requested 'INCREMENTAL' snapshot of data collections '[redacted.my_problem_table]' with additional conditions '[]' and surrogate key 'PK of table will be used' (io.debezium.pipeline.signal.actions.snapshotting.ExecuteSnapshot) [blc-redacted-pxc-nomi.staging.redacted.com:3306]2024-03-07 22:05:34,379 DEBUG [debezium.sdlc-redacted-pxc|task-0] Starting new chunk with id '98e89c03-363c-487e-bf60-5f10f0e0e084' (io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotContext) [blc-redacted-pxc-nomi.staging.redacted.com:3306]2024-03-07 22:05:34,806 DEBUG [debezium.sdlc-redacted-pxc|task-0] Sending 1 records to topic debezium.sdlc-redacted-pxc.redacted.debezium_signal (io.debezium.connector.common.BaseSourceTask) [task-thread-debezium.sdlc-redacted-pxc-0]2024-03-07 22:05:34,806 INFO [debezium.sdlc-redacted-pxc|task-0] 1 records sent during previous 02:05:14.948, last recorded offset of

{server=debezium.sdlc-redacted-pxc} partition is {transaction_id=null, ts_sec=1709838535, file=mysqld-bin.000019, pos=316, incremental_snapshot_signal_offset=null, gtids=35237b5f-cb86-11ee-bf0c-aee5d9a33803:1-34,bb70456c-c5e1-11ee-b22e-9a39132b5563:1-6, row=1, server_id=317581, event=2} (io.debezium.connector.common.BaseSourceTask) [task-thread-debezium.sdlc-redacted-pxc-0]2024-03-07 22:05:34,848 INFO [debezium.sdlc-redacted-pxc|task-0] The task will send records to topic 'debezium.sdlc-redacted-pxc.redacted.debezium_signal' for the first time. Checking whether topic exists (org.apache.kafka.connect.runtime.AbstractWorkerSourceTask) [task-thread-debezium.sdlc-redacted-pxc-0]2024-03-07 22:05:34,849 INFO [debezium.sdlc-redacted-pxc|task-0] Topic 'debezium.sdlc-redacted-pxc.redacted.debezium_signal' already exists. (org.apache.kafka.connect.runtime.AbstractWorkerSourceTask) [task-thread-debezium.sdlc-redacted-pxc-0]2024-03-07 22:06:19,907 DEBUG [debezium.sdlc-redacted-pxc|task-0|offsets] Committing offset '{server=debezium.sdlc-redacted-pxc}

' for partition '

{transaction_id=null, ts_sec=1709838535, file=mysqld-bin.000019, pos=316, incremental_snapshot_signal_offset=null, gtids=35237b5f-cb86-11ee-bf0c-aee5d9a33803:1-34,bb70456c-c5e1-11ee-b22e-9a39132b5563:1-6, row=1, server_id=317581, event=2}

' (io.debezium.connector.common.BaseSourceTask) [SourceTaskOffsetCommitter-1]Mar 07, 2024 10:07:34 PM com.github.shyiko.mysql.binlog.BinaryLogClient$5 runINFO: Keepalive: Trying to restore lost connection to redacted-pxc-nomi.staging.redacted.com:3306}}
```

How to reproduce the issue using our tutorial deployment?

There are two issues at play. I'll open a bug issue for the second and link it here. The connector is forming its `SELECT` query putting `message.key.columns` out of order, and effectively bypassing the index the configured key creates. Here's an example schema of the problem table:

```

CREATE TABLE `my_problem_table` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`item_id` int(11) unsigned NOT NULL,
`field_id` tinyint(3) unsigned NOT NULL,
`val` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL DEFAULT '',
`customer_id` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `idx_my_problem_table_coll_id_field_id` (`customer_id`,`field_id`),
KEY `idx_coll_notes_uid_release` (`uid`,`item_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

```

is caused by

DBZ-7617 Incremental snapshot query doesn't honor message.key.columns order

Closed

links to

RHEA-2024:139598 Red Hat build of Debezium 2.5.4 release

Details

Description

What Debezium connector do you use and what version?

What is the connector configuration?

What is the captured database version and mode of depoyment?

What behaviour do you expect?

What behaviour do you see?

Do you see the same behaviour using the latest relesead Debezium version?

Do you have the connector logs, ideally from start till finish?

How to reproduce the issue using our tutorial deployment?

Attachments

Issue Links

Easy Agile Planning Poker

Activity

People

Dates

Hide