Details
- Type: Bug
- Resolution: Not a Bug
- Priority: Major
Description
It seems the snapshot fails as soon as a record exceeds the maximum message size, after which Debezium stops working. The container then has to be restarted, and a new snapshot must start all over again.
Setting both CONNECT_PRODUCER_MAX_REQUEST_SIZE and CONNECT_MESSAGE_MAX_BYTES to a value slightly higher than our biggest message resolved the issue.
- This ran with Kubernetes on both OpenShift and Rancher; the issue occurred in both environments.
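The workaround above can be sketched as environment variables on the Connect container (a sketch, not the exact deployment; the 20 MB value is a hypothetical placeholder — pick something slightly above your largest record):

```shell
# Hedged sketch for the Debezium/Connect container image, which maps
# CONNECT_* environment variables onto Kafka Connect worker properties.
# 20971520 (20 MiB) is a placeholder value, not taken from the report.
export CONNECT_PRODUCER_MAX_REQUEST_SIZE=20971520   # -> producer.max.request.size
export CONNECT_MESSAGE_MAX_BYTES=20971520           # -> message.max.bytes
```

Note that since the RecordTooLargeException in the logs is reported by the server, the broker-side limit (`message.max.bytes`, or `max.message.bytes` on the topic) may also need to be raised to accept the larger records.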
The logs:
14/10/2020 20:40:26 org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
14/10/2020 20:40:26 2020-10-14 18:40:26,685 INFO || WorkerSourceTask{id=***-0} Committing offsets [org.apache.kafka.connect.runtime.WorkerSourceTask]
14/10/2020 20:40:26 2020-10-14 18:40:26,685 INFO || WorkerSourceTask{id=***-0} flushing 1319 outstanding messages for offset commit [org.apache.kafka.connect.runtime.WorkerSourceTask]
14/10/2020 20:40:28 2020-10-14 18:40:28,385 ERROR || WorkerSourceTask{id=***-0} failed to send record to ***: [org.apache.kafka.connect.runtime.WorkerSourceTask]
14/10/2020 20:40:28 org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
14/10/2020 20:40:29 2020-10-14 18:40:29,253 ERROR || WorkerSourceTask{id=***-0} failed to send record to ***: [org.apache.kafka.connect.runtime.WorkerSourceTask]
14/10/2020 20:40:29 org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
14/10/2020 20:40:31 2020-10-14 18:40:31,685 ERROR || WorkerSourceTask{id=***-0} Failed to flush, timed out while waiting for producer to flush outstanding 3 messages [org.apache.kafka.connect.runtime.WorkerSourceTask]
14/10/2020 20:40:31 2020-10-14 18:40:31,686 ERROR || WorkerSourceTask{id=***-0} Task threw an uncaught and unrecoverable exception [org.apache.kafka.connect.runtime.WorkerTask]
14/10/2020 20:40:31 org.apache.kafka.connect.errors.ConnectException: Unrecoverable exception from producer send callback
14/10/2020 20:40:31 at org.apache.kafka.connect.runtime.WorkerSourceTask.maybeThrowProducerSendException(WorkerSourceTask.java:282)
14/10/2020 20:40:31 at org.apache.kafka.connect.runtime.WorkerSourceTask.sendRecords(WorkerSourceTask.java:336)
14/10/2020 20:40:31 at org.apache.kafka.connect.runtime.WorkerSourceTask.execute(WorkerSourceTask.java:264)
14/10/2020 20:40:31 at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:185)
14/10/2020 20:40:31 at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:235)
14/10/2020 20:40:31 at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
14/10/2020 20:40:31 at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
14/10/2020 20:40:31 at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
14/10/2020 20:40:31 at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
14/10/2020 20:40:31 at java.base/java.lang.Thread.run(Thread.java:834)
14/10/2020 20:40:31 Caused by: org.apache.kafka.common.errors.RecordTooLargeException: The request included a message larger than the max message size the server will accept.
14/10/2020 20:40:31 2020-10-14 18:40:31,686 ERROR || WorkerSourceTask{id=***-0} Task is being killed and will not recover until manually restarted [org.apache.kafka.connect.runtime.WorkerTask]
14/10/2020 20:40:31 2020-10-14 18:40:31,686 INFO || Stopping down connector [io.debezium.connector.common.BaseSourceTask]
14/10/2020 20:40:31 2020-10-14 18:40:31,689 WARN Postgres|staging|postgres-connector-task Snapshot was interrupted before completion [io.debezium.pipeline.source.AbstractSnapshotChangeEventSource]
14/10/2020 20:40:31 2020-10-14 18:40:31,689 INFO Postgres|staging|postgres-connector-task Snapshot - Final stage [io.debezium.pipeline.source.AbstractSnapshotChangeEventSource]
14/10/2020 20:40:31 2020-10-14 18:40:31,689 WARN Postgres|staging|postgres-connector-task Change event source executor was interrupted [io.debezium.pipeline.ChangeEventSourceCoordinator]
14/10/2020 20:40:31 java.lang.InterruptedException: Interrupted while snapshotting table ***
14/10/2020 20:40:31 at io.debezium.relational.RelationalSnapshotChangeEventSource.createDataEventsForTable(RelationalSnapshotChangeEventSource.java:329)
14/10/2020 20:40:31 at io.debezium.relational.RelationalSnapshotChangeEventSource.createDataEvents(RelationalSnapshotChangeEventSource.java:286)
14/10/2020 20:40:31 at io.debezium.relational.RelationalSnapshotChangeEventSource.doExecute(RelationalSnapshotChangeEventSource.java:138)
14/10/2020 20:40:31 at io.debezium.pipeline.source.AbstractSnapshotChangeEventSource.execute(AbstractSnapshotChangeEventSource.java:63)
14/10/2020 20:40:31 at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:105)
14/10/2020 20:40:31 at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
14/10/2020 20:40:31 at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
14/10/2020 20:40:31 at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
14/10/2020 20:40:31 at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
14/10/2020 20:40:31 at java.base/java.lang.Thread.run(Thread.java:834)
14/10/2020 20:40:31 2020-10-14 18:40:31,690 INFO Postgres|staging|postgres-connector-task Connected metrics set to 'false' [io.debezium.pipeline.metrics.StreamingChangeEventSourceMetrics]
14/10/2020 20:40:31 2020-10-14 18:40:31,692 INFO || [Producer clientId=connector-producer-***-0] Closing the Kafka producer with timeoutMillis = 30000 ms. [org.apache.kafka.clients.producer.KafkaProducer]
14/10/2020 20:41:02 2020-10-14 18:41:02,411 INFO || WorkerSourceTask{id=***-0} Committing offsets [org.apache.kafka.connect.runtime.WorkerSourceTask]
14/10/2020 20:41:02 2020-10-14 18:41:02,412 INFO || WorkerSourceTask{id=***-0} flushing 3 outstanding messages for offset commit [org.apache.kafka.connect.runtime.WorkerSourceTask]
14/10/2020 20:41:07 2020-10-14 18:41:07,412 ERROR || WorkerSourceTask{id=***-0} Failed to flush, timed out while waiting for producer to flush outstanding 3 messages [org.apache.kafka.connect.runtime.WorkerSourceTask]
14/10/2020 20:41:07 2020-10-14 18:41:07,412 ERROR || WorkerSourceTask{id=***-0} Failed to commit offsets [org.apache.kafka.connect.runtime.SourceTaskOffsetCommitter]