- Bug
- Resolution: Done
- Major
Bug report
I am having trouble with incremental snapshots on PostgreSQL with Debezium 2.7.0.Final.
I'm trying to read a big table with ~320 million rows. After ~8-12 hours of running the incremental snapshot on this table, a StackOverflowError occurred:
2024-07-02 16:36:48,558 INFO || WorkerSourceTask{id=lud_prod_20240628_2107-0} Committing offsets for 203544 acknowledged messages [org.apache.kafka.connect.runtime.WorkerSourceTask]
2024-07-02 16:36:52,087 ERROR Postgres|lud_prod|streaming Producer failure [io.debezium.pipeline.ErrorHandler]
java.lang.StackOverflowError
    at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4807)
    at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4752)
    at java.base/java.util.regex.Pattern$BranchConn.match(Pattern.java:4716)
    at java.base/java.util.regex.Pattern$GroupTail.match(Pattern.java:4866)
    at java.base/java.util.regex.Pattern$BmpCharPropertyGreedy.match(Pattern.java:4347)
    at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4807)
    at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4752)
    at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4750)
    at java.base/java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3967)
    at java.base/java.util.regex.Pattern$Start.match(Pattern.java:3622)
    at java.base/java.util.regex.Matcher.search(Matcher.java:1729)
    at java.base/java.util.regex.Matcher.find(Matcher.java:773)
    at java.base/java.util.Formatter.parse(Formatter.java:2702)
    at java.base/java.util.Formatter.format(Formatter.java:2655)
    at java.base/java.util.Formatter.format(Formatter.java:2609)
    at java.base/java.lang.String.format(String.java:2897)
    at io.debezium.pipeline.source.snapshot.incremental.SignalMetadata.metadataString(SignalMetadata.java:26)
    at io.debezium.pipeline.source.snapshot.incremental.SignalBasedIncrementalSnapshotChangeEventSource.lambda$emitWindowOpen$0(SignalBasedIncrementalSnapshotChangeEventSource.java:73)
    at io.debezium.jdbc.JdbcConnection.prepareUpdate(JdbcConnection.java:771)
    at io.debezium.pipeline.source.snapshot.incremental.SignalBasedIncrementalSnapshotChangeEventSource.emitWindowOpen(SignalBasedIncrementalSnapshotChangeEventSource.java:69)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.readChunk(AbstractIncrementalSnapshotChangeEventSource.java:258)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.closeWindow(AbstractIncrementalSnapshotChangeEventSource.java:114)
    at io.debezium.pipeline.source.snapshot.incremental.DeleteWindowCloser.closeWindow(DeleteWindowCloser.java:44)
    at io.debezium.pipeline.source.snapshot.incremental.SignalBasedIncrementalSnapshotChangeEventSource.emitWindowClose(SignalBasedIncrementalSnapshotChangeEventSource.java:85)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.readChunk(AbstractIncrementalSnapshotChangeEventSource.java:341)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.closeWindow(AbstractIncrementalSnapshotChangeEventSource.java:114)
    at io.debezium.pipeline.source.snapshot.incremental.DeleteWindowCloser.closeWindow(DeleteWindowCloser.java:44)
    at io.debezium.pipeline.source.snapshot.incremental.SignalBasedIncrementalSnapshotChangeEventSource.emitWindowClose(SignalBasedIncrementalSnapshotChangeEventSource.java:85)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.readChunk(AbstractIncrementalSnapshotChangeEventSource.java:341)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.closeWindow(AbstractIncrementalSnapshotChangeEventSource.java:114)
    at io.debezium.pipeline.source.snapshot.incremental.DeleteWindowCloser.closeWindow(DeleteWindowCloser.java:44)
    at io.debezium.pipeline.source.snapshot.incremental.SignalBasedIncrementalSnapshotChangeEventSource.emitWindowClose(SignalBasedIncrementalSnapshotChangeEventSource.java:85)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.readChunk(AbstractIncrementalSnapshotChangeEventSource.java:341)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.closeWindow(AbstractIncrementalSnapshotChangeEventSource.java:114)
    at io.debezium.pipeline.source.snapshot.incremental.DeleteWindowCloser.closeWindow(DeleteWindowCloser.java:44)
    at io.debezium.pipeline.source.snapshot.incremental.SignalBasedIncrementalSnapshotChangeEventSource.emitWindowClose(SignalBasedIncrementalSnapshotChangeEventSource.java:85)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.readChunk(AbstractIncrementalSnapshotChangeEventSource.java:341)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.closeWindow(AbstractIncrementalSnapshotChangeEventSource.java:114)
    . . .
    at io.debezium.pipeline.source.snapshot.incremental.SignalBasedIncrementalSnapshotChangeEventSource.emitWindowClose(SignalBasedIncrementalSnapshotChangeEventSource.java:85)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.readChunk(AbstractIncrementalSnapshotChangeEventSource.java:341)
    at io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource.closeWindow(AbstractIncrementalSnapshotChangeEventSource.java:114)
    at io.debezium.pipeline.source.snapshot.incremental.DeleteWindowCloser.closeWindow(DeleteWindowCloser.java:44)
    at io.debezium.pipeline.source.snapshot.incremental.SignalBasedIncrementalSnapshotChangeEventSource.emitWindowClose(SignalBasedIncrementalSnapshotChangeEventSource.java:85)
2024-07-02 16:36:52,090 INFO Postgres|lud_prod|streaming Connected metrics set to 'false' [io.debezium.pipeline.ChangeEventSourceCoordinator]
2024-07-02 16:38:03,468 INFO || WorkerSourceTask{id=lud_prod_20240628_2107-0} Committing offsets for 221205 acknowledged messages [org.apache.kafka.connect.runtime.WorkerSourceTask]
2024-07-02 16:38:03,476 ERROR || WorkerSourceTask{id=lud_prod_20240628_2107-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted [org.apache.kafka.connect.runtime.WorkerTask]
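The repeating frames in the trace show a mutual recursion: closeWindow calls DeleteWindowCloser.closeWindow, which calls emitWindowClose, which calls readChunk, which calls closeWindow again, so each snapshot chunk adds stack frames instead of returning to a loop. The following is a minimal sketch of that failure mode (hypothetical class and method bodies, not Debezium's actual code): the recursive variant overflows the stack once the chunk count is large enough, while an iterative variant runs at constant stack depth.

```java
// Sketch of the recursion pattern visible in the stack trace. All names and
// bodies here are illustrative stand-ins, not the real Debezium implementation.
public class RecursionSketch {

    // Recursive style, as in the trace: closing one chunk's window
    // immediately recurses into reading the next chunk, so stack depth
    // grows linearly with the number of chunks.
    public static void readChunkRecursive(int remainingChunks) {
        if (remainingChunks == 0) {
            return; // snapshot finished
        }
        closeWindow(remainingChunks - 1); // calls back into readChunkRecursive
    }

    public static void closeWindow(int remainingChunks) {
        readChunkRecursive(remainingChunks);
    }

    // Iterative style: constant stack depth regardless of chunk count.
    public static int readChunksIterative(int chunks) {
        int processed = 0;
        for (int i = 0; i < chunks; i++) {
            processed++; // process the chunk, then close its window; no recursion
        }
        return processed;
    }

    public static void main(String[] args) {
        boolean overflowed = false;
        try {
            // Millions of nested calls exceed the default JVM thread stack.
            readChunkRecursive(10_000_000);
        } catch (StackOverflowError e) {
            overflowed = true;
        }
        System.out.println("recursive overflowed: " + overflowed);
        System.out.println("iterative processed: " + readChunksIterative(10_000_000));
    }
}
```

This would explain why the error only appears after many hours on a very large table: the stack grows a little with every chunk and only overflows once enough chunks have been processed.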
What Debezium connector do you use and what version?
I'm using version 2.7.0.Final.
What is the connector configuration?
Connector configuration:
{
  "name": "lud_prod",
  "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
  "plugin.name": "pgoutput",
  "publication.autocreate.mode": "disabled",
  "publication.name": "dbz_publication",
  "slot.name": "debezium_cdc",
  "heartbeat.interval.ms": 500,
  "heartbeat.action.query": "UPDATE dbz.heartbeat SET update_dt = now()",
  "signal.enabled.channels": "source",
  "signal.data.collection": "dbz.signal",
  "snapshot.mode": "never",
  "database.hostname": "some_host",
  "database.port": "5432",
  "database.user": "debezium",
  "database.password": "####",
  "database.dbname": "lud",
  "database.server.name": "lud_prod",
  "driver.connectTimeout": 0,
  "driver.socketTimeout": 0,
  "driver.tcpKeepAlive": true,
  "schema.include.list": "lud,dbz",
  "skipped.operations": "none",
  "tombstones.on.delete": "false",
  "topic.prefix": "lud_prod",
  "topic.creation.default.replication.factor": 1,
  "topic.creation.default.partitions": 10,
  "topic.creation.default.cleanup.policy": "compact",
  "topic.creation.default.compression.type": "lz4",
  "topic.creation.default.max.message.bytes": 1073741824,
  "decimal.handling.mode": "double",
  "time.precision.mode": "connect",
  "key.converter": "io.confluent.connect.avro.AvroConverter",
  "key.converter.schemas.enable": "true",
  "key.converter.schema.registry.url": "http://redpanda-1:8081,http://redpanda-2:8081,http://redpanda-3:8081,http://redpanda-4:8081,http://redpanda-5:8081",
  "key.converter.id.compatibility.strict": "false",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.schemas.enable": "true",
  "value.converter.schema.registry.url": "http://redpanda-1:8081,http://redpanda-2:8081,http://redpanda-3:8081,http://redpanda-4:8081,http://redpanda-5:8081",
  "value.converter.id.compatibility.strict": "false",
  "producer.override.batch.size": 1638,
  "producer.override.max.request.size": 104857600,
  "flush.lsn.source": "true",
  "max.batch.size": 204800,
  "max.queue.size": 819200,
  "snapshot.fetch.size": 100000,
  "incremental.snapshot.chunk.size": 100000,
  "incremental.snapshot.watermarking.strategy": "insert_delete"
}
Incremental snapshot config:
{"type":"incremental", "data-collections": ["lud.source"], "additional-conditions":[]}
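For context, with the source signal channel configured above ("signal.data.collection": "dbz.signal"), a snapshot with this payload is typically triggered by inserting a row into the signal table. A sketch of that insert, using Debezium's documented id/type/data signal-table layout (the id value here is an arbitrary example):

```sql
-- Trigger an ad-hoc incremental snapshot of lud.source via the signal table.
-- 'ad-hoc-1' is just a unique identifier for this signal.
INSERT INTO dbz.signal (id, type, data)
VALUES ('ad-hoc-1',
        'execute-snapshot',
        '{"type": "incremental", "data-collections": ["lud.source"], "additional-conditions": []}');
```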
What is the captured database version and mode of deployment?
PostgreSQL 12.1 on x86_64-pc-linux-gnu, compiled by gcc (SUSE Linux) 4.8.5, 64-bit
What behaviour do you see?
After some period of time, I get a java.lang.StackOverflowError.
Do you see the same behaviour using the latest released Debezium version?
Yes.
Do you have the connector logs, ideally from start till finish?
How to reproduce the issue using our tutorial deployment?
It seems to me this is unexpected behavior of the incremental snapshot on a big table.
Links to:
- RHEA-2024:139598 Red Hat build of Debezium 2.5.4 release