Type: Bug
Resolution: Unresolved
Priority: Major
What Debezium connector do you use and what version?
debezium-connector-postgres 3.0.4.Final
What is the connector configuration?
{
  "name": "connector_name",
  "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
  "plugin.name": "pgoutput",
  "tasks.max": "1",
  "slot.name": "slot_name",
  "publication.name": "publication_name",
  "publication.autocreate.mode": "disabled",
  "topic.prefix": "prefix",
  "table.include.list": "table1,table2",
  "snapshot.mode": "never",
  "signal.data.collection": "signals_table",
  "database.sslmode": "require",
  "database.hostname": "db-host-name",
  "database.port": "5432",
  "database.dbname": "db_name",
  "database.user": "user",
  "database.password": "password",
  "key.converter": "io.confluent.connect.avro.AvroConverter",
  "key.converter.enhanced.avro.schema.support": true,
  "key.converter.schema.registry.url": "schema-registry-url",
  "key.converter.basic.auth.credentials.source": "USER_INFO",
  "key.converter.basic.auth.user.info": "key:pass",
  "value.converter": "io.confluent.connect.avro.AvroConverter",
  "value.converter.enhanced.avro.schema.support": true,
  "value.converter.schema.registry.url": "schema-registry-url",
  "value.converter.basic.auth.credentials.source": "USER_INFO",
  "value.converter.basic.auth.user.info": "key:pass",
  "heartbeat.interval.ms": "60000",
  "topic.heartbeat.prefix": "prefix",
  "incremental.snapshot.chunk.size": "4000",
  "column.exclude.list": "somecolumns",
  "skipped.operations": "t",
  "errors.tolerance": "none",
  "errors.log.enable": "true"
}
What is the captured database version and mode of deployment?
AWS RDS PostgreSQL 14
What behavior do you expect?
When a non-recoverable connection or server-side error occurs, Debezium should log the error and mark the task as failed.
What behavior do you see?
Due to what we assume is a bug on the DB server side, Debezium could not start streaming data from PostgreSQL. The error was:
Producer failure
org.postgresql.util.PSQLException: ERROR: could not create file "pg_replslot/slot_name/state.tmp": File exists
  Where: slot "slot_name", output plugin "pgoutput", in the change callback, associated LSN AAA/ABC12345
    at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2733)
    at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1311)
    at org.postgresql.core.v3.QueryExecutorImpl.readFromCopy(QueryExecutorImpl.java:1210)
    at org.postgresql.core.v3.CopyDualImpl.readFromCopy(CopyDualImpl.java:49)
    at org.postgresql.core.v3.replication.V3PGReplicationStream.receiveNextData(V3PGReplicationStream.java:163)
    at org.postgresql.core.v3.replication.V3PGReplicationStream.readInternal(V3PGReplicationStream.java:128)
    at org.postgresql.core.v3.replication.V3PGReplicationStream.readPending(V3PGReplicationStream.java:85)
    at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$1.readPending(PostgresReplicationConnection.java:663)
    at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.processMessages(PostgresStreamingChangeEventSource.java:217)
    at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:179)
    at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:42)
    at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:324)
    at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:203)
    at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:143)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
Debezium kept trying to connect to the DB without success, and the task was never reported as "failed". Because our monitoring only watches for failed tasks, we did not identify the problem in time; the DB filled up with WAL and stopped working.
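To illustrate why a failed-task monitor stays silent in this situation, here is a minimal sketch of such a check. It assumes the monitor reads the Kafka Connect REST status endpoint (`GET /connectors/<name>/status`). The helper name `failed_task_ids`, the sample payload, and the worker id are hypothetical. The `RUNNING` state in the sample is also an assumption, since all we know is that the task was never reported as failed.

```python
import json


def failed_task_ids(status: dict) -> list:
    """Return the ids of tasks whose state is FAILED in a Kafka Connect status payload."""
    return [t["id"] for t in status.get("tasks", []) if t.get("state") == "FAILED"]


# Hypothetical payload shaped like the response of GET /connectors/<name>/status.
# Assumption: while Debezium kept retrying, the task stayed in RUNNING state.
sample = json.loads("""
{
  "name": "connector_name",
  "connector": {"state": "RUNNING", "worker_id": "worker:8083"},
  "tasks": [{"id": 0, "state": "RUNNING", "worker_id": "worker:8083"}]
}
""")

print(failed_task_ids(sample))  # prints [] so an alert on failed tasks never fires
```

With a payload like this, a check that alerts only on `FAILED` tasks reports nothing, even though no data is being streamed.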
After a DB server restart the problem was gone, and Debezium connected successfully.
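Since the task state gave no signal, a database-side signal might have caught the WAL build-up earlier. The following is only a sketch under assumptions, not something we had in place: it computes how much WAL a slot retains from two LSN strings, as returned for example by `pg_current_wal_lsn()` and the `restart_lsn` column of `pg_replication_slots`. The function names, the threshold, and the sample LSN values are hypothetical.

```python
# Example query to obtain the two LSNs (run with any PostgreSQL client):
#   SELECT pg_current_wal_lsn(), restart_lsn
#   FROM pg_replication_slots WHERE slot_name = 'slot_name';


def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as 'AAA/ABC12345' to a 64-bit byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between the slot's restart_lsn and the current WAL position."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


# Hypothetical alert threshold: 10 GiB of retained WAL.
THRESHOLD_BYTES = 10 * 1024**3


def should_alert(current_lsn: str, restart_lsn: str) -> bool:
    """True when the slot retains more WAL than the threshold."""
    return retained_wal_bytes(current_lsn, restart_lsn) > THRESHOLD_BYTES


print(retained_wal_bytes("1/10", "0/FFFFFFF0"))  # prints 32
```

A check like this does not depend on the connector reporting its own state, so it would still fire when the task stays out of the failed state.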
Do you see the same behaviour using the latest released Debezium version?
I couldn't test it: the DB problem went away after the restart, and we don't know how to reproduce it.
Do you have the connector logs, ideally from start till finish?
Unfortunately the only log I can share is the one above.
How to reproduce the issue using our tutorial deployment?
I don't know how to reproduce the DB server problem that triggers this behavior on the Debezium side.