Uploaded image for project: 'Debezium'
  1. Debezium
  2. DBZ-4865

Debezium using Postgres connector stuck in infinite loop after auto restart on error

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Duplicate
    • Icon: Major Major
    • None
    • 1.8.1.Final
    • postgresql-connector
    • None
    • False
    • None
    • False
    • Hide

      Unfortunately I cannot reproduce this reliably.  I have this running in production, and it triggers after a few hours.

      Show
      Unfortunately I cannot reproduce this reliably.  I have this running in production, and it triggers after a few hours.

      I've run into an odd bug. I'm running Debezium 1.8.1 on Postgres 10.18 using plugin "pgoutput". This is running on Kubernetes using Strimzi Kafka. When I inserted a record into the signal table, the connector crashed. It's not reproducible every time, and I've seen this same error for other tables as well. The crash isn't the issue. When it restarts, debezium was stuck in an infinite loop.

      I've provided a simplified subset of the logs from the Debezium KafkaConnector:

      2022-03-15 18:14:58,880 ERROR Failed to read column metadata for 'public.debezium_signal' (io.debezium.connector.postgresql.connection.pgoutput.PgOutputMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:14:58,881 ERROR Producer failure (io.debezium.pipeline.ErrorHandler) [debezium-postgresconnector-primary-change-event-source-coordinator]
      org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
      	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:382)
      	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:490)
      	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:408)
      	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:329)
      	at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:315)
      	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:291)
      	at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:243)
      	at org.postgresql.jdbc.PgDatabaseMetaData.getColumns(PgDatabaseMetaData.java:1584)
      	at io.debezium.connector.postgresql.connection.pgoutput.PgOutputMessageDecoder.getTableColumnsFromDatabase(PgOutputMessageDecoder.java:352)
      	at io.debezium.connector.postgresql.connection.pgoutput.PgOutputMessageDecoder.handleRelationMessage(PgOutputMessageDecoder.java:291)
      	at io.debezium.connector.postgresql.connection.pgoutput.PgOutputMessageDecoder.processNotEmptyMessage(PgOutputMessageDecoder.java:185)
      	at io.debezium.connector.postgresql.connection.AbstractMessageDecoder.processMessage(AbstractMessageDecoder.java:33)
      	at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$1.deserializeMessages(PostgresReplicationConnection.java:490)
      	at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$1.readPending(PostgresReplicationConnection.java:482)
      	at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.processMessages(PostgresStreamingChangeEventSource.java:207)
      	at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:169)
      	at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:41)
      	at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:172)
      	at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:139)
      	at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:108)
      	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at java.base/java.lang.Thread.run(Thread.java:829)
      Caused by: java.io.EOFException
      	at org.postgresql.core.PGStream.receiveChar(PGStream.java:455)
      	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2119)
      	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:355)
      	... 24 more
      2022-03-15 18:14:58,881 INFO Connection gracefully closed (io.debezium.jdbc.JdbcConnection) [pool-78-thread-1]
      2022-03-15 18:14:58,884 INFO Connection gracefully closed (io.debezium.jdbc.JdbcConnection) [pool-79-thread-1]
      2022-03-15 18:14:58,884 INFO Finished streaming (io.debezium.pipeline.ChangeEventSourceCoordinator) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:14:58,884 INFO Connected metrics set to 'false' (io.debezium.pipeline.metrics.StreamingChangeEventSourceMetrics) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:14:59,051 WARN Going to restart connector after 10 sec. after a retriable exception (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:14:59,053 INFO Connection gracefully closed (io.debezium.jdbc.JdbcConnection) [pool-80-thread-1]
      2022-03-15 18:14:59,054 INFO Connection gracefully closed (io.debezium.jdbc.JdbcConnection) [pool-81-thread-1]
      2022-03-15 18:14:59,054 WARN WorkerSourceTask{id=postgres-connector-0} failed to poll records from SourceTask. Will retry operation. (org.apache.kafka.connect.runtime.WorkerSourceTask) [task-thread-postgres-connector-0]
      org.apache.kafka.connect.errors.RetriableException: An exception occurred in the change event producer. This connector will be restarted.
      	at io.debezium.pipeline.ErrorHandler.setProducerThrowable(ErrorHandler.java:38)
      	at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:172)
      	at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:41)
      	at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:172)
      	at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:139)
      	at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:108)
      	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      	at java.base/java.lang.Thread.run(Thread.java:829)
      Caused by: org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
      	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:382)
      	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:490)
      	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:408)
      	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:329)
      	at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:315)
      	at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:291)
      	at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:243)
      	at org.postgresql.jdbc.PgDatabaseMetaData.getColumns(PgDatabaseMetaData.java:1584)
      	at io.debezium.connector.postgresql.connection.pgoutput.PgOutputMessageDecoder.getTableColumnsFromDatabase(PgOutputMessageDecoder.java:352)
      	at io.debezium.connector.postgresql.connection.pgoutput.PgOutputMessageDecoder.handleRelationMessage(PgOutputMessageDecoder.java:291)
      	at io.debezium.connector.postgresql.connection.pgoutput.PgOutputMessageDecoder.processNotEmptyMessage(PgOutputMessageDecoder.java:185)
      	at io.debezium.connector.postgresql.connection.AbstractMessageDecoder.processMessage(AbstractMessageDecoder.java:33)
      	at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$1.deserializeMessages(PostgresReplicationConnection.java:490)
      	at io.debezium.connector.postgresql.connection.PostgresReplicationConnection$1.readPending(PostgresReplicationConnection.java:482)
      	at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.processMessages(PostgresStreamingChangeEventSource.java:207)
      	at io.debezium.connector.postgresql.PostgresStreamingChangeEventSource.execute(PostgresStreamingChangeEventSource.java:169)
      	... 9 more
      Caused by: java.io.EOFException
      	at org.postgresql.core.PGStream.receiveChar(PGStream.java:455)
      	at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2119)
      	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:355)
      	... 24 more
      
      2022-03-15 18:15:03,054 INFO Awaiting end of restart backoff period after a retriable error (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:05,055 INFO Awaiting end of restart backoff period after a retriable error (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:07,055 INFO Awaiting end of restart backoff period after a retriable error (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      
      2022-03-15 18:15:09,056 INFO Starting PostgresConnectorTask with configuration: (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    connector.class = io.debezium.connector.postgresql.PostgresConnector (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    database.dbname = xxxxxx (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    database.user = xxxxxx (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    incremental.snapshot.chunk.size = 2000000 (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    tasks.max = 1 (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    database.server.name = primary (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    heartbeat.interval.ms = 60000 (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    database.port = 5432 (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    plugin.name = pgoutput (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    value.converter.schema.registry.url = http://schema-registry-cp-schema-registry.kafka-production:8081 (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    task.class = io.debezium.connector.postgresql.PostgresConnectorTask (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    database.hostname = xxxxxx (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    database.password = ******** (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    signal.data.collection = public.debezium_signal (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    name = postgres-connector (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    table.include.list = public.debezium_signal,public.xxxxxx,public.xxxxxx,public.xxxxxx,public.xxxxxx[0-1]{1} (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    value.converter = io.confluent.connect.avro.AvroConverter (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    key.converter = io.confluent.connect.avro.AvroConverter (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    key.converter.schema.registry.url = http://schema-registry-cp-schema-registry.kafka-production:8081 (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,056 INFO    snapshot.mode = never (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,159 INFO Found previous partition offset io.debezium.connector.postgresql.PostgresPartition@959fc1ad: PostgresOffsetContext [sourceInfoSchema=Schema{io.debezium.connector.postgresql.Source:STRUCT}, sourceInfo=source_info[server='primary'db='xxxxxxx', lsn=LSN{1A4415/4A291E8}, txId=2044032803, lastCommitLsn=LSN{1A4415/4A291E8}, timestamp=2022-03-15T18:14:10.598122Z, snapshot=FALSE, schema=, table=], lastSnapshotRecord=false, lastCompletelyProcessedLsn=LSN{1A4415/4A29220}, lastCommitLsn=LSN{1A4415/4A291E8}, streamingStoppingLsn=null, transactionContext=TransactionContext [currentTransactionId=null, perTableEventCount={}, totalEventCount=0], incrementalSnapshotContext=IncrementalSnapshotContext [windowOpened=false, chunkEndPosition=null, dataCollectionsToSnapshot=[], lastEventKeySent=null, maximumKey=null]] (io.debezium.connector.common.BaseSourceTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,163 INFO user 'xxxxxx' connected to database 'xxxxxx' on PostgreSQL 10.18 (Ubuntu 10.18-1.pgdg18.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit with roles:
      	role 'xxxxxxx' [superuser: false, replication: true, inherit: true, create role: false, create db: false, can log in: true]
      	role 'read' [superuser: false, replication: false, inherit: true, create role: false, create db: false, can log in: false] (io.debezium.connector.postgresql.PostgresConnectorTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,166 INFO Obtained valid replication slot ReplicationSlot [active=false, latestFlushedLsn=LSN{1A4415/407B3B8}, catalogXmin=2043864902] (io.debezium.connector.postgresql.connection.PostgresConnection) [task-thread-postgres-connector-0]
      
      2022-03-15 18:15:09,167 INFO Found previous offset PostgresOffsetContext [sourceInfoSchema=Schema{io.debezium.connector.postgresql.Source:STRUCT}, sourceInfo=source_info[server='primary'db='xxxxx', lsn=LSN{1A4415/4A291E8}, txId=2044032803, lastCommitLsn=LSN{1A4415/4A291E8}, timestamp=2022-03-15T18:14:10.598122Z, snapshot=FALSE, schema=, table=], lastSnapshotRecord=false, lastCompletelyProcessedLsn=LSN{1A4415/4A29220}, lastCommitLsn=LSN{1A4415/4A291E8}, streamingStoppingLsn=null, transactionContext=TransactionContext [currentTransactionId=null, perTableEventCount={}, totalEventCount=0], incrementalSnapshotContext=IncrementalSnapshotContext [windowOpened=false, chunkEndPosition=null, dataCollectionsToSnapshot=[], lastEventKeySent=null, maximumKey=null]] (io.debezium.connector.postgresql.PostgresConnectorTask) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,167 INFO Snapshots are not allowed as per configuration, starting streaming logical changes only (io.debezium.connector.postgresql.snapshot.NeverSnapshotter) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,192 INFO Requested thread factory for connector PostgresConnector, id = primary named = change-event-source-coordinator (io.debezium.util.Threads) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,192 INFO Creating thread debezium-postgresconnector-primary-change-event-source-coordinator (io.debezium.util.Threads) [task-thread-postgres-connector-0]
      2022-03-15 18:15:09,193 INFO Metrics registered (io.debezium.pipeline.ChangeEventSourceCoordinator) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:09,193 INFO Context created (io.debezium.pipeline.ChangeEventSourceCoordinator) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:09,193 INFO According to the connector configuration no snapshot will be executed (io.debezium.connector.postgresql.PostgresSnapshotChangeEventSource) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:09,193 INFO Snapshot ended with SnapshotResult [status=SKIPPED, offset=PostgresOffsetContext [sourceInfoSchema=Schema{io.debezium.connector.postgresql.Source:STRUCT}, sourceInfo=source_info[server='primary'db='xxxxxxx', lsn=LSN{1A4415/4A291E8}, txId=2044032803, lastCommitLsn=LSN{1A4415/4A291E8}, timestamp=2022-03-15T18:14:10.598122Z, snapshot=FALSE, schema=, table=], lastSnapshotRecord=false, lastCompletelyProcessedLsn=LSN{1A4415/4A29220}, lastCommitLsn=LSN{1A4415/4A291E8}, streamingStoppingLsn=null, transactionContext=TransactionContext [currentTransactionId=null, perTableEventCount={}, totalEventCount=0], incrementalSnapshotContext=IncrementalSnapshotContext [windowOpened=false, chunkEndPosition=null, dataCollectionsToSnapshot=[], lastEventKeySent=null, maximumKey=null]]] (io.debezium.pipeline.ChangeEventSourceCoordinator) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:09,193 INFO Connected metrics set to 'true' (io.debezium.pipeline.metrics.StreamingChangeEventSourceMetrics) [debezium-postgresconnector-primary-change-event-source-coordinator]
      
      2022-03-15 18:15:09,285 INFO No incremental snapshot in progress, no action needed on start (io.debezium.pipeline.source.snapshot.incremental.AbstractIncrementalSnapshotChangeEventSource) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:09,285 INFO Starting streaming (io.debezium.pipeline.ChangeEventSourceCoordinator) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:09,285 INFO Retrieved latest position from stored offset 'LSN{1A4415/4A29220}' (io.debezium.connector.postgresql.PostgresStreamingChangeEventSource) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:09,285 INFO Looking for WAL restart position for last commit LSN 'LSN{1A4415/4A291E8}' and last change LSN 'LSN{1A4415/4A29220}' (io.debezium.connector.postgresql.connection.WalPositionLocator) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:09,285 INFO Initializing PgOutput logical decoder publication (io.debezium.connector.postgresql.connection.PostgresReplicationConnection) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:09,338 INFO Obtained valid replication slot ReplicationSlot [active=false, latestFlushedLsn=LSN{1A4415/407B3B8}, catalogXmin=2043864902] (io.debezium.connector.postgresql.connection.PostgresConnection) [debezium-postgresconnector-primary-change-event-source-coordinator]
      
      2022-03-15 18:15:09,354 INFO Requested thread factory for connector PostgresConnector, id = primary named = keep-alive (io.debezium.util.Threads) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:09,354 INFO Creating thread debezium-postgresconnector-primary-keep-alive (io.debezium.util.Threads) [debezium-postgresconnector-primary-change-event-source-coordinator]
      
      2022-03-15 18:15:09,421 INFO Searching for WAL resume position (io.debezium.connector.postgresql.PostgresStreamingChangeEventSource) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:18,526 INFO First LSN 'LSN{1A4415/4A291E8}' received (io.debezium.connector.postgresql.connection.WalPositionLocator) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:18,549 INFO LSN after last stored change LSN 'LSN{1A4415/4A2ACF0}' received (io.debezium.connector.postgresql.connection.WalPositionLocator) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:18,549 INFO WAL resume position 'LSN{1A4415/4A2ACF0}' discovered (io.debezium.connector.postgresql.PostgresStreamingChangeEventSource) [debezium-postgresconnector-primary-change-event-source-coordinator]
      
      2022-03-15 18:15:18,575 INFO Initializing PgOutput logical decoder publication (io.debezium.connector.postgresql.connection.PostgresReplicationConnection) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:18,589 INFO Requested thread factory for connector PostgresConnector, id = primary named = keep-alive (io.debezium.util.Threads) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:18,589 INFO Creating thread debezium-postgresconnector-primary-keep-alive (io.debezium.util.Threads) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:18,589 INFO Processing messages (io.debezium.connector.postgresql.PostgresStreamingChangeEventSource) [debezium-postgresconnector-primary-change-event-source-coordinator]
      
      2022-03-15 18:15:24,156 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B326E380} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,156 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{0/0} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,156 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{0/0} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,156 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{0/0} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,156 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{0/0} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B326E3B8} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B3277EF8} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B3277EF8} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B3278038} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B3278038} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B3279688} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B32780E0} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B327CE00} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B3279688} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B327FC78} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B327FC78} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B3286FF8} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B3287AA0} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B328F778} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      2022-03-15 18:15:24,201 INFO Streaming requested from LSN LSN{1A4415/4A29220}, received LSN LSN{1A4415/B328B788} identified as already processed (io.debezium.connector.postgresql.connection.AbstractMessageDecoder) [debezium-postgresconnector-primary-change-event-source-coordinator]
      

      These "identified as already processed" messages will continue forever because it's picking up LSNs that are in the future beyond the resume point.  It has to hit an error where it auto restarts.  It doesn't happen every restart, but eventually it does occur.

      These are the corresponding Postgres logs for the same time:

      2022-03-15 18:15:09 UTC 28239LOG:  connection authorized: user=xxxxx database=xxxxx SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, compression=off)
      2022-03-15 18:15:09 UTC 28240LOG:  connection received: host=xxxxx port=39738
      2022-03-15 18:15:09 UTC 28240LOG:  connection authorized: user=xxxxx database=xxxxx SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, compression=off)
      2022-03-15 18:15:09 UTC 28242LOG:  connection received: host=xxxxx port=39754
      2022-03-15 18:15:09 UTC 28242LOG:  connection authorized: user=xxxxx database=xxxxx SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, compression=off)
      
      2022-03-15 18:15:09 UTC 28245LOG:  connection received: host=xxxxx port=39766
      2022-03-15 18:15:09 UTC 28245LOG:  replication connection authorized: user=xxxxx SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, compression=off)
      2022-03-15 18:15:09 UTC 28249LOG:  connection received: host=xxxxx port=41910
      2022-03-15 18:15:09 UTC 28249LOG:  connection authorized: user=xxxxx database=xxxxx SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, compression=off)
      2022-03-15 18:15:09 UTC 28249LOG:  disconnection: session time: 0:00:00.022 user=xxxxx database=xxxxx host=xxxxx port=41910
      2022-03-15 18:15:09 UTC 28245LOG:  starting logical decoding for slot "debezium"
      2022-03-15 18:15:09 UTC 28245DETAIL:  streaming transactions committing after 1A4415/407B3B8, reading WAL from 1A4414/DEFA18
      2022-03-15 18:15:09 UTC 28245STATEMENT:  START_REPLICATION SLOT "debezium" LOGICAL 1A4415/4A29220 ("proto_version" '1', "publication_names" 'dbz_publication')
      2022-03-15 18:15:09 UTC 28245LOG:  logical decoding found consistent point at 1A4414/DEFA18
      2022-03-15 18:15:09 UTC 28245DETAIL:  Logical decoding will begin using saved snapshot.
      2022-03-15 18:15:09 UTC 28245STATEMENT:  START_REPLICATION SLOT "debezium" LOGICAL 1A4415/4A29220 ("proto_version" '1', "publication_names" 'dbz_publication')
      2022-03-15 18:15:09 UTC 28250LOG:  connection received: host=xxxxx port=39776
      
      2022-03-15 18:15:18 UTC 28242LOG:  disconnection: session time: 0:00:09.378 user=xxxxx database=xxxxx host=xxxxx port=39754
      2022-03-15 18:15:18 UTC 28245LOG:  disconnection: session time: 0:00:09.261 user=xxxxx database=xxxxx host=xxxxx port=39766
      2022-03-15 18:15:18 UTC 28649LOG:  connection received: host=xxxxx port=41500
      2022-03-15 18:15:18 UTC 28653LOG:  connection received: host=[local]
      2022-03-15 18:15:18 UTC 28653LOG:  connection authorized: user=xxxxx database=xxxxx
      2022-03-15 18:15:18 UTC 28649LOG:  replication connection authorized: user=xxxxx SSL enabled (protocol=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384, compression=off)
      2022-03-15 18:15:18 UTC 28649LOG:  starting logical decoding for slot "debezium"
      2022-03-15 18:15:18 UTC 28649DETAIL:  streaming transactions committing after 1A4415/B3277E30, reading WAL from 1A4415/34126D0
      2022-03-15 18:15:18 UTC 28649STATEMENT:  START_REPLICATION SLOT "debezium" LOGICAL 1A4415/4A29220 ("proto_version" '1', "publication_names" 'dbz_publication')
      2022-03-15 18:15:18 UTC 28653LOG:  disconnection: session time: 0:00:00.007 user=xxxxx database=xxxxx host=[local]
      2022-03-15 18:15:18 UTC 28649LOG:  logical decoding found consistent point at 1A4415/34126D0
      2022-03-15 18:15:18 UTC 28649DETAIL:  Logical decoding will begin using saved snapshot.
      2022-03-15 18:15:18 UTC 28649STATEMENT:  START_REPLICATION SLOT "debezium" LOGICAL 1A4415/4A29220 ("proto_version" '1', "publication_names" 'dbz_publication')
      2022-03-15 18:15:18 UTC 28659LOG:  connection received: host=xxxxx port=41000
      

      These are the two lines that seem concerning:

      2022-03-15 18:15:09 UTC 28245DETAIL:  streaming transactions committing after 1A4415/407B3B8, reading WAL from 1A4414/DEFA18
      
      2022-03-15 18:15:18 UTC 28649DETAIL:  streaming transactions committing after 1A4415/B3277E30, reading WAL from 1A4415/34126D0
      

      When Debezium first checks the replication slot for the WAL restart position, I see it starts streaming from 407B3B8. But, when it actually starts the replication for real after finding the WAL resume point, now it streams from B3277E30 which is in the future. So, the replication slot starts streaming from B3277E30 and Debezium waits thinking it's cycling through past events.

      The replication slot gets created by Debezium and is not used by any other process. The publication is called "dbz_publication" and was creating by the root user ahead of time. I only have a single KafkaConnector running with Debezium on this database. But, it looks like something advanced the replication slot beyond where Debezium thinks it left off.

              Unassigned Unassigned
              scott2112-1 Scott Mantegani (Inactive)
              Votes:
              1 Vote for this issue
              Watchers:
              5 Start watching this issue

                Created:
                Updated:
                Resolved: