Bug
Resolution: Unresolved
Critical
3.2.0.Final
None
Bug report
For bug reports, provide this information, please:
What Debezium connector do you use and what version?
Debezium Postgres v3.2.0
What is the connector configuration?
{
  "database.dbname": "mydb",
  "database.hostname": "<HOSTNAME>",
  "database.password": "<PASSWORD>",
  "database.port": "5432",
  "database.user": "postgresadmin",
  "heartbeat.interval.ms": "0",
  "name": "myconnector",
  "publication.name": "dbz_publication",
  "slot.name": "debezium",
  "snapshot.mode": "initial",
  "table.include.list": "public.entry,public.event,public.logbook",
  "tasks.max": "1",
  "topic.prefix": "mydb-prod"
}
What is the captured database version and mode of deployment?
On-premise Postgres v16
What behavior do you expect?
Expected the connector to resume normally after a restart.
What behavior do you see?
The connector failed with the following exception:
io.debezium.DebeziumException: The connector is trying to read change stream starting at PostgresOffsetContext [sourceInfoSchema=Schema{io.debezium.connector.postgresql.Source:STRUCT}, sourceInfo=source_info[server='mydb-prod'db='mydb', lsn=LSN{5/2B1F7590}, txId=81993951, messageType=INSERT, lastCommitLsn=LSN{5/2B1F7590}, timestamp=2025-10-07T05:45:39.078148Z, snapshot=FALSE, schema=, table=], lastSnapshotRecord=false, lastCompletelyProcessedLsn=LSN{5/2B2E2560}, lastCommitLsn=LSN{5/2B1F7590}, streamingStoppingLsn=null, transactionContext=TransactionContext [currentTransactionId=null, perTableEventCount={}, totalEventCount=0], incrementalSnapshotContext=IncrementalSnapshotContext [windowOpened=false, chunkEndPosition=null, dataCollectionsToSnapshot=[], lastEventKeySent=null, maximumKey=null]], but this is no longer available on the server. Reconfigure the connector to use a snapshot mode when needed.
    at io.debezium.connector.common.BaseSourceTask.validateSchemaHistory(BaseSourceTask.java:157)
Do you see the same behaviour using the latest released Debezium version?
(Ideally, also verify with latest Alpha/Beta/CR version)
I checked the latest version, and the same code is still present, so the bug exists there as well.
Do you have the connector logs, ideally from start till finish?
(You might be asked later to provide DEBUG/TRACE level log)
// Stored offsets for the connector
[
  {
    "partition": { "server": "mydb-prod" },
    "offset": {
      "lsn": 22199280992,
      "lsn_commit": 22198318480,
      "lsn_proc": 22199280992,
      "messageType": "INSERT",
      "ts_usec": 1759815939078148,
      "txId": 81993951
    }
  }
]

// Important Logs
io.debezium.connector.common.BaseSourceTask Found previous partition offset PostgresPartition [sourcePartition={server=mydb-prod}]: {lsn_proc=22199280992, messageType=INSERT, lsn_commit=22198318480, lsn=22198318480, txId=81993951, ts_usec=1759815939078148}
io.debezium.connector.postgresql.connection.PostgresConnection Slot 'debezium' has restart LSN 'LSN{5/2B2DED98}'
io.debezium.connector.postgresql.connection.PostgresConnection Obtained valid replication slot ReplicationSlot [active=false, latestFlushedLsn=LSN{5/2B2E2560}, catalogXmin=81993897]
The connector is trying to read change stream starting at PostgresOffsetContext [sourceInfoSchema=Schema{io.debezium.connector.postgresql.Source:STRUCT}, sourceInfo=source_info[server='mydb-prod'db='mydb', lsn=LSN{5/2B1F7590}, txId=81993951, messageType=INSERT, lastCommitLsn=LSN{5/2B1F7590}, timestamp=2025-10-07T05:45:39.078148Z, snapshot=FALSE, schema=, table=], lastSnapshotRecord=false, lastCompletelyProcessedLsn=LSN{5/2B2E2560}, lastCommitLsn=LSN{5/2B1F7590}, streamingStoppingLsn=null, transactionContext=TransactionContext [currentTransactionId=null, perTableEventCount={}, totalEventCount=0], incrementalSnapshotContext=IncrementalSnapshotContext [windowOpened=false, chunkEndPosition=null, dataCollectionsToSnapshot=[], lastEventKeySent=null, maximumKey=null]], but this is no longer available on the server. Reconfigure the connector to use a snapshot mode when needed.
Another issue seen from the logs:
Even though the lsn field has the same value as the lsn_proc field in the stored offsets, the value printed in the logs for lsn corresponds to lsn_commit.
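To make the decimal offsets easier to compare with the hex LSNs in the logs, here is a small conversion sketch. The class name LsnFormat and its methods are made up for this report and are not part of Debezium; it only assumes the usual PostgreSQL notation, where the high and low 32 bits are printed as hex separated by a slash.

```java
// Hypothetical helper for this report only; not a Debezium class.
final class LsnFormat {

    // Render a 64-bit LSN as "hi/lo" in uppercase hex, as PostgreSQL prints it.
    static String toPgString(long lsn) {
        return String.format("%X/%X", lsn >>> 32, lsn & 0xFFFFFFFFL);
    }

    // Parse "hi/lo" hex notation back into the 64-bit value stored in the offsets.
    static long parse(String text) {
        String[] parts = text.split("/");
        long hi = Long.parseUnsignedLong(parts[0], 16);
        long lo = Long.parseUnsignedLong(parts[1], 16);
        return (hi << 32) | lo;
    }

    public static void main(String[] args) {
        System.out.println(toPgString(22199280992L)); // stored lsn / lsn_proc -> 5/2B2E2560
        System.out.println(toPgString(22198318480L)); // stored lsn_commit     -> 5/2B1F7590
        System.out.println(parse("5/2B2DED98"));      // slot restart LSN      -> 22199266712
    }
}
```

So the stored lsn (22199280992) is 5/2B2E2560, while the lsn printed in the exception is 5/2B1F7590, which is the stored lsn_commit. The slot's restart LSN (22199266712) sits between the two.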
How to reproduce the issue using our tutorial deployment?
The reproduction depends on where the slot's restart_lsn lands. If, after the latest LSN is flushed, the slot's restart_lsn is set to a value greater than the last commit LSN stored in the offsets (lsn_commit), the issue will occur.
Proposed Fix
1. Incorrect log position validation
When comparing the stored offset LSN with the restart_lsn of the slot, we use the lsn_commit value (ref). The comparison should use the lsn_proc (lastCompletelyProcessedLsn) value instead.
T1: BEGIN: 50
T1: COMMIT: 100
T2: BEGIN: 150
<--- Connector restart with lsn_proc=170, lsn_commit=100 [Flush 170]
T2: COMMIT: 200
The restart_lsn can very well fall between LSN 100 and 150.
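A minimal sketch of why the choice of LSN matters, using the values from this report. It assumes the validation rule is roughly "the stored position is still readable when restart_lsn <= stored LSN"; the class LogPositionCheck and the method isAvailable are hypothetical and only illustrate the comparison, not the actual Debezium implementation.

```java
// Hypothetical illustration; the real check lives in the Debezium connector code.
final class LogPositionCheck {

    // Assumed rule: the stored position is readable if the slot's restart_lsn
    // is not ahead of it.
    static boolean isAvailable(long restartLsn, long storedLsn) {
        return Long.compareUnsigned(restartLsn, storedLsn) <= 0;
    }

    public static void main(String[] args) {
        long restartLsn = 22199266712L; // 5/2B2DED98, from the slot
        long lsnCommit  = 22198318480L; // 5/2B1F7590, stored lsn_commit
        long lsnProc    = 22199280992L; // 5/2B2E2560, stored lsn_proc

        // Validating against lsn_commit rejects the offset (prints false).
        System.out.println("lsn_commit: " + isAvailable(restartLsn, lsnCommit));
        // Validating against lsn_proc accepts it (prints true).
        System.out.println("lsn_proc:   " + isAvailable(restartLsn, lsnProc));
    }
}
```

With the numbers from our connector, the check fails only because lsn_commit is used; the same check against lsn_proc passes, which matches what we saw after the temporary mitigation.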
To temporarily mitigate the issue, the connector offsets were updated by setting the lsn_commit value to match the lsn_proc, followed by a connector restart. After this change, the connector resumed normal operation without further errors.
2. Incorrect value being assigned to lsn field
We assign the value of lsn_commit to the lsn field each time the PostgresOffsetContext is initialized. (ref1 and ref2)
We can call sourceInfo.updateLastCommit before calling sourceInfo.update to set the correct value for the lsn field.
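To illustrate the ordering problem, here is a toy model. ToySourceInfo is a stand-in written for this report, not Debezium's SourceInfo; it assumes, as described above, that updateLastCommit also overwrites the lsn field, so whichever call runs last decides the final lsn value.

```java
// Toy stand-in for illustration only; field and method behavior are assumptions
// based on the description in this report, not the actual Debezium source.
final class ToySourceInfo {
    long lsn;
    long lastCommitLsn;

    // Stand-in for sourceInfo.update: records the position of the last event.
    ToySourceInfo update(long eventLsn) {
        this.lsn = eventLsn;
        return this;
    }

    // Stand-in for sourceInfo.updateLastCommit: ASSUMED to overwrite lsn as well.
    ToySourceInfo updateLastCommit(long commitLsn) {
        this.lastCommitLsn = commitLsn;
        this.lsn = commitLsn;
        return this;
    }

    public static void main(String[] args) {
        long storedLsn = 22199280992L;    // "lsn" from the stored offsets
        long storedCommit = 22198318480L; // "lsn_commit" from the stored offsets

        // Order as reported: update first, then updateLastCommit -> lsn ends up as lsn_commit.
        ToySourceInfo current = new ToySourceInfo().update(storedLsn).updateLastCommit(storedCommit);
        System.out.println("current order lsn:  " + current.lsn);

        // Proposed order: updateLastCommit first, then update -> lsn keeps the stored value.
        ToySourceInfo proposed = new ToySourceInfo().updateLastCommit(storedCommit).update(storedLsn);
        System.out.println("proposed order lsn: " + proposed.lsn);
    }
}
```

In this model the proposed order leaves lastCommitLsn unchanged and only fixes the value that ends up in lsn.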