  1. Debezium
  2. DBZ-2911

Provide LSN coordinates as standardized sequence field

    Description

      Over at Materialize we recently realized that the LSN information provided in the PostgreSQL record metadata is insufficient to perform O(1) deduplication: https://github.com/MaterializeInc/materialize/issues/5262

      With the MySQL connector, binlog offsets are monotonically increasing per file, so we can simply track the highest offset we've seen in each file. If that offset ever goes backwards, we know we're looking at a duplicate.
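The MySQL approach described above can be sketched roughly as follows. This is only an illustration, not Materialize's or Debezium's actual code: `make_binlog_deduper` and `is_duplicate` are hypothetical names, and the sketch assumes every record has a distinct offset within its binlog file, so an offset equal to the stored one is also treated as a replay. A real consumer would likely need a tie-breaker for records that share an offset.

```python
def make_binlog_deduper():
    """Return a checker that keeps one high-water mark per binlog file."""
    highest = {}  # binlog file name -> highest offset seen so far

    def is_duplicate(file, pos):
        last = highest.get(file)
        # Assumption: offsets are unique per record within a file, so an
        # offset at or below the high-water mark means a redelivery.
        if last is not None and pos <= last:
            return True
        highest[file] = pos
        return False

    return is_duplicate


is_dup = make_binlog_deduper()
print(is_dup("binlog.000001", 100))  # first record in the file
print(is_dup("binlog.000001", 200))  # offset moved forward
print(is_dup("binlog.000001", 150))  # offset went backwards: a duplicate
```

Each check is a single dictionary lookup and comparison, which is the O(1) property we are after.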

      Unfortunately, as we recently learned, the `lsn` field in the PostgreSQL records can go backwards in the face of interleaved transactions. Only the LSNs of transaction commit records are guaranteed to increase monotonically.
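To make the problem concrete, here is a toy simulation with made-up LSN values. It assumes that changes are delivered grouped by transaction, in commit order, which is how we understand logical decoding to behave. The names `Change`, `wal`, `commits`, and `decode_in_commit_order` are all invented for this illustration.

```python
from collections import namedtuple

Change = namedtuple("Change", ["txn", "lsn"])

# Two transactions whose changes interleave in the write-ahead log.
wal = [Change("A", 100), Change("B", 110), Change("B", 120), Change("A", 130)]
# B commits before A, even though A wrote its first change earlier.
commits = {"B": 140, "A": 150}


def decode_in_commit_order(wal, commits):
    """Emit (lsn, commit_lsn) pairs, one whole transaction at a time."""
    out = []
    for txn in sorted(commits, key=commits.get):
        for change in wal:
            if change.txn == txn:
                out.append((change.lsn, commits[txn]))
    return out


stream = decode_in_commit_order(wal, commits)
lsns = [lsn for lsn, _ in stream]
commit_lsns = [commit for _, commit in stream]
print(lsns)         # [110, 120, 100, 130]
print(commit_lsns)  # [140, 140, 150, 150]
```

The per-record `lsn` drops from 120 to 100 when transaction A's changes arrive, so a simple high-water mark on `lsn` would wrongly flag A's first change as a duplicate. The commit LSN never decreases.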

      Would you folks be open to also including a `txn_final_lsn` field in the record metadata? This field would expose the "final LSN" of the transaction with which the record is associated, which is exposed by the logical streaming replication protocol: https://www.postgresql.org/docs/10/protocol-logicalrep-message-formats.html.
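If such a field existed, deduplication on our side might look something like the sketch below. `txn_final_lsn` is the field proposed in this issue and does not exist in the record metadata today. `SequenceDeduper` is a hypothetical name, the LSN values are made up, and the sketch assumes that changes within a single transaction arrive with strictly increasing `lsn` values.

```python
class SequenceDeduper:
    """Track one high-water mark over (txn_final_lsn, lsn) pairs."""

    def __init__(self):
        self.high_water = None

    def is_duplicate(self, txn_final_lsn, lsn):
        # Tuples compare lexicographically: first by the transaction's
        # final LSN, then by the change's own LSN within the transaction.
        key = (txn_final_lsn, lsn)
        if self.high_water is not None and key <= self.high_water:
            return True
        self.high_water = key
        return False


deduper = SequenceDeduper()
# (txn_final_lsn, lsn) pairs for two interleaved transactions,
# followed by a redelivery of an earlier record.
records = [(140, 110), (140, 120), (150, 100), (150, 130), (140, 120)]
results = [deduper.is_duplicate(final, lsn) for final, lsn in records]
print(results)  # [False, False, False, False, True]
```

The record with `lsn` 100 is accepted because its transaction's final LSN (150) is higher than anything seen before, while the replayed record (140, 120) falls below the high-water mark and is dropped.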

       

      Or, is there some other metadata we could use to perform O(1) dedupe instead?

          People

            Assignee: Unassigned
            Reporter: Nikhil Benesch (benesch, Inactive)
            Votes: 0
            Watchers: 10
