Debezium / DBZ-2382

Support emitting TRUNCATE events in PostgreSQL pgoutput plugin


      The PostgreSQL pgoutput plugin supports TRUNCATE events, but right now we simply skip them, as we do with COMMIT/BEGIN/RELATION messages. The current skipping logic was introduced in DBZ-1576, where it was decided to support TRUNCATE events in the future by publishing them to a separate topic or through some other mechanism. I described my use case, and the possibility of supporting TRUNCATE events in the same DML publishing flow, in a comment; I repeat that description here.

      Unlike in other databases, TRUNCATE and DDL statements (excluding CREATE DATABASE) are transactional in PostgreSQL, and TRUNCATE is also supported by logical decoding from PG 11 onward. This is one of the main reasons we chose PostgreSQL over MySQL / Oracle. We have a strict data consistency requirement, so we run dynamic schema and table creation and TRUNCATE statements inside a transaction, and roll back the DDL, DML, and TRUNCATE together if there is any problem with the DML data in that transaction.

      We are using the Debezium Embedded Engine to consolidate multiple databases into a single database, with the target database tables partitioned by the tenant_id column (PostgreSQL 12 -> Debezium Embedded Engine in MicroService -> Apache Pulsar -> Pulsar Subscription MicroService -> Target PostgreSQL 12 Partitioned Tables). One limitation we are facing is the unavailability of TRUNCATE events in the Debezium Embedded Engine. As per the comments on the previous issue (DBZ-1576), publishing TRUNCATE events to a separate topic would make it difficult to maintain that strict data consistency in the target database. TRUNCATE events from a separate topic can be seen too early or too late relative to the DML events, and this results in data inconsistencies in the target database.
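
      The ordering problem can be illustrated with a small simulation. This is only a sketch and does not use any Debezium or Pulsar API: the event tuple shape, the "c" / "t" operation codes, and the `apply_events` helper are assumptions made for illustration, and a dictionary stands in for the target database.

```python
from typing import Dict, List, Tuple

# Hypothetical event shape for this sketch: (op, table, key, value).
# "c" = insert, "t" = truncate; these codes are assumptions, not a Debezium contract.
Event = Tuple[str, str, int, str]


def apply_events(events: List[Event]) -> Dict[str, Dict[int, str]]:
    """Apply events in the given order to an in-memory stand-in for the target database."""
    tables: Dict[str, Dict[int, str]] = {}
    for op, table, key, value in events:
        rows = tables.setdefault(table, {})
        if op == "t":
            rows.clear()       # TRUNCATE removes every row of the table
        elif op == "c":
            rows[key] = value  # INSERT adds one row
    return tables


# Order in the source transaction: insert row 1, truncate, insert row 2.
in_order = [("c", "orders", 1, "old"), ("t", "orders", 0, ""), ("c", "orders", 2, "new")]

# The same events, but with the truncate delivered out of band (late or early).
truncate_late = [in_order[0], in_order[2], in_order[1]]
truncate_early = [in_order[1], in_order[0], in_order[2]]

same_flow = apply_events(in_order)
late = apply_events(truncate_late)
early = apply_events(truncate_early)
```

      Only `same_flow` matches the source: it ends with row 2 alone. In `late` the new row is wiped as well, and in `early` the old row survives.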

      In line with PostgreSQL's transactional guarantees, I recommend supporting TRUNCATE events in the same DML publishing flow, so that the Embedded Engine consumer can decide how to handle and apply them in the target system with strict data consistency guarantees. Regarding backward compatibility, we can have a Debezium property which skips TRUNCATE events by default. Users can enable this property if they are sure that the consumer supports it. We could also have a Debezium property to specify the key for TRUNCATE events (if we are worried that TRUNCATE events won't have any key).
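
      The opt-in behaviour proposed above could look roughly like the following. This is a sketch under stated assumptions, not Debezium code: `ChangeEvent`, `publish`, and the `include_truncate` and `truncate_key` parameters are hypothetical names standing in for the two proposed properties.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical event type for this sketch."""
    op: str                    # "c", "u", "d", or "t" (truncate); codes are assumed
    table: str
    key: Optional[str] = None  # TRUNCATE has no natural row key


def publish(
    events: Iterable[ChangeEvent],
    include_truncate: bool = False,      # hypothetical property; skipping is the default
    truncate_key: Optional[str] = None,  # hypothetical property; key assigned to truncates
) -> Iterator[ChangeEvent]:
    """Yield events in source order, dropping truncates unless explicitly enabled."""
    for event in events:
        if event.op == "t":
            if not include_truncate:
                continue  # backward-compatible default: behave as today
            yield ChangeEvent("t", event.table, truncate_key)
        else:
            yield event


stream: List[ChangeEvent] = [
    ChangeEvent("c", "orders", "1"),
    ChangeEvent("t", "orders"),
    ChangeEvent("c", "orders", "2"),
]

default_out = list(publish(stream))
opted_in = list(publish(stream, include_truncate=True, truncate_key="orders"))
```

      With the defaults, existing consumers never see a truncate. With the flag enabled, the truncate stays at its original position between the two inserts and carries the configured key.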

            Unassigned
            krnaveen14 Naveen Kumar (Inactive)