The Debezium Postgres connector can run into issues with WAL growth due to how the WAL works with multiple logical databases as documented in this ticket. The work around is described in the documentation here. The work around can be somewhat tricky to manage and it is not a particularly good developer experience. The PG connector should be updated to write back physical changes to a heartbeat table so that the WAL file is pushed forward even when the logical database being streamed is inactive.
It has been suggested that the most logical place for the write back to happen is in the PG connector heartbeat code. I would be happy to work on this feature but I have a few questions about the implementation.
1. Is it possible to use the underlying PG connection that the replication streaming processes uses, or will a new connection need to be made / maintained for the writes?
2. What kind of schema should this heartbeat table have? My company has a table like this:
CREATE TABLE public.debezium_heartbeat ( last_heartbeat_ts TIMESTAMPTZ DEFAULT NOW() PRIMARY KEY ); INSERT INTO public.debezium_heartbeat (last_heartbeat_ts) VALUES (NOW());
and then periodically updates like this
update debezium_heartbeat set last_heartbeat_ts = now();
3. Should the connector be capable of making the table if it does not exist?
4. Should the configuration just be two options added to the heartbeat section like
heartbeat.writeback.table = "public.debezium_heartbeat" // default nulll heartbeat.writeback.enabled = true // default false
and then reuse "heartbeat.interval.ms" for the write back interval?
Any suggestions / feedback is appreciated.