Debezium / DBZ-1182

issues with heartbeat topic and wal_log growth


    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Blocker
    • None
    • 0.9.2.Final
    • postgresql-connector
    • None
    • Steps to reproduce:
      1. Start Debezium.
      2. Create events in a database other than the one containing the whitelisted tables.
        1. pg_wal grows and does not shrink.
      3. Create an event in the same database, but against a non-whitelisted table.
        1. pg_wal does not shrink.
      4. Create an event against a whitelisted table.
        1. pg_wal shrinks.

      We seem to be having issues with the heartbeat topic and WAL growth. We are using the Debezium PostgreSQL connector 0.9.2.Final with PostgreSQL 11. Here are the details of the scenario.

      Before we start Debezium:

      • the pg_wal directory is 8.1G, just above the max_wal_size of 8G that we have set

      We start Debezium and it successfully takes a snapshot.

      • the pg_wal directory is still 8.1G
      • a replication slot now exists
      • there is a trace message about the heartbeat topic:
         [2019-03-12 16:03:35,072] TRACE Successfully produced messages to __debezium-heartbeat.dbz-0 with base offset 0. (org.apache.kafka.clients.producer.internals.ProducerBatch:190) 

      We then run pgbench twice:

       pgbench -i -s 1000 example 

      against the example database, which is separate from the database containing our whitelisted tables. After the two runs:

      • pg_wal dir has grown to 23G
      • the LSN values for our replication slot have NOT been updated
      • there is no heartbeat trace message
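
      To put a number on how much WAL the slot is holding back at this point, the slot's LSN can be compared with the server's current WAL position. The following is a minimal sketch, assuming the two LSN strings are read by hand from pg_current_wal_lsn() and from the restart_lsn column of pg_replication_slots. The sample values are hypothetical and not taken from this report, and the script does not connect to a database.

```python
# Sketch: estimate how much WAL a replication slot is holding back.
# Assumption: the LSN strings were obtained manually, e.g. with
#   SELECT pg_current_wal_lsn();
#   SELECT slot_name, restart_lsn, confirmed_flush_lsn FROM pg_replication_slots;


def lsn_to_int(lsn: str) -> int:
    """Convert a PostgreSQL LSN such as '5/C0000000' to a byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def retained_wal_bytes(current_lsn: str, restart_lsn: str) -> int:
    """Bytes of WAL between the slot's restart_lsn and the current position."""
    return lsn_to_int(current_lsn) - lsn_to_int(restart_lsn)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    current = "5/C0000000"
    restart = "2/0"
    gib = retained_wal_bytes(current, restart) / 2**30
    print(f"{gib:.1f} GiB of WAL retained by the slot")
```

      The same difference can be computed on the server with pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn). The Python version is only convenient when comparing values copied out of several psql sessions.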

      We then update a table in the same database as our whitelisted tables, but the table itself is not whitelisted.

      • pg_wal dir is still 23G
      • the LSN values for our replication slot HAVE been updated
      • there is a trace message about the heartbeat topic :
         [2019-03-12 16:12:25,958] TRACE Successfully produced messages to __debezium-heartbeat.dbz-0 with base offset 1. (org.apache.kafka.clients.producer.internals.ProducerBatch:190) 

      We then update a whitelisted table.

      • pg_wal decreases to 7.6G
      • the LSN values for our replication slot HAVE been updated
      • there is a trace message about the heartbeat topic :
         [2019-03-12 16:14:48,455] TRACE Successfully produced messages to __debezium-heartbeat.dbz-0 with base offset 2. (org.apache.kafka.clients.producer.internals.ProducerBatch:190) 
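
      The spacing between the three heartbeat messages can be checked directly from the trace lines quoted above. A minimal sketch, assuming the log format is exactly as shown in this report; the regular expression and function names are illustrative only.

```python
import re
from datetime import datetime

# The three TRACE lines quoted in this report.
LOG = """\
[2019-03-12 16:03:35,072] TRACE Successfully produced messages to __debezium-heartbeat.dbz-0 with base offset 0. (org.apache.kafka.clients.producer.internals.ProducerBatch:190)
[2019-03-12 16:12:25,958] TRACE Successfully produced messages to __debezium-heartbeat.dbz-0 with base offset 1. (org.apache.kafka.clients.producer.internals.ProducerBatch:190)
[2019-03-12 16:14:48,455] TRACE Successfully produced messages to __debezium-heartbeat.dbz-0 with base offset 2. (org.apache.kafka.clients.producer.internals.ProducerBatch:190)
"""

LINE_RE = re.compile(
    r"\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] TRACE "
    r"Successfully produced messages to (?P<target>\S+) "
    r"with base offset (?P<offset>\d+)\."
)


def heartbeat_events(text):
    """Return (timestamp, base_offset) for every heartbeat TRACE line."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if m and m.group("target").startswith("__debezium-heartbeat"):
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
            events.append((ts, int(m.group("offset"))))
    return events


def gaps_seconds(events):
    """Seconds elapsed between consecutive heartbeat messages."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(events, events[1:])]


if __name__ == "__main__":
    for gap in gaps_seconds(heartbeat_events(LOG)):
        print(f"{gap:.3f} s between heartbeats")
```

      For the lines above, the gaps come out at roughly 531 s and 142 s, which is what the remark below about the heartbeat interval refers to.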

      My observations:

      • An event in a separate database on the same server appears to create WAL entries that are NOT reflected in the replication slot's LSNs and are never flushed to Kafka, so the WAL is retained.
      • An event in the same database but against a non-whitelisted table appears to create WAL entries that ARE reflected in the replication slot's LSNs but are never flushed to Kafka, so the WAL is still retained.
      • ONLY an event against a whitelisted table appears to create WAL entries that ARE reflected in the replication slot's LSNs and ARE flushed to Kafka.
      • The heartbeat appears to fire only on events in the same database as the whitelisted tables and doesn't seem to strictly obey the interval set in the properties. It also seems to update only the replication slot's LSNs without actually flushing to Kafka.
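
      For reference, the heartbeat settings discussed here live in the connector configuration. The sketch below builds a registration payload as a Python dict. The property names are the connector's documented ones, but every value (host, credentials, database, server name, table list, interval) is a placeholder and not the configuration used in this report.

```python
import json

# Hypothetical connector registration payload; all values are placeholders.
connector = {
    "name": "example-postgres-connector",
    "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "localhost",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "postgres",
        "database.dbname": "appdb",
        "database.server.name": "example",
        "table.whitelist": "public.orders",
        # Heartbeat settings: interval in milliseconds and topic prefix.
        "heartbeat.interval.ms": "10000",
        "heartbeat.topics.prefix": "__debezium-heartbeat",
    },
}


def heartbeat_topic(config):
    """Heartbeat topic name: <heartbeat.topics.prefix>.<database.server.name>."""
    return f'{config["heartbeat.topics.prefix"]}.{config["database.server.name"]}'


if __name__ == "__main__":
    print(json.dumps(connector, indent=2))
    print("heartbeat topic:", heartbeat_topic(connector["config"]))
```

      The printed JSON is the kind of document that would be posted to the Kafka Connect REST API to register the connector.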

      Do you have any thoughts about the above? Is what I describe expected behavior? Please let me know if you have any questions or need additional information.

              Assignee: Unassigned
              Reporter: David Jerome (dajerome, Inactive)
              Votes: 0
              Watchers: 6
