  1. Debezium
  2. DBZ-8236

Debezium Server messages not being sent to Pub/Sub after restart

    • Important

      Bug report
      Discussed in this Zulip thread: https://debezium.zulipchat.com/#narrow/stream/350571-community-dbz-server/topic/Debezium.20dropping.20messages.20sent.20to.20Pub.2FSub

      It appears that when certain errors are encountered and Debezium Server is restarted several times in a row, the offset is advanced even though the sink reported errors, and the affected messages are then never sent to Pub/Sub.

      In production, I have previously seen this happen when the Pub/Sub UNAVAILABLE error is encountered. In these cases, our data engineers restart the connector multiple times, because Google lists this as a transient condition: https://cloud.google.com/pubsub/docs/reference/error-codes

      What Debezium connector do you use and what version?
      My tests were done using Debezium 2.6.2, but we have observed this behaviour across different versions.

      What is the connector configuration?
      Config is attached

      What is the captured database version and mode of deployment?
      My test instance uses MySQL as the source, but we have also observed this issue when using the Oracle connector with Oracle 19c (all on-prem).

      What behavior do you expect?

      If a message fails to send to Pub/Sub, Debezium should not advance its low watermark (the committed offset) past that batch, so that no messages are lost.
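
To illustrate the expected behaviour, here is a minimal, self-contained sketch in plain Java. It is not Debezium's actual sink code: the class `OffsetSafeRelay`, its methods, and the in-memory "source" are all hypothetical names invented for this example. The only point it makes is that the committed offset moves forward after every message in a batch has been acknowledged, and stays put otherwise, so repeated failed runs cannot skip data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of a relay between a change log and a sink.
// Not Debezium code; it only illustrates the expected offset handling.
class OffsetSafeRelay {
    private final List<String> source; // stands in for the database change log
    private int committed = 0;         // count of source records safely delivered

    OffsetSafeRelay(List<String> source) {
        this.source = source;
    }

    // Tries to deliver the next batch. `publish` returns true when the sink
    // acknowledges a message. The offset advances only if the whole batch
    // succeeded; on any failure it is left untouched so the batch is replayed.
    // A partially sent batch may therefore be re-sent (at-least-once delivery).
    boolean relayBatch(int batchSize, Predicate<String> publish) {
        int end = Math.min(committed + batchSize, source.size());
        for (int i = committed; i < end; i++) {
            if (!publish.test(source.get(i))) {
                return false; // do NOT move the offset past a failed batch
            }
        }
        committed = end;
        return true;
    }

    int committed() {
        return committed;
    }

    public static void main(String[] args) {
        List<String> source = List.of("row-1", "row-2", "row-3", "row-4", "row-5");
        OffsetSafeRelay relay = new OffsetSafeRelay(source);

        // Three "restarts" against a sink that always fails (e.g. missing topic).
        for (int restart = 0; restart < 3; restart++) {
            relay.relayBatch(2, msg -> false);
        }
        System.out.println("committed after failures: " + relay.committed());

        // Sink is fixed: every message is accepted and recorded.
        List<String> received = new ArrayList<>();
        while (relay.committed() < source.size()) {
            relay.relayBatch(2, received::add);
        }
        System.out.println("received everything: " + received.equals(source));
    }
}
```

In this model the failed restarts leave the offset at 0, so the later successful run delivers every row. The behaviour described in this report corresponds to the offset moving forward during the failed runs.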

      What behavior do you see?

      Messages are lost after multiple restarts of Debezium Server

      Do you see the same behaviour using the latest released Debezium version?

      Not yet tested.

      Do you have the connector logs, ideally from start till finish?

      Yes, see the attached trace logs, which include all restarts.

      How to reproduce the issue

      1) I create a topic in Pub/Sub called all_topics
      2) In a local MySQL instance, I have a procedure that inserts rows into a table at a fixed interval for 5 minutes (300 rows in total)
      3) I start up Debezium Server for the first time and let everything initialize, then I stop it
      4) I start my procedure that inserts the 300 rows over 5 minutes
      5) I change the config to point to a Pub/Sub topic that hasn't been created yet (so something I know will error)
      6) When Debezium Server crashes due to the NOT_FOUND error, I restart it several times (simulating an auto-startup process)
      7) I correct the config issue and start up Debezium Server again (about 2.5 minutes into the procedure running)
      8) I check my Pub/Sub subscription and only 170 of the expected 300 messages are shown
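
The count in step 8 shows that 130 of the 300 messages are missing, but not which ones. As a rough sketch of how the gap could be pinned down, the snippet below assumes that each inserted row carries a sequential id from 1 to N and that the ids of the received messages have been collected into a list. Both are assumptions for illustration: the report does not describe the table layout, and `MissingMessageCheck` and `missingIds` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: lists which sequential row ids never reached the sink.
class MissingMessageCheck {

    // Returns every id in 1..expectedCount that is absent from receivedIds.
    static List<Integer> missingIds(int expectedCount, Collection<Integer> receivedIds) {
        Set<Integer> seen = new HashSet<>(receivedIds);
        List<Integer> missing = new ArrayList<>();
        for (int id = 1; id <= expectedCount; id++) {
            if (!seen.contains(id)) {
                missing.add(id);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Made-up sample data, not taken from the attached logs.
        List<Integer> received = List.of(1, 2, 3, 7, 8);
        System.out.println(missingIds(8, received)); // prints [4, 5, 6]
    }
}
```

If the missing ids form one contiguous block, their range could be compared with the restart timestamps in the attached log to see which run advanced the offset.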

        1. screenshot-1.png (16 kB)
        2. logfile (1).log (17.11 MB)
        3. config_ (1).txt (2 kB)

              Assignee: Unassigned
              Reporter: Nathan Smit (nathan-smit-1)