Debezium / DBZ-9338

Events may be mistakenly processed multiple times using multiple tasks


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 3.2.1.Final, 3.3.0.Alpha2
    • Affects Version/s: 3.0.9.Final, 3.1.3.Final, 3.2.0.Final, 3.3.0.Alpha1
    • Component/s: jdbc-connector
    • None

      When the JDBC sink is configured with multiple tasks consuming from a topic with multiple partitions, events may be duplicated.

      The partition-open-close-flushes.log attachment shows the same partitions being requested to open on multiple threads. Later, when the first flush operation is performed on task 4, no offsets are flushed, even though tasks 2 and 3 had already received events.

      This suggests that the partition and offset management introduced in DBZ-7946 does not behave correctly when multiple tasks are used.

      The main problem is that the JDBC sink does not accurately track the OffsetAndMetadata for each TopicPartition. If a partition is assigned to a task but never receives any changes, and a commit request is then observed, the flush writes no offset details. This effectively resets the position, and the events are redelivered.
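
      To illustrate the kind of per-partition bookkeeping described above, here is a minimal, dependency-free sketch. It is not the connector's code: Partition and Tracker are hypothetical names standing in for Kafka's TopicPartition and whatever structure the sink uses internally. It assumes the intended behaviour is that a tracked position survives across commit requests and that an assigned but idle partition is left out of the commit map rather than reported with an empty or reset position.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class OffsetTrackingSketch {

    // Hypothetical stand-in for Kafka's TopicPartition, so the sketch needs no dependencies.
    record Partition(String topic, int partition) {}

    // Hypothetical per-task tracker: remembers the next offset to commit for each partition.
    static final class Tracker {
        private final Set<Partition> assigned = new HashSet<>();
        private final Map<Partition, Long> nextOffsets = new HashMap<>();

        // Called when partitions are assigned to this task.
        void open(Collection<Partition> partitions) {
            assigned.addAll(partitions);
        }

        // Called when partitions are revoked; their tracked positions are dropped.
        void close(Collection<Partition> partitions) {
            assigned.removeAll(partitions);
            nextOffsets.keySet().removeAll(partitions);
        }

        // Record that the event at `offset` was written; the committable position is offset + 1.
        void processed(Partition p, long offset) {
            if (!assigned.contains(p)) {
                throw new IllegalStateException("Partition not assigned to this task: " + p);
            }
            nextOffsets.merge(p, offset + 1, Math::max);
        }

        // Positions to report on a commit request. Idle partitions are simply absent,
        // and positions are kept (not cleared) so a later request reports them again.
        Map<Partition, Long> offsetsToCommit() {
            return Map.copyOf(nextOffsets);
        }
    }

    public static void main(String[] args) {
        Partition p0 = new Partition("orders", 0);
        Partition p1 = new Partition("orders", 1);

        Tracker tracker = new Tracker();
        tracker.open(List.of(p0, p1));

        // Only partition 0 receives events; partition 1 stays idle.
        tracker.processed(p0, 0);
        tracker.processed(p0, 1);
        tracker.processed(p0, 2);

        System.out.println("first commit request:  " + tracker.offsetsToCommit());
        System.out.println("second commit request: " + tracker.offsetsToCommit());
    }
}
```

      In this sketch, both commit requests report position 3 for partition 0 and say nothing about partition 1, so an idle partition never causes a position to be overwritten.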

              Assignee: Chris Cranford (ccranfor@redhat.com)
              Reporter: Chris Cranford (ccranfor@redhat.com)
