  1. Debezium
  2. DBZ-6051

Incremental snapshot sends the events from signalling DB to Kafka


    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • 2.2.0.Alpha2
    • None
    • mongodb-connector
    • None
    • False
    • None
    • False
    • Important

      Bug report

      For bug reports, provide this information, please:

      What Debezium connector do you use and what version?

      The latest code from the master branch, pulled a few days ago.

      What is the connector configuration?


      signal.data.collection: "debezium.a_signaling"
      database.include.list: "api,debezium"
      collection.include.list: "api.a,debezium.a_signaling"
      snapshot.mode: "never" 


      What is the captured database version and mode of deployment?

      MongoDB Atlas version 4.2

      What behavior do you expect?

      The signaling collection (or database) should be monitored separately from the other events in the database.

      What behavior do you see?

      For signaling to work, I have to add the signal database to `database.include.list`; otherwise it does not work.

      But after adding the database to the list, the connector starts picking up the CDC events from that database (all collections) and syncing them to Kafka. The events are mostly window-open and window-close documents.

      This creates a lot of noise and unnecessary traffic on our topics, and it also affects the `TotalNumberOfEventsSeen` streaming metric. The total appears to count events from both the a and a_signaling collections.

      To minimize the noise, I also added the signaling collection to `collection.include.list`, hoping that the connector would then monitor CDC events from only this one collection instead of the entire debezium database.
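
A minimal sketch of why the signaling collection ends up in the stream with the configuration above. It assumes the include lists are comma-separated regular expressions that must match the whole database name or the whole `database.collection` namespace. The helper names `compile_list` and `is_captured` are hypothetical, and this is a simplified model of the filtering, not Debezium's actual implementation.

```python
import re


def compile_list(value):
    """Split a comma-separated include list into compiled regexes."""
    return [re.compile(part.strip()) for part in value.split(",") if part.strip()]


def is_captured(namespace, database_include, collection_include):
    """Return True if a 'database.collection' namespace passes both include lists.

    Hypothetical model: every pattern must match the whole name.
    """
    database = namespace.split(".", 1)[0]
    database_ok = any(p.fullmatch(database) for p in database_include)
    collection_ok = any(p.fullmatch(namespace) for p in collection_include)
    return database_ok and collection_ok


# Values taken from the connector configuration in this report.
database_include = compile_list("api,debezium")
collection_include = compile_list("api.a,debezium.a_signaling")

for ns in ("api.a", "api.b", "debezium.a_signaling", "debezium.other"):
    print(ns, is_captured(ns, database_include, collection_include))
```

Under this model the signaling collection passes the same filter as the data collection `api.a`, which is consistent with its window-open and window-close documents being streamed and counted like any other change event.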

      Do you see the same behavior using the latest released Debezium version?


      Do you have the connector logs, ideally from start to finish?

      I don't see any logs that might help with this issue. Please let me know if I should share them.

      How to reproduce the issue using our tutorial deployment?

      Run a build of Debezium from the master branch with the above configuration. Create a separate signaling database and trigger an incremental snapshot by inserting a new execute-snapshot request. The events from the signaling collection are then also sent to Kafka, and the CDC metrics no longer report a reasonable number that indicates the traffic in your database.
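
For the reproduction step, a rough sketch of what the execute-snapshot request document might look like. The field names (`type`, `data`, `data-collections`) follow my reading of the Debezium MongoDB documentation and should be checked against the version in use. The helper `build_execute_snapshot_signal` is hypothetical, and the sketch only builds the document; inserting it into `debezium.a_signaling` with a MongoDB client is left out.

```python
import json


def build_execute_snapshot_signal(collections):
    """Build an incremental-snapshot signal document (assumed field layout)."""
    return {
        "type": "execute-snapshot",
        "data": {
            # Fully qualified 'database.collection' names to snapshot.
            "data-collections": list(collections),
            "type": "incremental",
        },
    }


# 'api.a' is the data collection from the configuration in this report.
signal = build_execute_snapshot_signal(["api.a"])
print(json.dumps(signal, indent=2))
```

Inserting such a document into the signaling collection is itself a change in the `debezium` database, so with that database in `database.include.list` the insert would be visible to the connector like any other event.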

            jcechace@redhat.com Jakub Čecháček
            thr.firoozian@gmail.com Tahereh Firoozian (Inactive)