Uploaded image for project: 'Debezium'
  1. Debezium
  2. DBZ-7812

Issues with Delete Event Handling in JDBC Sink with MongoDB Connector

XMLWordPrintable

    • False
    • None
    • False
    • Critical

      In order to make your issue reports as actionable as possible, please provide the following information, depending on the issue type.

      Bug report

      For bug reports, provide this information, please:

      What Debezium connector do you use and what version?

      Debezium 2.4.2.Final
      (MongoDB Source Connector && JDBC Sink Connector)

      What is the connector configuration?

      {
          "name": "mongodb-source-connector",
          "config": {
              "connector.class" : "io.debezium.connector.mongodb.MongoDbConnector",
              "tasks.max" : "1",
              "topic.prefix" : "mongo",
              "mongodb.connection.string" : "mongodb+srv://********/?ssl=false&authSource=admin",
              "mongodb.user" : "****",
              "mongodb.password" : "****",
              "capture.mode" : "change_streams_update_full",
              "database.include.list" : "****",
              "transforms": "route,unwrap,replacefield",
              "transforms.route.type" : "org.apache.kafka.connect.transforms.RegexRouter",
              "transforms.route.regex" : "([^.]+)\\.([^.]+)\\.([^.]+)",
              "transforms.route.replacement" : "$3",        
              "transforms.unwrap.type": "io.debezium.connector.mongodb.transforms.ExtractNewDocumentState",
              "transforms.unwrap.drop.tombstones":"false",
              "transforms.unwrap.delete.handling.mode":"none",        
              "transforms.replacefield.type": "org.apache.kafka.connect.transforms.ReplaceField$Value",
              "transforms.replacefield.renames": "_id:id"
          }
      } 

       

      {
          "name" : "mongodb-sink-connector",
          "config" : {
              "connector.class" : "io.debezium.connector.jdbc.JdbcSinkConnector",
              "tasks.max" : "1",
              "topics" : "****",
              "connection.url" : "jdbc:mysql://****/",
              "connection.username": "****",
              "connection.password": "****",
              "schema.evolution" : "basic",
              "insert.mode" : "upsert",
              "delete.enabled": "true",
              "primary.key.fields" : "id",
              "primary.key.mode": "record_key"
          }
      } 
      • Changed the primary key field to id for use. (This is irrelevant to the issue)

      What is the captured database version and mode of depoyment?

      (E.g. on-premises, with a specific cloud provider, etc.)

      Source Info
      on-premise
      MongoDB 4.4.17

      Sink Info
      on-premise
      Mysql 8.0.X

      What behaviour do you expect?

      When using the MongoDB ExtractNewDocumentState SMT, if "delete.handling.mode" is set to none (in the latest version "delete.tombstone.handling.mode":"tombstone"), then in the MongoDB source connector, only the value should change to null as per ExtractNewRecordState and the reference description, while the valueSchema should remain the same.

      "none"
      The SMT retains the original change event record from the event stream. The record contains only "value": "null". 

      What behaviour do you see?

      https://github.com/debezium/debezium-connector-jdbc/blob/v2.4.2.Final/src/main/java/io/debezium/connector/jdbc/JdbcChangeEventSink.java#L72-L74

      According to the above code changes, starting from DBZ 2.4, the JDBC Connector skips tombstone messages where value && valueSchema are null, making it impossible to handle delete messages generated by MongoDB ExtractNewDocumentState.

      Do you see the same behaviour using the latest relesead Debezium version?

      (Ideally, also verify with latest Alpha/Beta/CR version)

      This behavior occurs from the Debezium JDBC Sink Connector version 2.4 and above.

      Do you have the connector logs, ideally from start till finish?

      (You might be asked later to provide DEBUG/TRACE level log)

      MongoDB source connecotor log

      [2024-04-18 17:19:34,326] TRACE [andy-mongodb-source-connector8|task-0] Committing record SourceRecord{sourcePartition={rs=cluster, server_id=andy}, sourceOffset={sec=1713428367, ord=1, transaction_id=null, resume_token=826620D78F000000012B022C0100296E5A1004D72DED379106413E95C78129F4A0794D46645F696400645DB12A21842C1A4418044D670004}} ConnectRecord{topic='andy.andytest.mongosample', kafkaPartition=null, key=Struct{id={"$oid": "5db12a21842c1a4418044d67"}}, keySchema=Schema{andy.andytest.mongosample.Key:STRUCT}, value=null, valueSchema=null, timestamp=null, headers=ConnectHeaders(headers=)} (io.debezium.connector.common.BaseSourceTask:318) 

      JDBC sink connector log

      [2024-04-18 17:19:34,329] TRACE [andy-mongodb-sink-connector8|task-0] WorkerSinkTask{id=andy-mongodb-sink-connector8-0} Delivering batch of 1 messages to task (org.apache.kafka.connect.runtime.WorkerSinkTask:585)
      [2024-04-18 17:19:34,329] DEBUG [andy-mongodb-sink-connector8|task-0] Received 1 changes. (io.debezium.connector.jdbc.JdbcSinkConnectorTask:99)
      [2024-04-18 17:19:34,329] TRACE [andy-mongodb-sink-connector8|task-0] Processing SinkRecord{kafkaOffset=5, timestampType=CreateTime} ConnectRecord{topic='mongosample', kafkaPartition=0, key=Struct{id=5db12a21842c1a4418044d67}, keySchema=Schema{STRUCT}, value=null, valueSchema=null, timestamp=1713428374322, headers=ConnectHeaders(headers=)} (io.debezium.connector.jdbc.JdbcChangeEventSink:73)
      [2024-04-18 17:19:34,330] TRACE [andy-mongodb-sink-connector8|task-0] Schema type 'STRING' resolved by name from registry to type 'ConnectStringType' (io.debezium.connector.jdbc.dialect.GeneralDatabaseDialect:477)
      [2024-04-18 17:19:34,330] TRACE [andy-mongodb-sink-connector8|task-0] Field [id] with schema [STRING] (io.debezium.connector.jdbc.SinkRecordDescriptor$FieldDescriptor:201)
      [2024-04-18 17:19:34,330] TRACE [andy-mongodb-sink-connector8|task-0]     Type      : io.debezium.connector.jdbc.type.connect.ConnectStringType (io.debezium.connector.jdbc.SinkRecordDescriptor$FieldDescriptor:202)
      [2024-04-18 17:19:34,330] TRACE [andy-mongodb-sink-connector8|task-0]     Optional  : true (io.debezium.connector.jdbc.SinkRecordDescriptor$FieldDescriptor:203)
      [2024-04-18 17:19:34,330] DEBUG [andy-mongodb-sink-connector8|task-0] Skipping tombstone record io.debezium.connector.jdbc.SinkRecordDescriptor@69c0dc8 (io.debezium.connector.jdbc.JdbcChangeEventSink:91) 

      How to reproduce the issue using our tutorial deployment?

      <Your answer>

      Feature request or enhancement

      For feature requests or enhancements, provide this information, please:

      Which use case/requirement will be addressed by the proposed feature?

      The delete event is properly executed in the JDBC sink connector with the MongoDB source connector.

      Implementation ideas (optional)

      It seems that MongoDB's ExtractNewDocumentState should also follow the properties of ExtractNewRecordState, and we need to assign a valueDocument even in the case of a delete event, which should then be used to generate the finalValueSchema.

      https://github.com/debezium/debezium/blob/main/debezium-connector-mongodb/src/main/java/io/debezium/connector/mongodb/transforms/ExtractNewDocumentState.java

            Unassigned Unassigned
            hanju.lee Hanju Lee
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated: