Debezium / DBZ-7026

Deserialization Errors - 200k insert rows

    • Type: Bug
    • Resolution: Duplicate
    • Priority: Major
    • Component: mysql-connector

      What Debezium connector do you use and what version?

      Debezium Server 2.5.0-SNAPSHOT.

      What is the connector configuration?

      DEBEZIUM_SINK_TYPE: kinesis
      DEBEZIUM_SINK_KINESIS_REGION: awsregion
      DEBEZIUM_SOURCE_CONNECTOR_CLASS: io.debezium.connector.mysql.MySqlConnector
      DEBEZIUM_SOURCE_OFFSET_STORAGE_FILE_FILENAME: data/offsets.dat
      DEBEZIUM_SOURCE_OFFSET_FLUSH_INTERVAL_MS: 60000
      DEBEZIUM_SOURCE_OFFSET_FLUSH_TIMEOUT_MS: 10000
      DEBEZIUM_SOURCE_MAX_REQUEST_SIZE: 10485760
      DEBEZIUM_SOURCE_MAX_QUEUE_SIZE: 81290
      DEBEZIUM_SOURCE_MAX_BATCH_SIZE: 20480
      DEBEZIUM_SOURCE_SNAPSHOT_MODE: schema_only
      DEBEZIUM_SOURCE_SNAPSHOT_LOCKING_MODE: none
      DEBEZIUM_SOURCE_DECIMAL_HANDLING_MODE: double
      DEBEZIUM_SOURCE_DATABASE_INCLUDE_LIST: dbname
      DEBEZIUM_SOURCE_TOPIC_PREFIX: dbz
      DEBEZIUM_SOURCE_SCHEMA_HISTORY_INTERNAL: io.debezium.storage.file.history.FileSchemaHistory
      DEBEZIUM_SOURCE_SCHEMA_HISTORY_INTERNAL_FILE_FILENAME: data/schema_history.dat
      DEBEZIUM_SOURCE_SCHEMA_HISTORY_INTERNAL_STORE_ONLY_CAPTURED_DATABASES_DDL: True
      DEBEZIUM_SOURCE_EVENT_PROCESSING_FAILURE_HANDLING_MODE: warn
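For readers who run Debezium Server from a properties file rather than environment variables, the same configuration can be sketched as below (abridged). Debezium Server maps an environment variable such as DEBEZIUM_SOURCE_MAX_BATCH_SIZE to the property debezium.source.max.batch.size; the env-var form above remains the authoritative copy of this report's settings.

```properties
# Abridged equivalent of the env-var configuration above
debezium.sink.type=kinesis
debezium.sink.kinesis.region=awsregion
debezium.source.connector.class=io.debezium.connector.mysql.MySqlConnector
debezium.source.offset.storage.file.filename=data/offsets.dat
debezium.source.max.batch.size=20480
debezium.source.snapshot.mode=schema_only
debezium.source.schema.history.internal=io.debezium.storage.file.history.FileSchemaHistory
debezium.source.event.processing.failure.handling.mode=warn
```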

      What is the captured database version and mode of deployment?

      (E.g. on-premises, with a specific cloud provider, etc.)

      AWS RDS - MariaDB 10.4.26

      What behaviour do you expect?

      Change events should be streamed without deserialization errors.

      What behaviour do you see?

      I'm running Debezium Server (2.5.0) on AWS to CDC my database (mariadb-10.4.26) to the data lake, but I'm getting a lot of deserialization errors. I investigated and found that the errors always occur with DML statements that affect a huge number of rows, around 100-200k. I tried increasing the request size, queue size, and batch size, but without success.


      {
        "timestamp": "2023-10-10T14:13:10.084Z",
        "sequence": 3594,
        "loggerClassName": "org.slf4j.impl.Slf4jLogger",
        "loggerName": "io.debezium.connector.mysql.MySqlStreamingChangeEventSource",
        "level": "WARN",
        "message": "A deserialization failure event arrived",
        "threadName": "blc-XXXXXX3306",
        "threadId": 141,
        "mdc": { "dbz.taskId": "0", "dbz.connectorName": "dbz", "dbz.connectorType": "MySQL", "dbz.connectorContext": "binlog" },
        "ndc": "",
        "hostName": "XXXXXX",
        "processName": "io.debezium.server.Main",
        "processId": 1,
        "exception": {
          "refId": 1,
          "exceptionType": "com.github.shyiko.mysql.binlog.event.deserialization.EventDataDeserializationException",
          "message": "Failed to deserialize data of EventHeaderV4{timestamp=1696946983000, eventType=WRITE_ROWS, serverId=592457888, headerLength=19, dataLength=8194, nextPosition=625444, flags=0}",
          "frames": [
            { "class": "com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer", "method": "deserializeEventData", "line": 343 },
            { "class": "com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer", "method": "nextEvent", "line": 246 },
            { "class": "io.debezium.connector.mysql.MySqlStreamingChangeEventSource$1", "method": "nextEvent", "line": 233 },
            { "class": "com.github.shyiko.mysql.binlog.BinaryLogClient", "method": "listenForEventPackets", "line": 1051 },
            { "class": "com.github.shyiko.mysql.binlog.BinaryLogClient", "method": "connect", "line": 631 },
            { "class": "com.github.shyiko.mysql.binlog.BinaryLogClient$7", "method": "run", "line": 932 },
            { "class": "java.lang.Thread", "method": "run", "line": 829 }
          ],
          "causedBy": {
            "exception": {
              "refId": 2,
              "exceptionType": "com.github.shyiko.mysql.binlog.event.deserialization.MissingTableMapEventException",
              "message": "No TableMapEventData has been found for table id:351. Usually that means that you have started reading binary log 'within the logical event group' (e.g. from WRITE_ROWS and not proceeding TABLE_MAP",
              "frames": [
                { "class": "com.github.shyiko.mysql.binlog.event.deserialization.AbstractRowsEventDataDeserializer", "method": "deserializeRow", "line": 109 },
                { "class": "com.github.shyiko.mysql.binlog.event.deserialization.WriteRowsEventDataDeserializer", "method": "deserializeRows", "line": 64 },
                { "class": "com.github.shyiko.mysql.binlog.event.deserialization.WriteRowsEventDataDeserializer", "method": "deserialize", "line": 56 },
                { "class": "com.github.shyiko.mysql.binlog.event.deserialization.WriteRowsEventDataDeserializer", "method": "deserialize", "line": 32 },
                { "class": "com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer", "method": "deserializeEventData", "line": 337 },
                { "class": "com.github.shyiko.mysql.binlog.event.deserialization.EventDeserializer", "method": "nextEvent", "line": 246 },
                { "class": "io.debezium.connector.mysql.MySqlStreamingChangeEventSource$1", "method": "nextEvent", "line": 233 },
                { "class": "com.github.shyiko.mysql.binlog.BinaryLogClient", "method": "listenForEventPackets", "line": 1051 },
                { "class": "com.github.shyiko.mysql.binlog.BinaryLogClient", "method": "connect", "line": 631 },
                { "class": "com.github.shyiko.mysql.binlog.BinaryLogClient$7", "method": "run", "line": 932 },
                { "class": "java.lang.Thread", "method": "run", "line": 829 }
              ]
            }
          }
        }
      }
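The MissingTableMapEventException in the log arises because MySQL/MariaDB row events form a logical group: a TABLE_MAP event carries the column metadata, and the WRITE_ROWS events that follow can only be decoded against it. The sketch below is an illustrative Python model of that dependency (it is not the real shyiko/Debezium code; class and method names are hypothetical), showing why a reader that encounters WRITE_ROWS without the preceding TABLE_MAP must fail:

```python
class MissingTableMapEventError(Exception):
    """Raised when a row event arrives with no cached table map (illustrative)."""


class BinlogEventGroupReader:
    """Toy model of binlog row-event decoding: row events are only decodable
    while the TABLE_MAP metadata for their table id is cached."""

    def __init__(self):
        self.table_map = {}  # table_id -> column metadata from TABLE_MAP

    def on_event(self, event_type, table_id, payload=None):
        if event_type == "TABLE_MAP":
            # Remember column metadata for the rest of this event group.
            self.table_map[table_id] = payload
            return None
        if event_type in ("WRITE_ROWS", "UPDATE_ROWS", "DELETE_ROWS"):
            meta = self.table_map.get(table_id)
            if meta is None:
                # The situation in the log above: reading started "within the
                # logical event group", after the TABLE_MAP event was missed.
                raise MissingTableMapEventError(
                    f"No TableMapEventData has been found for table id:{table_id}")
            return ("decoded", table_id, payload, meta)
        if event_type == "ROTATE":
            # Cached table maps do not carry over across binlog files.
            self.table_map.clear()
        return None
```

A reader that sees TABLE_MAP first decodes the rows; one that starts at WRITE_ROWS raises, matching the message in the stack trace.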


      {
        "timestamp": "2023-10-10T14:13:10.085Z",
        "sequence": 3595,
        "loggerClassName": "org.slf4j.impl.Slf4jLogger",
        "loggerName": "io.debezium.connector.mysql.MySqlStreamingChangeEventSource",
        "level": "WARN",
        "message": "Error during binlog processing. Last offset stored = {transaction_id=null, ts_sec=1696946983, file=mysql-bin-changelog.580781, pos=0, server_id=592457888, event=1}, binlog reader near position = mysql-bin-changelog.580781/551527",
        "threadName": "XXXXXX:3306",
        "threadId": 141,
        "mdc": { "dbz.taskId": "0", "dbz.connectorName": "dbz", "dbz.connectorType": "MySQL", "dbz.connectorContext": "binlog" },
        "ndc": "",
        "hostName": "XXXXXXXX",
        "processName": "io.debezium.server.Main",
        "processId": 1
      }


      The statement is a CREATE OR REPLACE TABLE. Reading the binlog file directly, it translates to roughly 200k inserts.

      CREATE OR REPLACE TABLE table AS
      select distinct(id)
      from example ex
      where date_start >= DATE_ADD(now(),INTERVAL -45 DAY)

      Do you see the same behaviour using the latest released Debezium version?

      (Ideally, also verify with latest Alpha/Beta/CR version)

      Yes, on all versions. I'm already using the latest code.

      Do you have the connector logs, ideally from start till finish?

      (You might be asked later to provide DEBUG/TRACE level log)

      Yes.

      How to reproduce the issue using our tutorial deployment?

      Send a CREATE OR REPLACE TABLE statement whose SELECT clause returns more than 200k rows:

      CREATE OR REPLACE TABLE table AS
      select distinct(id)
      from example ex
      where date_start >= DATE_ADD(now(),INTERVAL -45 DAY)
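To reproduce, the source `example` table first needs enough recent rows for the SELECT above to return 200k+ ids. A hypothetical helper for seeding it (table and column names taken from the report's query; batch size and row count are arbitrary assumptions) could generate the SQL like this:

```python
from datetime import date, timedelta


def seed_statements(row_count=250_000, batch_size=10_000):
    """Return batched SQL statements that populate the `example` table with
    row_count ids whose date_start falls within the last 30 days, so the
    reported 45-day SELECT matches every row. Illustrative only."""
    today = date.today()
    stmts = ["CREATE TABLE IF NOT EXISTS example (id INT PRIMARY KEY, date_start DATE);"]
    for start in range(1, row_count + 1, batch_size):
        end = min(start + batch_size, row_count + 1)
        values = ",".join(
            f"({i},'{(today - timedelta(days=i % 30)).isoformat()}')"
            for i in range(start, end))
        stmts.append(f"INSERT INTO example (id, date_start) VALUES {values};")
    return stmts
```

Feeding the generated statements to the database and then issuing the CREATE OR REPLACE TABLE above should produce a single binlog event group with 200k+ WRITE_ROWS entries.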

              Assignee: Unassigned
              Reporter: Breno Moreira (brenoavm)