Debezium / DBZ-4983

Unsupported MySQL Charsets during Snapshotting for fields with custom converter


      Set up a MySQL Debezium connector on a table whose default charset is `utf8mb4`, and add a VARCHAR column to this table.
      Next, register a custom converter for this column and let the connector take a snapshot.


      Bug report

      When snapshotting a MySQL table with text columns that have a custom converter registered, the connector is unable to convert the string to bytes correctly. At `AbstractMysqlFieldReader:74` it uses the MySQL charset name directly (e.g. `utf8mb4`) instead of converting it to a charset supported by Java. This always causes an `UnsupportedEncodingException` to be thrown and caught.

       

      This logic is wrapped in a try/catch, so it does not cause any operational problems, but it does come with a huge performance penalty. In my local testing I saw a 10x increase in the number of rows processed in the same amount of time when the proper charset is used for encoding.
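
      To illustrate the mismatch, here is a minimal standalone sketch. It is not the Debezium code: the class `CharsetMappingSketch`, its methods, and the small mapping table are hypothetical and only assume that the reader passes the MySQL charset name to something like `String.getBytes(String)`. It shows that Java rejects `utf8mb4` as a charset name, and that translating the name to a Java `Charset` first avoids the exception.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Map;

class CharsetMappingSketch {

    // Hypothetical, deliberately partial mapping from MySQL charset names to Java charsets.
    private static final Map<String, Charset> MYSQL_TO_JAVA = Map.of(
            "utf8mb4", StandardCharsets.UTF_8,
            "utf8mb3", StandardCharsets.UTF_8,
            "utf8", StandardCharsets.UTF_8,
            "ascii", StandardCharsets.US_ASCII);

    // Returns the Java charset for a MySQL charset name, or null if this sketch does not know it.
    static Charset toJavaCharset(String mysqlCharset) {
        return MYSQL_TO_JAVA.get(mysqlCharset.toLowerCase(Locale.ROOT));
    }

    // Mimics the problematic pattern: the MySQL name goes straight into String.getBytes(String).
    // Returns false when Java does not recognise the name and throws.
    static boolean encodesWithRawName(String value, String mysqlCharset) {
        try {
            value.getBytes(mysqlCharset);
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String value = "h\u00e9llo";

        // Using the MySQL name directly: an exception is thrown and swallowed for every value.
        System.out.println("raw name accepted: " + encodesWithRawName(value, "utf8mb4"));

        // Translating the name first: no exception, the bytes are produced directly.
        byte[] bytes = value.getBytes(toJavaCharset("utf8mb4"));
        System.out.println("encoded length: " + bytes.length);
    }
}
```

      Because the translation can be done once per column rather than once per row, the per-row cost of throwing and catching an exception goes away, which fits the speed-up reported above.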

            Assignee: Unassigned
            Reporter: Lars Werkman (larswerkman) (Inactive)