Debezium / DBZ-6043

Vitess: Support mapping the unsigned bigint MySQL column type to long

    • Enhancement
    • Resolution: Done
    • Major
    • 2.2.0.Alpha2
    • 1.9.7.Final
    • vitess-connector
    • None

      In order to make your issue reports as actionable as possible, please provide the following information, depending on the issue type.

      Bug report

      For bug reports, provide this information, please:

      What Debezium connector do you use and what version?

      Vitess Connector

      What is the connector configuration?

      Vanilla

      What is the captured database version and mode of deployment?

      (E.g. on-premises, with a specific cloud provider, etc.)

      Vitess 13 on MySQL 5.6 in AWS

      What behaviour do you expect?

      The column in MySQL/Vitess is defined as `unsigned bigint(20)`. The Vitess connector currently maps this column type to a JDBC string type, so it becomes `string` in the Avro schema published to the schema registry. However, the Debezium changelog output is integrated with Hudi for bootstrap and delta-stream ingestion, and Hudi maps this column type to `long` on the bootstrap path, which causes a schema mismatch. We would like the Vitess connector to support mapping `unsigned bigint` to `long` as a configuration option.

      What behaviour do you see?

      The column in MySQL/Vitess is defined as `unsigned bigint(20)`, and the Vitess connector currently maps this column type to a JDBC string type.

      Do you see the same behaviour using the latest released Debezium version?

      (Ideally, also verify with latest Alpha/Beta/CR version)

      Yes

      Do you have the connector logs, ideally from start till finish?

      (You might be asked later to provide DEBUG/TRACE level log)

      TBD

      How to reproduce the issue using our tutorial deployment?

      Define a column of type `unsigned bigint` on the MySQL/Vitess side and check the envelope schema in the schema registry.

      Feature request or enhancement

      For feature requests or enhancements, provide this information, please:

      Which use case/requirement will be addressed by the proposed feature?

      <Your answer>

      Implementation ideas (optional)

      In theory, unsigned bigint has a range of 0 to 18,446,744,073,709,551,615

      and long has a range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.

      However, the `long` type would be OK in practice, since these columns usually store ids, and ids don't go above 9,223,372,036,854,775,807.
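      The range argument above can be checked with a small Java sketch. This is only an illustration, not connector code: the class name `UnsignedBigintRange` and the helper `fitsInLong` are made up for this example. It shows that values up to `Long.MAX_VALUE` map to `long` without loss, while larger unsigned values wrap to negative numbers when their 64 bits are read as a signed `long`.

```java
import java.math.BigInteger;

public class UnsignedBigintRange {
    // Largest value of a 64-bit unsigned integer (MySQL BIGINT UNSIGNED): 2^64 - 1.
    static final BigInteger UNSIGNED_MAX =
            BigInteger.ONE.shiftLeft(64).subtract(BigInteger.ONE);

    // Hypothetical helper: true when a non-negative value fits in a signed 64-bit long.
    static boolean fitsInLong(BigInteger unsigned) {
        return unsigned.signum() >= 0 && unsigned.bitLength() <= 63;
    }

    public static void main(String[] args) {
        System.out.println(UNSIGNED_MAX);    // 18446744073709551615
        System.out.println(Long.MAX_VALUE);  // 9223372036854775807

        System.out.println(fitsInLong(BigInteger.valueOf(Long.MAX_VALUE))); // true
        System.out.println(fitsInLong(UNSIGNED_MAX));                       // false

        // Reading the largest unsigned value as a signed long wraps around.
        System.out.println(Long.parseUnsignedLong("18446744073709551615")); // -1
    }
}
```

      Consumers that do receive such a wrapped value can still recover the original digits with `Long.toUnsignedString`, but most downstream systems would treat it as negative.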

      Many data systems (e.g. Hudi) chose to do a rough mapping, since neither Java nor Avro has an unsigned 64-bit integer type.

      The problem is difficult to solve in Debezium because, by default, the Debezium schema builder is based on JDBC types, and JDBC has no notion of unsigned integers.

      This problem was also encountered in the Debezium MySQL connector, where a config option `bigint.unsigned.handling.mode` was introduced to support mapping unsigned bigint to long.

      I think we can follow the MySQL connector's example and add a config option that lets users choose to map unsigned bigint to long.
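      As a rough sketch of what such an option could select between, assuming the Vitess connector receives the column value as a decimal string: the `Mode` enum, the class `BigIntUnsignedMapping`, and the `convert` method below are hypothetical and are not taken from the connector source. The `LONG` and `PRECISE` modes mirror the `long` and `precise` settings of the MySQL connector's `bigint.unsigned.handling.mode`, and `STRING` stands for the current Vitess behaviour described above.

```java
import java.math.BigDecimal;

public class BigIntUnsignedMapping {
    // Hypothetical handling modes for unsigned bigint columns.
    enum Mode { STRING, LONG, PRECISE }

    // Converts the raw decimal string of an unsigned bigint column
    // into the Java value that the chosen mode would emit.
    static Object convert(String rawValue, Mode mode) {
        switch (mode) {
            case LONG:
                // Values above Long.MAX_VALUE wrap to negative numbers here.
                return Long.parseUnsignedLong(rawValue);
            case PRECISE:
                // Lossless, at the cost of a less convenient type for consumers.
                return new BigDecimal(rawValue);
            default:
                // Current behaviour: pass the value through as a string.
                return rawValue;
        }
    }

    public static void main(String[] args) {
        System.out.println(convert("42", Mode.STRING));  // 42 (a String)
        System.out.println(convert("42", Mode.LONG));    // 42 (a Long)
        System.out.println(convert("18446744073709551615", Mode.PRECISE)); // 18446744073709551615
    }
}
```

      Keeping the string mapping as the default would leave existing schemas unchanged, and users who need a `long` field (for example to match Hudi) could opt in.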

              Assignee: Unassigned
              Reporter: Henry Haiying Cai (haiyingcai, Inactive)