-
Enhancement
-
Resolution: Done
-
Major
-
1.9.7.Final
-
None
-
False
-
None
-
False
In order to make your issue reports as actionable as possible, please provide the following information, depending on the issue type.
Bug report
For bug reports, provide this information, please:
What Debezium connector do you use and what version?
Vitess Connector
What is the connector configuration?
Vanilla
What is the captured database version and mode of deployment?
(E.g. on-premises, with a specific cloud provider, etc.)
Vitess 13 on Mysql 5.6 in AWS
What behaviour do you expect?
The columns in MySQL/Vitess are defined as `unsigned bigint(20)`. The Vitess connector currently maps this column type to a JDBC string type, which becomes `string` in the Avro schema published to the schema registry. However, the Debezium changelog output is integrated with Hudi for bootstrap/delta stream integration. Hudi maps this column type to `long` on the bootstrapping path, which causes a schema mismatch. We would like the Vitess Debezium connector to support mapping `unsigned bigint` to `long` as a configuration option.
What behaviour do you see?
The columns in MySQL/Vitess are defined as `unsigned bigint(20)`, and the Vitess connector currently maps this column type to a JDBC string type.
Do you see the same behaviour using the latest released Debezium version?
(Ideally, also verify with latest Alpha/Beta/CR version)
Yes
Do you have the connector logs, ideally from start till finish?
(You might be asked later to provide DEBUG/TRACE level log)
TBD
How to reproduce the issue using our tutorial deployment?
Define a column with type `unsigned bigint` on the MySQL/Vitess side and check the envelope schema in the schema registry.
Feature request or enhancement
For feature requests or enhancements, provide this information, please:
Which use case/requirement will be addressed by the proposed feature?
<Your answer>
Implementation ideas (optional)
In theory, unsigned bigint has the range of 0 to 18,446,744,073,709,551,615
and long has the range of -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
However, the `long` type would be OK in practice, since those columns usually store ids and ids don't exceed 9,223,372,036,854,775,807.
Many data systems (e.g. Hudi) chose to do a rough mapping since it's difficult to find an unsigned bigint type in either Java or Avro.
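To make the range argument concrete, here is a minimal Java sketch of the two ranges and of what happens to a value above the signed maximum when it is stored in a `long`. The class and method names (`UnsignedBigintRange`, `fitsInLong`, `toSignedBits`) are illustrative assumptions and not part of Debezium; only standard JDK calls are used.

```java
import java.math.BigInteger;

// Illustrative helper (hypothetical name, not part of Debezium).
final class UnsignedBigintRange {
    // Largest value of an unsigned 64-bit integer: 2^64 - 1.
    static final BigInteger UNSIGNED_MAX =
            BigInteger.ONE.shiftLeft(64).subtract(BigInteger.ONE);
    // Largest value of a signed 64-bit long: 2^63 - 1.
    static final BigInteger SIGNED_MAX = BigInteger.valueOf(Long.MAX_VALUE);

    // True when the decimal string fits into a signed long without loss.
    static boolean fitsInLong(String decimal) {
        BigInteger value = new BigInteger(decimal);
        return value.signum() >= 0 && value.compareTo(SIGNED_MAX) <= 0;
    }

    // Reinterprets the unsigned value as a signed long.
    // Values above Long.MAX_VALUE come out negative.
    static long toSignedBits(String decimal) {
        return Long.parseUnsignedLong(decimal);
    }

    public static void main(String[] args) {
        System.out.println(UNSIGNED_MAX);                         // 18446744073709551615
        System.out.println(fitsInLong("9223372036854775807"));    // true
        System.out.println(fitsInLong("9223372036854775808"));    // false
        System.out.println(toSignedBits("18446744073709551615")); // -1
    }
}
```

Ids that stay at or below 9,223,372,036,854,775,807 round-trip unchanged; only values in the upper half of the unsigned range would be misrepresented.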
The problem is difficult to solve in Debezium because, by default, the Debezium schema builder is based on JDBC types and JDBC has no notion of unsigned integers.
The Debezium MySQL connector encountered the same problem and introduced the config option `bigint.unsigned.handling.mode` to support mapping unsigned bigint to long.
I think we can follow the MySQL connector example and use a config option that lets users choose to map unsigned bigint to long.
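As a rough sketch of the idea, the value conversion could branch on the configured mode. Everything below is hypothetical: the `BigintUnsignedHandling` class, the `Mode` enum, and the assumption that the default stays `string` (the current behaviour described in this issue) are illustrative and do not reflect the connector's actual code.

```java
import java.util.Locale;

// Hypothetical sketch; names are illustrative, not Debezium's API.
final class BigintUnsignedHandling {
    // Assumed modes: STRING keeps today's behaviour, LONG is the requested mapping.
    enum Mode { STRING, LONG }

    // Parses the configured option value; a missing value falls back to STRING (assumption).
    static Mode parseMode(String configured) {
        if (configured == null || configured.trim().isEmpty()) {
            return Mode.STRING;
        }
        return Mode.valueOf(configured.trim().toUpperCase(Locale.ROOT));
    }

    // Converts the raw decimal text of an unsigned bigint column according to the mode.
    static Object convert(String rawValue, Mode mode) {
        if (mode == Mode.LONG) {
            // Values above Long.MAX_VALUE wrap to negative numbers.
            return Long.parseUnsignedLong(rawValue);
        }
        return rawValue;
    }
}
```

With such an option set to `long`, a column value of `42` would be emitted as the long `42` rather than the string `"42"`, which would line up with the `long` type Hudi uses on its bootstrapping path.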
- links to
-
RHEA-2023:120698 Red Hat build of Debezium 2.3.4 release