-
Feature Request
-
Resolution: Done
-
Major
-
None
-
None
We have a table defined like so:
mysql> desc apps; +----------------+--------------+------+-----+-------------------+-----------------------------+ | Field | Type | Null | Key | Default | Extra | +----------------+--------------+------+-----+-------------------+-----------------------------+ | id | bigint(20) | NO | PRI | NULL | auto_increment | | name | varchar(255) | YES | | NULL | | | user_id | bigint(20) | YES | | NULL | | | group_id | bigint(20) | YES | | NULL | | | ts_create_time | timestamp | NO | MUL | CURRENT_TIMESTAMP | | | ts_modify_time | timestamp | NO | MUL | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP | | version | int(11) | YES | | NULL | | | settings | json | YES | | NULL | | +----------------+--------------+------+-----+-------------------+-----------------------------+ 8 rows in set (0.01 sec)
The Avro schema that Debezium+Kafka connect+Kafka Avro encoder produce is:
{ "type": "record", "name": "Envelope", "namespace": "apps", "fields": [ { "name": "before", "type": [ "null", { "type": "record", "name": "Value", "fields": [ { "name": "id", "type": "long" }, { "name": "name", "type": [ "null", "string" ] }, { "name": "user_id", "type": [ "null", "long" ] }, { "name": "group_id", "type": [ "null", "long" ] }, { "name": "ts_create_time", "type": { "type": "string", "connect.version": 1, "connect.name": "io.debezium.time.ZonedTimestamp" } }, { "name": "ts_modify_time", "type": { "type": "string", "connect.version": 1, "connect.name": "io.debezium.time.ZonedTimestamp" } }, { "name": "version", "type": [ "null", "int" ] }, { "name": "settings", "type": [ "null", { "type": "string", "connect.version": 1, "connect.name": "io.debezium.data.Json" } ] } ], "connect.name": "apps.Value" } ] }, { "name": "after", "type": [ "null", "Value" ] }, { "name": "source", "type": { "type": "record", "name": "Source", "namespace": "io.debezium.connector.mysql", "fields": [ { "name": "name", "type": "string" }, { "name": "server_id", "type": "long" }, { "name": "ts_sec", "type": "long" }, { "name": "gtid", "type": [ "null", "string" ] }, { "name": "file", "type": "string" }, { "name": "pos", "type": "long" }, { "name": "row", "type": "int" }, { "name": "snapshot", "type": [ "null", "boolean" ] }, { "name": "thread", "type": [ "null", "long" ] }, { "name": "db", "type": [ "null", "string" ] }, { "name": "table", "type": [ "null", "string" ] } ], "connect.name": "io.debezium.connector.mysql.Source" } }, { "name": "op", "type": "string" }, { "name": "ts_ms", "type": [ "null", "long" ] } ], "connect.version": 1, "connect.name": "apps.Envelope" }
I believe that the fields that are NULLABLE with a default NULL (name, user_id, group_id, etc) in the above MySQL schema should have a "default": null after the type field. They do not. This is lossy behavior, and makes future backwards compatible changes in MySQL incompatible once they get translated to Avro.
See DBZ-190 for a similar problem as well.