-
Feature Request
-
Resolution: Done
-
Major
-
None
-
None
We have a table defined like so:
mysql> desc apps; +----------------+--------------+------+-----+-------------------+-----------------------------+ | Field | Type | Null | Key | Default | Extra | +----------------+--------------+------+-----+-------------------+-----------------------------+ | id | bigint(20) | NO | PRI | NULL | auto_increment | | name | varchar(255) | YES | | NULL | | | user_id | bigint(20) | YES | | NULL | | | group_id | bigint(20) | YES | | NULL | | | ts_create_time | timestamp | NO | MUL | CURRENT_TIMESTAMP | | | ts_modify_time | timestamp | NO | MUL | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP | | version | int(11) | YES | | NULL | | | settings | json | YES | | NULL | | +----------------+--------------+------+-----+-------------------+-----------------------------+ 8 rows in set (0.01 sec)
The Avro schema that Debezium+Kafka connect+Kafka Avro encoder produce is:
{
"type": "record",
"name": "Envelope",
"namespace": "apps",
"fields": [
{
"name": "before",
"type": [
"null",
{
"type": "record",
"name": "Value",
"fields": [
{
"name": "id",
"type": "long"
},
{
"name": "name",
"type": [
"null",
"string"
]
},
{
"name": "user_id",
"type": [
"null",
"long"
]
},
{
"name": "group_id",
"type": [
"null",
"long"
]
},
{
"name": "ts_create_time",
"type": {
"type": "string",
"connect.version": 1,
"connect.name": "io.debezium.time.ZonedTimestamp"
}
},
{
"name": "ts_modify_time",
"type": {
"type": "string",
"connect.version": 1,
"connect.name": "io.debezium.time.ZonedTimestamp"
}
},
{
"name": "version",
"type": [
"null",
"int"
]
},
{
"name": "settings",
"type": [
"null",
{
"type": "string",
"connect.version": 1,
"connect.name": "io.debezium.data.Json"
}
]
}
],
"connect.name": "apps.Value"
}
]
},
{
"name": "after",
"type": [
"null",
"Value"
]
},
{
"name": "source",
"type": {
"type": "record",
"name": "Source",
"namespace": "io.debezium.connector.mysql",
"fields": [
{
"name": "name",
"type": "string"
},
{
"name": "server_id",
"type": "long"
},
{
"name": "ts_sec",
"type": "long"
},
{
"name": "gtid",
"type": [
"null",
"string"
]
},
{
"name": "file",
"type": "string"
},
{
"name": "pos",
"type": "long"
},
{
"name": "row",
"type": "int"
},
{
"name": "snapshot",
"type": [
"null",
"boolean"
]
},
{
"name": "thread",
"type": [
"null",
"long"
]
},
{
"name": "db",
"type": [
"null",
"string"
]
},
{
"name": "table",
"type": [
"null",
"string"
]
}
],
"connect.name": "io.debezium.connector.mysql.Source"
}
},
{
"name": "op",
"type": "string"
},
{
"name": "ts_ms",
"type": [
"null",
"long"
]
}
],
"connect.version": 1,
"connect.name": "apps.Envelope"
}
I believe that the fields that are NULLABLE with a default NULL (name, user_id, group_id, etc) in the above MySQL schema should have a "default": null after the type field. They do not. This is lossy behavior, and makes future backwards compatible changes in MySQL incompatible once they get translated to Avro.
See DBZ-190 for a similar problem as well.