-
Task
-
Resolution: Done
-
Major
-
2.7.1.Final, 3.0.0.Final
-
False
-
None
-
False
-
Important
Hi there, we are using debezium engine with Mariadb. Debezium version is 2.7.1.Final. I have also tried with 3.0.0.Final and I've got the same result. I have one connector that is reading from a host on which host there are multiple databases. I need to capture only one of them and I have specified it in the database.include.list.
Bug report
For bug reports, provide this information, please:
What Debezium connector do you use and what version?
2.7.1.Final . Same on 3.0.0.Final
What is the connector configuration?
// final Properties props = new Properties(); props.setProperty("name", applicationName); props.setProperty("bootstrap.servers", bootstrapServers); props.setProperty("offset.storage", "org.apache.kafka.connect.storage.KafkaOffsetBackingStore"); props.setProperty("offset.storage.topic", systemTopicPrefix + ".offset_storage"); props.setProperty("offset.storage.partitions", "1"); props.setProperty("offset.storage.replication.factor", "2"); props.setProperty("offset.flush.interval.ms", "20000"); // 20 seconds props.setProperty("offset.flush.timeout.ms", "10000"); // 10 seconds props.setProperty("topic.prefix", topicPrefix); props.setProperty("database.server.id", generateServerId(applicationName)); props.setProperty("schema.history.internal.kafka.bootstrap.servers", bootstrapServers); props.setProperty("schema.history.internal.kafka.topic", systemTopicPrefix + ".db_history"); props.setProperty("snapshot.max.threads", String.valueOf(snapshotMaxThreads)); props.setProperty("signal.enabled.channels", "kafka"); props.setProperty("signal.kafka.topic", systemTopicPrefix + ".debezium_signals"); props.setProperty("signal.kafka.bootstrap.servers", bootstrapServers); props.setProperty("connector.class", "io.debezium.connector.mariadb.MariaDbConnector"); props.setProperty("connector.adapter", "mariadb"); props.setProperty("database.hostname", debeziumSourceDatabaseConfig.getHostname()); props.setProperty("database.port", debeziumSourceDatabaseConfig.getPort()); props.setProperty("database.user", debeziumSourceDatabaseConfig.getUsername()); props.setProperty("database.password", debeziumSourceDatabaseConfig.getPassword()); props.setProperty("database.server.name", applicationName.replace('.', '_')); props.setProperty("database.history.skip.unparseable.ddl", String.valueOf(true)); props.setProperty("database.history.producer.max.request.size", String.valueOf(MAX_HISTORY_MESSAGE_SIZE)); props.setProperty("database.history.producer.buffer.memory", String.valueOf(MAX_HISTORY_MESSAGE_SIZE)); props.setProperty("database.history.consumer.max.partition.fetch.bytes", String.valueOf(MAX_HISTORY_MESSAGE_SIZE)); props.setProperty("database.history.store.only.captured.tables.ddl", String.valueOf(false)); props.setProperty("database.include.list", debeziumSourceDatabaseConfig.getSchemaWhitelist()); props.setProperty("database.timezone", "disable"); props.setProperty("database.protocol", "jdbc:mariadb"); props.setProperty("database.jdbc.driver", "org.mariadb.jdbc.Driver"); props.setProperty("database.ssl.mode", "disabled"); props.setProperty("snapshot.mode", debeziumSourceDatabaseConfig.getSnapshotMode()); props.setProperty("include.schema.changes", "true"); props.setProperty("connect.keep.alive", "true"); props.setProperty("connections.max.idle.ms", "540000");// props.setProperty("schema.history.internal.store.only.captured.databases.ddl", "true"); //-> if set to true then its working. If this setting is missing I get the error described if (!debeziumSourceDatabaseConfig.getTablesWhitelist().isEmpty()) { props.setProperty("table.include.list", debeziumSourceDatabaseConfig.getTablesWhitelist()); } if (!debeziumSourceDatabaseConfig.getTablesBlacklist().isEmpty()) { props.setProperty("table.exclude.list", debeziumSourceDatabaseConfig.getTablesBlacklist()); } return props;
What is the captured database version and mode of deployment?
aws rds mariadb
What behavior do you expect?
I expect that this setting should by default be true and will not track any databases that I am not interested in schema.history.internal.store.only.captured.databases.ddl.
Also there shouldn't be an issue with the ddl parsing
What behavior do you see?
On initial snapshot I'am receiving errors during the snapshot. From the logs i can see that my connector is trying to read the structure of the databases that are not in the schema.include.list
11:42:42.382 INFO io.debezium.connector.binlog.BinlogSnapshotChangeEventSource - Reading structure of database 'table2'
11:42:42.387 INFO io.debezium.connector.binlog.BinlogSnapshotChangeEventSource - Reading structure of database 'table3'
11:42:42.387 INFO io.debezium.connector.binlog.BinlogSnapshotChangeEventSource - Reading structure of database 'table4'
11:42:42.633 INFO io.debezium.connector.binlog.BinlogSnapshotChangeEventSource - Reading structure of database 'table5'
These are all tables from the same host that i don't want to capture. Upon reading some of the other tables i get the following error
// io.debezium.text.ParsingException: DDL statement couldn't be parsed. Please open a Jira issue with the statement 'CREATE TABLE `table2` ( `id` bigint(20) NOT NULL AUTO_INCREMENT, `customer_id` bigint(20) NOT NULL, `year` smallint(5) unsigned DEFAULT NULL, `document_uuid` uuid NOT NULL, `generated_at` datetime NOT NULL, `account_id` bigint(20) NOT NULL, `first_name` varchar(255) DEFAULT NULL, `last_name` varchar(255) DEFAULT NULL, `address` varchar(255) DEFAULT NULL, `zipcode` varchar(255) DEFAULT NULL, `city` varchar(255) DEFAULT NULL, `profit` decimal(18,2) NOT NULL, `loss` decimal(18,2) NOT NULL, `capital_tax` decimal(18,2) NOT NULL, `solidarity_tax` decimal(18,2) NOT NULL, `church_tax` decimal(18,2) NOT NULL, `exemption_order` decimal(18,2) NOT NULL, PRIMARY KEY (`id`), UNIQUE KEY `UK_certificate_info` (`year`,`customer_id`) ) ENGINE=InnoDB AUTO_INCREMENT=1423 DEFAULT CHARSET=utf8mb3 COLLATE=utf8mb3_general_ci' no viable alternative at input 'CREATE TABLE `table2` (\n `id` bigint(20) NOT NULL AUTO_INCREMENT,\n `customer_id` bigint(20) NOT NULL,\n `year` smallint(5) unsigned DEFAULT NULL,\n `document_uuid` uuid'
The question here is why my connector is reading the structure of the dabases that I am not interested in? I have not configured this setting
schema.history.internal.store.only.captured.databases.ddl. It should be true by default, but it looks like it's not working properly. When i explicitly set it to true, then i don't have issue and my other databases are not read. If i set it to false then I get the same error.
Do you see the same behaviour using the latest released Debezium version?
yes with 3.0.0.Final