Bug report
Debezium does not consume data as fast as the database generates it. When the volume of database data is large, Debezium's consumption often lags behind data generation by tens of minutes, or even several hours.
For example:
The database write pattern is shown in the figure below, and the scenario above occurs even for this common write workload. When a table with a large amount of data is updated, Debezium's consumption lags behind data generation by several hours.
What Debezium connector do you use and what version?
<MySQL connector, version 1.9.4>
What is the connector configuration?
<
{
  "name": "mysql-new_fpc_prod-dd_ods_fpc_erp_binlog_prod_1h-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "tasks.max": "1",
    "database.hostname": "",
    "database.port": "3306",
    "database.dbname": "new_fpc_prod",
    "database.user": "bi_canal_prod",
    "database.password": "faf;efa;g3",
    "database.server.id": "163",
    "database.server.name": "debezium-prod-dd_ods_fpc_erp_binlog_prod_1h",
    "database.include.list": "new_fpc_prod",
    "table.include.list": "new_fpc_prod.fpcs_model_hc_staff_cost_rolling_summary_area_manual,new_fpc_prod.fpcs_model_nsc_detail_batch,new_fpc_prod.fpcs_model_nsc_detail_manual,new_fpc_prod.md_function,new_fpc_prod.md_group_area_relation,new_fpc_prod.md_infrastructure_hierarchy,new_fpc_prod.md_profit_center_store,new_fpc_prod.md_uo_hierarchy_asset,new_fpc_prod.md_uo_hierarchy_expense,new_fpc_prod.region_detail_model_month,new_fpc_prod.region_detail_model_month_history,new_fpc_prod.region_detail_model_version,new_fpc_prod.ud_stock_info,new_fpc_prod.ud_uo_matrix_city_data",
    "database.history.kafka.bootstrap.servers": "kafka-001:9092,kafka-002:9092,kafka-003:9092",
    "database.history.kafka.topic": "history-debezium-new_fpc_prod-dd_ods_fpc_erp_binlog_prod_1h",
    "database.history.producer.sasl.mechanism": "PLAIN",
    "database.history.producer.security.protocol": "SASL_PLAINTEXT",
    "database.history.producer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"debezium\" password=\"fafafa\";",
    "database.history.consumer.sasl.mechanism": "PLAIN",
    "database.history.consumer.security.protocol": "SASL_PLAINTEXT",
    "database.history.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"debezium\" password=\"fafafa\";",
    "include.schema.changes": "true",
    "include.query": "true",
    "snapshot.locking.mode": "none",
    "snapshot.mode": "schema_only",
    "topic.creation.default.replication.factor": "3",
    "topic.creation.default.partitions": "3",
    "topic.creation.default.compression.type": "lz4",
    "bigint.unsigned.handling.mode": "precise",
    "database.history.store.only.captured.tables.ddl": "true",
    "database.history.skip.unparseable.ddl": "true",
    "inconsistent.schema.handling.mode": "warn",
    "database.history.kafka.recovery.poll.interval.ms": "600000",
    "database.history.kafka.recovery.attempts": "3",
    "boolean.type": "io.debezium.connector.mysql.converters.TinyIntOneToBooleanConverter",
    "converters": "boolean"
  }
}
>
What is the captured database version and mode of deployment?
(E.g. on-premises, with a specific cloud provider, etc.)
<MariaDB>
What behaviour do you expect?
<I expect the value of the debezium_metrics_MilliSecondsBehindSource metric to stay small, i.e. Debezium should be able to consume data as fast as it is generated.>
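For context, MilliSecondsBehindSource is essentially the gap between the connector's clock and the timestamp of the source event it is currently processing. A minimal sketch of that calculation (illustrative only, not Debezium's actual implementation):

```python
import time


def milliseconds_behind_source(event_ts_ms: int, now_ms: int = None) -> int:
    """Lag between when an event was written to the binlog (event_ts_ms)
    and when the connector processes it (now_ms), in milliseconds."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - event_ts_ms


# An event written 5 seconds before it is processed shows ~5000 ms of lag.
print(milliseconds_behind_source(1_000_000, 1_005_000))  # -> 5000
```

When this value grows continuously while the database is under load, the connector is consuming more slowly than the database is producing, which is the behavior reported above.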
Feature request or enhancement
Could Debezium expose parameters to control the consumption rate, e.g. through concurrent processing, so that consumption does not fall too far behind when the data volume is large?
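Some throughput tuning is already possible with the connector's documented queue and batch options (`max.batch.size`, `max.queue.size`, `poll.interval.ms`). As an illustrative, untested starting point, raising the defaults might help; the values below are assumptions to experiment with, not recommendations:

```json
{
  "max.batch.size": "8192",
  "max.queue.size": "32768",
  "poll.interval.ms": "100"
}
```

Keeping `max.queue.size` several times larger than `max.batch.size` preserves the intended buffering between the binlog reader and the Kafka Connect producer.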
Thank you very much for your contributions. I believe that solving this problem will make Debezium more powerful and easier to use.