Debezium / DBZ-5637

MySQL connector doesn't honor database.include.list during initial snapshot

Details

    • Type: Bug
    • Resolution: Not a Bug
    • Priority: Major
    • Component: mysql-connector

    Description

      What Debezium connector do you use and what version?

      debezium-connector-mysql 1.9.5

      What is the connector configuration?

       

      {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.user": "${file:/opt/cdc/secret:user}",
        "database.history.kafka.bootstrap.servers": "kafka-headless.message-broker-dev.svc.cluster.local:9093",
        "database.history.kafka.topic": "dev-debezium.dbhistory.auth",
        "database.server.name": "dev-debezium",
        "database.port": "3306",
        "include.schema.changes": "false",
        "database.connectionTimeZone": "Europe/Paris",
        "database.hostname": "xxxxxxx",
        "database.password": "****************************************",
        "name": "cdc.auth",
        "table.include.list": "office.auth",
        "database.include.list": "office"
      }  

       

      What is the captured database version and mode of deployment?

      On-premises MySQL 8.0.28

      What behaviour do you expect?

      I upgraded the MySQL connector from version 1.8.1 to 1.9.5 and recreated the connector from scratch, because a database restore had broken the binlog sync.

      I expect a data snapshot of one table in the `office` database (`office.auth`) and a table structure analysis of all tables in `office` only.
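
      To make that expectation concrete, here is a minimal sketch of the scoping I have in mind. It is not Debezium's code: the function names and the small `catalog` dictionary are hypothetical, and it only assumes that each include list is a comma-separated list of regular expressions matched against the whole database name or the whole `database.table` identifier.

```python
import re


def compile_include_list(value):
    """Split a comma-separated include list into compiled regular expressions."""
    return [re.compile(part.strip()) for part in value.split(",") if part.strip()]


def is_included(name, patterns):
    """Return True when the whole name matches at least one pattern."""
    return any(pattern.fullmatch(name) for pattern in patterns)


def expected_snapshot_scope(catalog, database_include, table_include):
    """Return (schema_tables, data_tables) for a hypothetical server catalog.

    schema_tables: every table in an included database (structure only).
    data_tables:   the subset that also matches the table include list.
    """
    db_patterns = compile_include_list(database_include)
    table_patterns = compile_include_list(table_include)
    schema_tables = []
    data_tables = []
    for database, tables in catalog.items():
        if not is_included(database, db_patterns):
            continue  # databases outside the include list are skipped entirely
        for table in tables:
            table_id = f"{database}.{table}"
            schema_tables.append(table_id)
            if is_included(table_id, table_patterns):
                data_tables.append(table_id)
    return schema_tables, data_tables


# Hypothetical stand-in for the real server: three of the ~2000 databases.
catalog = {
    "office": ["auth", "contrats", "documents", "version"],
    "db1": ["auth", "contrats", "documents", "version"],
    "db2": ["auth", "contrats", "documents", "version"],
}

schema_tables, data_tables = expected_snapshot_scope(catalog, "office", "office.auth")
print(schema_tables)  # ['office.auth', 'office.contrats', 'office.documents', 'office.version']
print(data_tables)    # ['office.auth']
```

      With the configuration above, this yields the four `office` tables for structure analysis and only `office.auth` for the data snapshot, which matches what the 1.8.1 run logged.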

      What behaviour do you see?

      Instead of processing only the configured table (office.auth), the connector scanned all databases and read every table in every database during steps 2 to 5, ignoring the include filters. It may filter the tables correctly from step 6 onward, but I have 2,000 databases with 500 tables each, so we stopped it because it was taking far too long.

      The 2,000 databases are roughly identical (database-per-tenant pattern).

      I noticed a few closed bugs regarding include.list in the 1.9.0 changelog; I am not sure whether they are related:

      https://issues.redhat.com/browse/DBZ-3952 

      https://issues.redhat.com/browse/DBZ-3679 

      Do you see the same behaviour using the latest released Debezium version?

      Not tested yet

      Do you have the connector logs, ideally from start till finish?

      Not the full logs, but here is a comparison between the 1.8.1 and 1.9.5 runs.

      1.8.1 snapshot logs:

       
      [2022-09-21 11:05:07,748] INFO No previous offset has been found (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:07,748] INFO According to the connector configuration both schema and data will be snapshotted (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:07,749] INFO Snapshot step 1 - Preparing (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:07,751] INFO Snapshot step 2 - Determining captured tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:07,752] INFO Read list of available databases (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:07,815] INFO      list of available databases is:  [information_schema, office, db1, db2, db3, ....]    <---- 2000 dbs here
      [2022-09-21 11:05:07,816] INFO Read list of available tables in each database (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:35,192] INFO     snapshot continuing with database(s): [office] (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:38,153] INFO Snapshot step 3 - Locking captured tables [office.auth] (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:38,157] INFO Snapshot step 4 - Determining snapshot offset (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:38,164] INFO Read binlog position of MySQL primary server (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:38,167] INFO      using binlog 'mysql-binlog.000114' at position '61621500' and gtid '611d93e0-373a-11ed-9f4f-4ec91402c6c0:1-19007909' (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:38,167] INFO Snapshot step 5 - Reading structure of captured tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:38,167] INFO All eligible tables schema should be captured, capturing: [office.auth, office.contrats, office.documents, office.version] (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:39,670] INFO Reading structure of database 'office' (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:41,043] INFO Snapshot step 6 - Persisting schema history (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:41,699] INFO Snapshot step 7 - Snapshotting data (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:41,701] INFO Snapshotting contents of 1 tables while still in transaction (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:41,701] INFO Exporting data from table 'office.auth' (1 of 1 tables) (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:41,713] INFO      For table 'office.auth' using select statement: 'SELECT `contact`, ... FROM `office`.`auth`' (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:42,220] INFO 380 records sent during previous 00:00:37.0, last recorded offset: {ts_sec=1663758342, file=mysql-binlog.000114, pos=61621500, gtids=611d93e0-373a-11ed-9f4f-4ec91402c6c0:1-19007909, snapshot=true} (io.debezium.connector.common.BaseSourceTask)
      [2022-09-21 11:05:42,922] INFO      Finished exporting 2133 records for table 'office.auth'; total duration '00:00:01.221' (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:42,928] INFO Snapshot - Final stage (io.debezium.pipeline.source.AbstractSnapshotChangeEventSource)
      [2022-09-21 11:05:42,928] INFO Snapshot ended with SnapshotResult [status=COMPLETED, offset=MySqlOffsetContext [sourceInfoSchema=Schema{io.debezium.connector.mysql.Source:STRUCT}, sourceInfo=SourceInfo [currentGtid=null, currentBinlogFilename=mysql-binlog.000114, currentBinlogPosition=61621500, currentRowNumber=0, serverId=0, sourceTime=2022-09-21T11:05:42.922Z, threadId=-1, currentQuery=null, tableIds=[office.auth], databaseName=office], snapshotCompleted=true, transactionContext=TransactionContext [currentTransactionId=null, perTableEventCount={}, totalEventCount=0], restartGtidSet=611d93e0-373a-11ed-9f4f-4ec91402c6c0:1-19007909, currentGtidSet=611d93e0-373a-11ed-9f4f-4ec91402c6c0:1-19007909, restartBinlogFilename=mysql-binlog.000114, restartBinlogPosition=61621500, restartRowsToSkip=0, restartEventsToSkip=0, currentEventLengthInBytes=0, inTransaction=false, transactionId=null, incrementalSnapshotContext =IncrementalSnapshotContext [windowOpened=false, chunkEndPosition=null, dataCollectionsToSnapshot=[], lastEventKeySent=null, maximumKey=null]]] (io.debezium.pipeline.ChangeEventSourceCoordinator)
       

      1.9.5 snapshot logs:
      [2022-09-21 11:05:07,748] INFO No previous offset has been found (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:07,748] INFO According to the connector configuration both schema and data will be snapshotted (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:07,749] INFO Snapshot step 1 - Preparing (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:07,751] INFO Snapshot step 2 - Determining captured tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 11:05:07,752] INFO Read list of available databases (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:07,815] INFO      list of available databases is: [information_schema, office, db1, db2, db3, ....]   <- 2000 databases
      [2022-09-21 11:05:07,816] INFO Read list of available tables in each database (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 11:05:35,192] INFO     snapshot continuing with database(s): [office] (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db1.auth to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db1.contrats to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db1.documents to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db1.version to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db2.auth to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db2.contrats to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db2.documents to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db2.version to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db2.auth to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db2.contrats to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db2.documents to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:51,051] INFO Adding table db2.version to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      ... <- 2000 databases and their tables
      [2022-09-21 07:29:59,561] INFO Snapshot step 3 - Locking captured tables [office.auth] (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:59,564] INFO Snapshot step 4 - Determining snapshot offset (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:29:59,570] INFO Read binlog position of MySQL primary server (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 07:29:59,574] INFO      using binlog 'mysql-binlog.000113' at position '97887719' and gtid '611d93e0-373a-11ed-9f4f-4ec91402c6c0:1-18910786' (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 07:29:59,574] INFO Snapshot step 5 - Reading structure of captured tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      [2022-09-21 07:30:00,475] INFO All eligible tables schema should be captured, capturing: [office.auth, office.contrats, office.documents, office.version, db1.auth, db1.contrats, db1.documents, db1.version, db2.auth, db2.contrats, db2.documents, db2.version, ...] (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 07:30:01,333] INFO Reading structure of database 'office' (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 07:30:03,815] INFO Reading structure of database 'db1' (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 07:30:06,101] INFO Reading structure of database 'db2' (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      [2022-09-21 07:30:09,249] INFO Reading structure of database 'db3' (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      ... <- 2000 databases
       

          People

            Assignee: Unassigned
            Reporter: Vincent D (mrluje)