Uploaded image for project: 'Debezium'
  1. Debezium
  2. DBZ-7438

MySQL Connector keeps on creating new snapshots every time connector is started

    XMLWordPrintable

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • under-triaging
    • None
    • mysql-connector
    • None
    • False
    • None
    • False

    Description

      Bug report

      What Debezium connector do you use and what version?

      We use MySQL connector, version - 1.9.6.Final. We use strimzi for deployment of the connector.

      What is the connector configuration?

      {
        "properties": {
          "connector.class": "io.debezium.connector.mysql.MySqlConnector",
          "database.history.producer.ssl.keystore.key": "${env:CERT_KEY}",
          "tasks.max": "1",
          "database.history.kafka.topic": "projects-iaconnector.schema.history",
          "transforms": "filter",
          "database.history.producer.ssl.keystore.certificate.chain": "${env:CERT_PEM}",
          "database.history.consumer.security.protocol": "SSL",
          "include.schema.changes": "false",
          "database.history.producer.ssl.keystore.type": "PEM",
          "value.converter": "org.apache.kafka.connect.json.JsonConverter",
          "key.converter": "org.apache.kafka.connect.json.JsonConverter",
          "database.history.consumer.ssl.keystore.certificate.chain": "${env:CERT_PEM}",
          "database.user": "${env:PROJECTS_USER}",
          "database.history.consumer.ssl.keystore.key": "${env:CERT_KEY}",
          "database.server.id": "184056",
          "database.history.producer.security.protocol": "SSL",
          "database.history.kafka.bootstrap.servers": "node1.amazonaws.com:9094,node2.amazonaws.com:9094,node3.amazonaws:9094",
          "database.server.name": "projects-iaconnector",
          "transforms.filter.language": "jsr223.groovy",
          "transforms.filter.type": "io.debezium.transforms.Filter",
          "database.history.consumer.ssl.keystore.type": "PEM",
          "database.history.consumer.ssl.truststore.certificates": "${env:CERT_CA}",
          "database.history.producer.ssl.truststore.certificates": "${env:CERT_CA}",
          "database.port": "3306",
          "key.converter.schemas.enable": "false",
          "transforms.filter.condition": "!(value?.after?.display_name =~ 'crud-test*') && !(value?.before?.display_name =~ 'crud-test*')",
          "task.class": "io.debezium.connector.mysql.MySqlConnectorTask",
          "database.hostname": "<redacted>",
          "database.password": "${env:PROJECTS_PASS}",
          "value.converter.schemas.enable": "false",
          "name": "projects-iaconnector",
          "table.include.list": "projects.project",
          "database.history.consumer.ssl.truststore.type": "PEM",
          "database.include.list": "projects",
          "snapshot.mode": "when_needed",
          "database.history.producer.ssl.truststore.type": "PEM"
        }
      } 

      What is the captured database version and mode of depoyment?

      (E.g. on-premises, with a specific cloud provider, etc.)

      We use AWS Aurora, engine version - 8.0.mysql_aurora.3.03.0. Single primary instance.

      What behaviour do you expect?

      Connector should not create snapshots every time it starts. It should only create snapshot when it is needed.

      What behaviour do you see?

      The last record from offsets topic returns this data:

      ["projects-iaconnector",{"server":"projects-iaconnector"}] :
      {
        "ts_sec": 1706780138,
        "file": "mysql-bin-changelog.000294",
        "pos": 693603,
        "snapshot": true
      }

      Then, in the sourcecode, there is this fragment of code:

      // found a previous offset and the earlier snapshot has completed
      if (previousOffset != null && !previousOffset.isSnapshotRunning()) {
          LOGGER.info("A previous offset indicating a completed snapshot has been found. Neither schema nor data will be snapshotted.");
          snapshotSchema = false;
          snapshotData = false;
      }
      else {
          LOGGER.info("No previous offset has been found");
          ...
      } 

      And this statement:

      previousOffset.isSnapshotRunning() 

      Returns true, therefore a new snapshot is being created, even though the snapshot has been completed before (see logs below):

      INFO [projects-iaconnector|task-0] Snapshot ended with SnapshotResult [status=COMPLETED ...

      To sum up - when we restart the connector, it will produce this record:

      ["projects-iaconnector",{"server":"projects-iaconnector"}] :
      {   "ts_sec": 1706780138,   "file": "mysql-bin-changelog.000294",   "pos": 693603,   "snapshot": true } 

      And perform a snapshot.

      Furthermore, when the binlog file is missing - for example when mysql-bin-changelog.000294 is deleted, connector fails with this error:

      io.debezium.DebeziumException: Could not find first log file name in binary log index file Error code: 1236; SQLSTATE: HY000. 

      Which seems to be an incorrect behavior - it should create a snapshot when bin log is missing based on a snapshot mode:

      "snapshot.mode": "when_needed", 

      Do you see the same behaviour using the latest relesead Debezium version?

      (Ideally, also verify with latest Alpha/Beta/CR version)

      We haven't checked that yet, unfortunately.

      Do you have the connector logs, ideally from start till finish?

      (You might be asked later to provide DEBUG/TRACE level log)

       2024-02-01 10:44:07,931 INFO [projects-iaconnector|task-0] Context created (io.debezium.pipeline.ChangeEventSourceCoordinator) [debezium-mysqlconnector-projects-iaconnector-change-event-source-coordinator]
      2024-02-01 10:44:07,931 INFO [projects-iaconnector|task-0] No previous offset has been found (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      2024-02-01 10:44:07,931 INFO [projects-iaconnector|task-0] According to the connector configuration both schema and data will be snapshotted (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      2024-02-01 10:44:07,931 INFO [projects-iaconnector|task-0] Snapshot step 1 - Preparing (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:07,931 INFO [projects-iaconnector|task-0] Previous snapshot was cancelled before completion; a new snapshot will be taken. (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:07,936 INFO [projects-iaconnector|task-0] Snapshot step 2 - Determining captured tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:07,936 INFO [projects-iaconnector|task-0] Read list of available databases (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      2024-02-01 10:44:07,939 INFO [projects-iaconnector|task-0]   list of available databases is: [<redacted>]
      2024-02-01 10:44:07,939 INFO [projects-iaconnector|task-0] Read list of available tables in each database (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      2024-02-01 10:44:08,200 INFO [projects-iaconnector|task-0]  snapshot continuing with database(s): [projects] (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      2024-02-01 10:44:08,201 INFO [projects-iaconnector|task-0] Adding table projects.project to the list of capture schema tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:08,204 INFO [projects-iaconnector|task-0] Snapshot step 3 - Locking captured tables [projects.project] (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:08,208 INFO [projects-iaconnector|task-0] Flush and obtain global read lock to prevent writes to database (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      2024-02-01 10:44:08,216 INFO [projects-iaconnector|task-0] Unable to flush and acquire global read lock, will use table read locks after reading table names (io.debezium.connector.mysql
      2024-02-01 10:44:08,218 INFO [projects-iaconnector|task-0] Snapshot step 4 - Determining snapshot offset (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:08,218 INFO [projects-iaconnector|task-0] Snapshot step 5 - Reading structure of captured tables (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:08,219 INFO [projects-iaconnector|task-0] Flush and obtain read lock for [projects.project] tables (preventing writes) (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      2024-02-01 10:44:08,221 INFO [projects-iaconnector|task-0] Table level locking is in place, the schema will be capture in two phases, now capturing: [projects.project] (io.debezium.connector.mysql
      2024-02-01 10:44:08,221 INFO [projects-iaconnector|task-0] Tables for delayed schema capture: [<redacted>]
      2024-02-01 10:44:08,221 INFO [projects-iaconnector|task-0] All eligible tables schema should be captured, capturing: [<redacted>]
      2024-02-01 10:44:08,259 INFO [projects-iaconnector|task-0] Reading structure of database 'projects' (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      2024-02-01 10:44:08,263 INFO [projects-iaconnector|task-0] Snapshot step 6 - Persisting schema history (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:08,303 INFO [projects-iaconnector|task-0] [Producer clientId=projects-iaconnector-dbhistory] Resetting the last seen epoch of partition projects-iaconnector.schema.history-0 to 24 since the
      2024-02-01 10:44:08,387 WARN [projects-iaconnector|task-0] Tables were locked explicitly, but to get a consistent snapshot we cannot release the locks until we've read all tables. (io.debezium.connector
      2024-02-01 10:44:08,387 INFO [projects-iaconnector|task-0] Snapshot step 7 - Snapshotting data (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:08,387 INFO [projects-iaconnector|task-0] Snapshotting contents of 1 tables while still in transaction (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:08,387 INFO [projects-iaconnector|task-0] Exporting data from table 'projects.project' (1 of 1 tables) (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:08,387 INFO [projects-iaconnector|task-0]   For table 'projects.project' using select statement: 'SELECT `rid`, `id`, `display_name`, [<redacted>]
      2024-02-01 10:44:08,432 INFO [projects-iaconnector|task-0] 107 records sent during previous 00:00:19.434, last recorded offset: {ts_sec=1706784248, file=mysql-bin-changelog.000294, pos=693603,
      2024-02-01 10:44:08,891 INFO [projects-iaconnector|task-0] [Producer clientId=connector-producer-projects-iaconnector-0] Resetting the last seen epoch of partition projects-iaconnector.projects.project-0 to
      2024-02-01 10:44:08,891 INFO [projects-iaconnector|task-0] [Producer clientId=connector-producer-projects-iaconnector-0] Resetting the last seen epoch of partition projects-iaconnector.projects.project-5 to
      2024-02-01 10:44:08,891 INFO [projects-iaconnector|task-0] [Producer clientId=connector-producer-projects-iaconnector-0] Resetting the last seen epoch of partition projects-iaconnector.projects.project-1 to
      2024-02-01 10:44:08,891 INFO [projects-iaconnector|task-0] [Producer clientId=connector-producer-projects-iaconnector-0] Resetting the last seen epoch of partition projects-iaconnector.projects.project-4 to
      2024-02-01 10:44:08,891 INFO [projects-iaconnector|task-0] [Producer clientId=connector-producer-projects-iaconnector-0] Resetting the last seen epoch of partition projects-iaconnector.projects.project-6 to
      2024-02-01 10:44:08,891 INFO [projects-iaconnector|task-0] [Producer clientId=connector-producer-projects-iaconnector-0] Resetting the last seen epoch of partition projects-iaconnector.projects.project-7 to
      2024-02-01 10:44:08,891 INFO [projects-iaconnector|task-0] [Producer clientId=connector-producer-projects-iaconnector-0] Resetting the last seen epoch of partition projects-iaconnector.projects.project-2 to
      2024-02-01 10:44:08,891 INFO [projects-iaconnector|task-0] [Producer clientId=connector-producer-projects-iaconnector-0] Resetting the last seen epoch of partition projects-iaconnector.projects.project-3 to
      2024-02-01 10:44:18,397 INFO [projects-iaconnector|task-0]   Exported 115991 of 227286 records for table 'projects.project' after 00:00:10.01 (io.debezium.relational.RelationalSnapshotChangeEventSource)
      2024-02-01 10:44:24,011 INFO [projects-iaconnector|task-0] 219136 records sent during previous 00:00:15.579, last recorded offset: {ts_sec=1706784248, file=mysql-bin-changelog.000294, pos=693603,
      2024-02-01 10:44:24,600 INFO [projects-iaconnector|task-0]   Finished exporting 235171 records for table 'projects.project'; total duration '00:00:16.213' (io.debezium.relational
      2024-02-01 10:44:24,607 INFO [projects-iaconnector|task-0] Releasing table read lock to enable MySQL writes (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      2024-02-01 10:44:24,609 INFO [projects-iaconnector|task-0] Writes to MySQL tables prevented for a total of 00:00:16.388 (io.debezium.connector.mysql.MySqlSnapshotChangeEventSource)
      2024-02-01 10:44:25,500 INFO [projects-iaconnector|task-0] Already applied 5063 database changes (io.debezium.relational.history.DatabaseHistoryMetrics)
      2024-02-01 10:44:25,905 INFO [projects-iaconnector|task-0] Snapshot - Final stage (io.debezium.pipeline.source.AbstractSnapshotChangeEventSource)
      2024-02-01 10:44:25,905 INFO [projects-iaconnector|task-0] Snapshot ended with SnapshotResult [status=COMPLETED, offset=MySqlOffsetContext [sourceInfoSchema=Schema{io.debezium.connector.mysql
      2024-02-01 10:44:25,905 INFO [projects-iaconnector|task-0] Requested thread factory for connector MySqlConnector, id = projects-iaconnector named = binlog-client (io.debezium.util.Threads)
      2024-02-01 10:44:25,906 INFO [projects-iaconnector|task-0] Starting streaming (io.debezium.pipeline.ChangeEventSourceCoordinator)
      2024-02-01 10:44:25,910 INFO [projects-iaconnector|task-0] GTID set purged on server:  (io.debezium.connector.mysql.MySqlStreamingChangeEventSource)
      2024-02-01 10:44:25,910 INFO [projects-iaconnector|task-0] Skip 0 events on streaming start (io.debezium.connector.mysql.MySqlStreamingChangeEventSource)
      2024-02-01 10:44:25,910 INFO [projects-iaconnector|task-0] Skip 0 rows on streaming start (io.debezium.connector.mysql.MySqlStreamingChangeEventSource)
      2024-02-01 10:44:25,910 INFO [projects-iaconnector|task-0] Creating thread debezium-mysqlconnector-projects-iaconnector-binlog-client (io.debezium.util.Threads)
      2024-02-01 10:44:25,917 INFO [projects-iaconnector|task-0] Creating thread debezium-mysqlconnector-projects-iaconnector-binlog-client (io.debezium.util.Threads) [blc-<<redacted>>
      2024-02-01 10:44:25,924 INFO [projects-iaconnector|task-0] Connected to MySQL binlog at <<redacted>>, starting at MySqlOffsetContext [sourceInfoSchema=Schema{io
      2024-02-01 10:44:25,924 INFO [projects-iaconnector|task-0] Creating thread debezium-mysqlconnector-projects-iaconnector-binlog-client (io.debezium.util.Threads) [blc-<<redacted>>
      2024-02-01 10:44:25,924 INFO [projects-iaconnector|task-0] Waiting for keepalive thread to start (io.debezium.connector.mysql.MySqlStreamingChangeEventSource)
      2024-02-01 10:44:25,924 INFO [projects-iaconnector|task-0] Keepalive thread is running (io.debezium.connector.mysql.MySqlStreamingChangeEventSource)
      

      How to reproduce the issue using our tutorial deployment?

      -

       

      Attachments

        Activity

          People

            Unassigned Unassigned
            abrzozo1 Aleksander Brzozowski
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated: