Uploaded image for project: 'Debezium'
  1. Debezium
  2. DBZ-8272

Better support multi-topic writes on tables with foreign key constraints

XMLWordPrintable

    • Icon: Enhancement Enhancement
    • Resolution: Unresolved
    • Icon: Minor Minor
    • 3.1-plan
    • None
    • jdbc-connector
    • None
    • False
    • None
    • False
    • Critical

      In DBZ-8130 it was reported:

      In order to make your issue reports as actionable as possible, please provide the following information, depending on the issue type.

      Bug report

      ...

      What is the captured database version and mode of deployment?

      MySql 8.0, MySql 8.4

      What behavior do you expect?

      Context for source : All the tables cdc changes are rerouted from source to single kafka topic. The topic has correct ordering of the messages from the source. 

      When consuming batch of records from kafka topics, the insertion/deletion order for various tables' cdc changes should be same as that in kafka records order. 

      What behavior do you see?

      Foreign Key Error as there is reordering of tables during insertion time.
      If a parent table cdc messsage is earlier than child table cdc message.
      If the batch includes both the tables, it is inserting the child table first causing the foreign key error.
      This Foreign key error is unpredicatable and will depend on tableId's hashcode created by `updateBuffer` hashmap and `deleteBuffer` hashmap.

      Do you see the same behaviour using the latest released Debezium version?

      Possibly. Haven't tested it out with v3.

      How to reproduce the issue using our tutorial deployment?

      I couldn't find any tutorial jdbc sink connectors example.

      Create two tables with name `OpenIddictApplications` and `OpenIddictAuthorizations` in SQL server. Add a foreign key relationship between both tables where parent is `OpenIddictApplications` and child is `OpenIddictAuthorizations`.

      Ingest it into one topic using `ByLogicalRouter`.

      Consume it via debezium jdbc sink connector into mysql. 

      Implementation ideas (optional)

      Using `LinkedHashMap` instead of `HashMap` for `updateBuffer` and `deleteBuffer` to preserve the ordering of tables during insertion and deletion.

      Could we look into if all JDBC datastores that we currently support are offering an option to disable foreign key checks temporary (for the JDBC sink connections)?
      MySQL for example has:

      SET FOREIGN_KEY_CHECKS=0;

      MariaDB:

      SET SESSION foreign_key_checks=OFF;

      Postgres has

      SET session_replication_role = 'replica'; SET session_replication_role = 'origin';

      I found Oracle might have

      DISABLE CONSTRAINT

      for update/insert statements, but I am not sure. It's hard to find info about that.
      SQL Server only has

      ALTER TABLE tableName NOCHECK CONSTRAINT constraintname
      
      -- OR:
      
      ALTER TABLE tableName NOCHECK CONSTRAINT ALL
      
      -- OR disable all constraints for all tables:
      EXEC sp_MSforeachtable "ALTER TABLE ? NOCHECK CONSTRAINT all"
      

      it seems. Which might influence the server?!

      That way it would be possible to support multi-topic writes to tables that have foreign key constraints but their parent rows arrive later.

      That behavior is often used in data migration systems to get around this error and would make this scenario convenient without the requirement to have a (single partitioned) topic for all you data (which is also considered an anti-pattern in Kafka for example).

              rk3rn3r René Kerner
              rk3rn3r René Kerner
              Votes:
              0 Vote for this issue
              Watchers:
              1 Start watching this issue

                Created:
                Updated: