[DBZ-232] MySqlConnector Table and Database recommenders cause timeouts on large instances

    • Type: Enhancement
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 0.5.1
    • Affects Version/s: 0.5
    • Component/s: mysql-connector
    • Labels: None

      I am getting unavoidable timeouts caused by the TableRecommender and DatabaseRecommender during config validation. Because of the size of my MySQL instances, the queries these classes run to enumerate every database and every table hit Kafka Connect's hardcoded 90-second timeout.

      I'm currently using a patched build of Debezium that removes these recommenders. Maybe these recommenders could be changed to return just a short selection of database and table names instead of everything?

      For my specific case, I am dealing with MySQL instances that contain hundreds of databases, each with hundreds of tables. I'm happy to help write this code, but I wanted to get some feedback first. Unfortunately, changing the size or configuration of the MySQL instances is not a viable solution, as it is outside the scope of my responsibilities.

            Gunnar Morling added a comment - Bulk closing issues in state "Resolved" with resolution "Done" and with a released "Fix Version".

            Gunnar Morling added a comment - Agreed, removing those recommenders seems to be the best option.

            Randall Hauch (Inactive) added a comment - The pull request is ready for review and merging, and I've requested gunnar.morling do this.

            Randall Hauch (Inactive) added a comment - Added a pull request that removes the recommenders.

            It’s not clear how valuable these recommenders actually are. First, the expected semantics are unclear: can the user use values that don’t appear in the recommended values? Second, recommenders that return large numbers of values can be slow and can result in very large REST API responses.

            Debezium was using recommenders to return the database and table/collection names, but these lists can be very large for large databases. Rather than cap the number of recommended values and have the recommender return a subset of all potential values, we will instead remove the recommenders altogether.
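
            For context, the capping alternative that the comment above decides against could look roughly like the sketch below. It is only an illustration: `CappedRecommenderSketch` and `capValues` are hypothetical names, not Debezium or Kafka Connect code, and the sketch assumes the full list of names has already been fetched. Truncating the list this way would only bound the size of the response; it would not by itself shorten the enumeration queries, and it leaves open the question of whether values outside the returned subset are allowed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the "cap the recommended values" idea.
// None of these names come from Debezium or Kafka Connect.
final class CappedRecommenderSketch {

    // Return at most `limit` names, preserving the input order.
    static List<String> capValues(List<String> allNames, int limit) {
        if (limit < 0) {
            throw new IllegalArgumentException("limit must be non-negative");
        }
        int end = Math.min(limit, allNames.size());
        // Copy the sublist so the result does not keep a view of the full list.
        return new ArrayList<>(allNames.subList(0, end));
    }

    public static void main(String[] args) {
        // Made-up table names, for illustration only.
        List<String> tables = Arrays.asList("db1.orders", "db1.customers", "db2.events");
        System.out.println(capValues(tables, 2)); // prints [db1.orders, db1.customers]
    }
}
```

            A cap like this returns an arbitrary subset, so a user looking for a table that was cut off would get no help from the recommendation, which is consistent with the decision to remove the recommenders instead.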

              Assignee: Randall Hauch (rhauch) (Inactive)
              Reporter: Thomas Holmes (tholmes_jira) (Inactive)