Debezium / DBZ-8595

Slow Debezium startup for large number of tables

    • Important

      While testing embedded Debezium v2.6 against Postgres for my use case, I am seeing a very long startup time with 3,500 tables of 50 columns each: Debezium takes more than an hour to generate the first data event. I profiled the startup and found that most of the time is spent in two functions, `PostgresSchema.refreshSchemas()` and `JDBCConnection.readSchema()`. These functions are called multiple times during startup, mainly from the `init` and `execute` methods of the `PostgresStreamingChangeEventSource` class.

       

      More details in this thread: https://debezium.zulipchat.com/#narrow/channel/348249-community-postgresql/topic/Slow.20[…].20case.20of.20large.20number.20of.20table/near/482321074

      Bug report

      What Debezium connector do you use and what version?

      PostgreSQL connector (embedded Debezium), v2.6.2

      What is the captured database version and mode of deployment?

      PostgreSQL 16 on Amazon RDS

      What behavior do you expect?

      The schema refresh is performed once during startup, resulting in a shorter startup time.

      What behavior do you see?

      The schema refresh is performed multiple times during startup, leading to a very long startup time.
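
To illustrate why repeated refreshes matter at this scale, here is a minimal, self-contained Java sketch. It is not Debezium code: `SchemaRefreshSketch`, `CountingSchemaReader`, and both methods are hypothetical names, the schema read is simulated in memory, and the number of startup phases (3) is an assumption, since the report only says "multiple times". The sketch counts how many column definitions are read when every phase re-reads the full schema, compared with reading it once and reusing the result.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these types exist in Debezium. */
class SchemaRefreshSketch {

    /** Simulated schema reader that counts the column definitions it has read. */
    static final class CountingSchemaReader {
        private final int tables;
        private final int columnsPerTable;
        private long columnsRead = 0;

        CountingSchemaReader(int tables, int columnsPerTable) {
            this.tables = tables;
            this.columnsPerTable = columnsPerTable;
        }

        /** Simulates one full schema read: every column of every table is visited. */
        Map<String, Integer> readSchema() {
            Map<String, Integer> schema = new HashMap<>();
            for (int t = 0; t < tables; t++) {
                schema.put("table_" + t, columnsPerTable);
                columnsRead += columnsPerTable;
            }
            return schema;
        }

        long columnsRead() {
            return columnsRead;
        }
    }

    /** Observed behavior (simplified): each startup phase re-reads the whole schema. */
    public static long refreshEveryPhase(int tables, int columns, int phases) {
        CountingSchemaReader reader = new CountingSchemaReader(tables, columns);
        for (int i = 0; i < phases; i++) {
            reader.readSchema();
        }
        return reader.columnsRead();
    }

    /** Expected behavior (simplified): the schema is read once and then reused. */
    public static long refreshOnce(int tables, int columns, int phases) {
        CountingSchemaReader reader = new CountingSchemaReader(tables, columns);
        Map<String, Integer> cached = null;
        for (int i = 0; i < phases; i++) {
            if (cached == null) {
                cached = reader.readSchema();
            }
        }
        return reader.columnsRead();
    }

    public static void main(String[] args) {
        int tables = 3500;  // from the report
        int columns = 50;   // from the report
        int phases = 3;     // assumption: the report only says "multiple times"
        System.out.println("columns read, refresh in every phase: "
                + refreshEveryPhase(tables, columns, phases));
        System.out.println("columns read, refresh once: "
                + refreshOnce(tables, columns, phases));
    }
}
```

With the numbers from this report, one full read covers 3,500 × 50 = 175,000 column definitions, so each additional refresh repeats that amount of metadata work against the database.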

              vjuranek@redhat.com Vojtech Juranek
              animesh_sharma Animesh Sharma