When capturing large databases with tens or hundreds of thousands of tables (as is frequently the case in multi-tenancy use cases), the Debezium connectors use a significant amount of memory to store the table metadata model.
This could likely be improved by abandoning the current design, which allocates one ColumnImpl object per table column, in favor of a columnar design of TableImpl that stores the information per attribute type in arrays.
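A minimal sketch of what such a columnar layout might look like. The class name `ColumnarTable`, its fields, and its accessor names are hypothetical and chosen for illustration only; they are not the existing Debezium `TableImpl` API, and the set of attributes is an assumption.

```java
import java.sql.Types;

// Hypothetical sketch, not the Debezium API: one array per column attribute
// instead of one object per column. Position i in every array describes column i.
final class ColumnarTable {
    private final String[] names;
    private final int[] jdbcTypes;   // values from java.sql.Types
    private final int[] lengths;
    private final boolean[] optional;

    ColumnarTable(String[] names, int[] jdbcTypes, int[] lengths, boolean[] optional) {
        int n = names.length;
        if (jdbcTypes.length != n || lengths.length != n || optional.length != n) {
            throw new IllegalArgumentException("attribute arrays must have equal length");
        }
        // Defensive copies keep the table immutable after construction.
        this.names = names.clone();
        this.jdbcTypes = jdbcTypes.clone();
        this.lengths = lengths.clone();
        this.optional = optional.clone();
    }

    int columnCount() {
        return names.length;
    }

    // Linear scan to avoid an extra map per table; returns -1 if the column is unknown.
    int columnIndex(String name) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equalsIgnoreCase(name)) {
                return i;
            }
        }
        return -1;
    }

    String columnName(int index) { return names[index]; }
    int jdbcType(int index) { return jdbcTypes[index]; }
    int length(int index) { return lengths[index]; }
    boolean isOptional(int index) { return optional[index]; }
}

final class ColumnarTableDemo {
    public static void main(String[] args) {
        ColumnarTable table = new ColumnarTable(
                new String[] { "id", "name" },
                new int[] { Types.INTEGER, Types.VARCHAR },
                new int[] { 10, 255 },
                new boolean[] { false, true });
        int idx = table.columnIndex("name");
        System.out.println(table.columnName(idx) + " length=" + table.length(idx));
    }
}
```

In this sketch a table with N columns holds four arrays regardless of N, rather than N column objects plus their fields. The linear scan in `columnIndex` trades lookup speed for lower per-table overhead; whether that trade-off is acceptable would need to be measured.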
Access on the Table API would then be by index.
Ideally, column indexes would be obtained once per incoming event, avoiding repeated look-ups.
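The access pattern described above could be sketched roughly as follows. The `IndexedTable` interface, the `ArrayBackedTable` class, and the `EventColumns.describe` helper are hypothetical names introduced for this example and do not reflect the current Debezium `Table` API; the point is only that names are resolved to indexes once, and all further attribute reads go through the index.

```java
import java.sql.Types;
import java.util.ArrayList;
import java.util.List;

// Hypothetical index-based view of a table; not the actual Debezium Table API.
interface IndexedTable {
    int columnIndex(String name);   // returns -1 if the column does not exist
    String columnName(int index);
    int jdbcType(int index);
}

// Minimal array-backed implementation, just enough to exercise the interface.
final class ArrayBackedTable implements IndexedTable {
    private final String[] names;
    private final int[] jdbcTypes;

    ArrayBackedTable(String[] names, int[] jdbcTypes) {
        if (names.length != jdbcTypes.length) {
            throw new IllegalArgumentException("attribute arrays must have equal length");
        }
        this.names = names.clone();
        this.jdbcTypes = jdbcTypes.clone();
    }

    @Override
    public int columnIndex(String name) {
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(name)) {
                return i;
            }
        }
        return -1;
    }

    @Override
    public String columnName(int index) { return names[index]; }

    @Override
    public int jdbcType(int index) { return jdbcTypes[index]; }
}

final class EventColumns {
    // Resolves every requested column name to its index exactly once,
    // then reads all attributes through the index.
    static List<String> describe(IndexedTable table, String... columnNames) {
        int[] indexes = new int[columnNames.length];
        for (int i = 0; i < columnNames.length; i++) {
            indexes[i] = table.columnIndex(columnNames[i]);
            if (indexes[i] < 0) {
                throw new IllegalArgumentException("unknown column: " + columnNames[i]);
            }
        }
        List<String> result = new ArrayList<>();
        for (int index : indexes) {
            result.add(table.columnName(index) + ":" + table.jdbcType(index));
        }
        return result;
    }
}
```

A caller handling one change event would resolve the indexes at the start and reuse the resulting `int[]` for every value it reads from that event, for example for both the before and after images of a row.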
The approach could be validated via JfrUnit by comparing and asserting the object allocations for a given large model. This would also guard against future regressions that increase memory usage again.