- Feature Request
- Resolution: Unresolved
- Major
The existing re-select post-processor can help with backfilling missing TOAST column values, but it has some critical shortcomings:
- It is prone to data races: if a TOAST column is updated between the time a change event without the TOAST value is created and the time the re-select query is issued, the newer (and therefore wrong) value is added to the enriched event
- It can put a high load on the source database if there are many updates to rows with unchanged TOAST columns, particularly because the SELECT isn't batched, i.e. every re-select is a single-key look-up
It would therefore be interesting to explore an alternative post-processor which manages the values to be backfilled in a local state store, for instance using RocksDB or SlateDB. This would ensure that the correct value is always backfilled (the previous value of that column for the given row), and it would avoid any performance impact on the source database. The trade-off is that this implementation has its own persistent state, which requires management. A remote store like SlateDB would avoid keeping any state locally.
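
To make the idea more concrete, here is a minimal sketch of the backfill logic. It is only an illustration under several assumptions: the class `ToastBackfillSketch` and its methods `process` and `delete` are hypothetical and not part of Debezium's post-processor API, a plain `HashMap` stands in for the persistent store (RocksDB, SlateDB, or similar), and the placeholder string is assumed to be the connector's default for unchanged TOAST columns (it is configurable).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a state-store-backed TOAST backfill; not Debezium API. */
class ToastBackfillSketch {

    // Assumed default placeholder emitted for unchanged TOAST columns (configurable).
    static final String UNAVAILABLE = "__debezium_unavailable_value";

    // Stand-in for a persistent key-value store such as RocksDB or SlateDB.
    private final Map<String, Object> store = new HashMap<>();

    // Names of the columns that may be TOASTed and therefore need tracking.
    private final Set<String> toastColumns;

    ToastBackfillSketch(Set<String> toastColumns) {
        this.toastColumns = toastColumns;
    }

    private static String stateKey(String table, Object key, String column) {
        return table + "|" + key + "|" + column;
    }

    /**
     * Handles the "after" image of an insert or update event, in stream order.
     * Real values are remembered; placeholders are replaced by the last value seen.
     */
    Map<String, Object> process(String table, Object key, Map<String, Object> after) {
        Map<String, Object> enriched = new HashMap<>(after);
        for (String column : toastColumns) {
            if (!after.containsKey(column)) {
                continue;
            }
            String stateKey = stateKey(table, key, column);
            Object value = after.get(column);
            if (UNAVAILABLE.equals(value)) {
                // Backfill only if an earlier event for this row carried the value;
                // otherwise the placeholder is left untouched.
                if (store.containsKey(stateKey)) {
                    enriched.put(column, store.get(stateKey));
                }
            } else {
                store.put(stateKey, value);
            }
        }
        return enriched;
    }

    /** Drops the tracked values of a row once its delete event has been seen. */
    void delete(String table, Object key) {
        for (String column : toastColumns) {
            store.remove(stateKey(table, key, column));
        }
    }
}
```

Because events are processed in log order, the value stored for a row is always the one that was current when the event with the missing value was created, which is what avoids the race of the re-select approach. The sketch leaves the placeholder in place when no earlier value is known, for example when the connector started without a snapshot; how a real implementation should handle that case is an open question.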