[DBZ-1541] Explore SMT for externalizing large column values

Type: Task
Resolution: Unresolved
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: core-library
Labels:
- easy-starter

This came up in a Twitter discussion: it could be useful to have an SMT which externalizes large BLOB/CLOB column values and propagates the external reference in data change events. The motivation is to avoid large messages in Apache Kafka.

One particular implementation could be based on Amazon S3: when creating a change event, the values of any configured large columns would be written to the S3 object storage, and in change events the corresponding field value would describe bucket name and object id. The object id should be based on the offset of the change event (and the column name, and optionally a before/after flag), so that the same id would be used when processing the same offset a second time after a connector restart.
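
A minimal sketch of how such a restart-stable object id could be derived. It assumes the source offset is available as a map (in a Kafka Connect SMT this would typically be `SourceRecord.sourceOffset()`) and that the offset identifies the change event uniquely. The class name `ObjectKeys`, the method names, and the key layout are hypothetical and not part of any existing Debezium API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: derives a deterministic object id for an externalized column value.
final class ObjectKeys {

    private ObjectKeys() {
    }

    /**
     * Builds "<sha256 of canonical offset>/<column>/<before|after>".
     * The same offset, column and flag always produce the same id, so
     * re-processing an offset after a restart overwrites the same object.
     */
    static String objectKey(Map<String, ?> sourceOffset, String column, boolean before) {
        // Sort the offset entries by key so map iteration order cannot change the result.
        StringBuilder canonical = new StringBuilder();
        for (Map.Entry<String, Object> e : new TreeMap<String, Object>(sourceOffset).entrySet()) {
            canonical.append(e.getKey()).append('=').append(e.getValue()).append(';');
        }
        return sha256Hex(canonical.toString()) + "/" + column + "/" + (before ? "before" : "after");
    }

    // Hex-encoded SHA-256 digest of the UTF-8 bytes of the input.
    static String sha256Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(input.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```

Hashing the offset keeps the id short and free of characters that are awkward in object keys. If an offset can be shared by several events (an assumption to verify per connector), the key would need an additional discriminator.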

Consumers could either resolve the reference, retrieve the referenced object, and persist it in a sink datastore, or, probably more commonly, simply persist the object reference and leave object retrieval to readers of that sink datastore.
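
The two consumer-side options can be sketched as follows. All names here (`ExternalReferences`, `Ref`, `ObjectStore`, `InMemoryStore`) are hypothetical; the in-memory store only stands in for a real object store client, and the `s3://bucket/id` string is one assumed way to persist a reference in a sink.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical consumer-side helpers for handling externalized column values.
final class ExternalReferences {

    private ExternalReferences() {
    }

    // Assumed shape of the reference carried in the change event field.
    static final class Ref {
        final String bucket;
        final String objectId;

        Ref(String bucket, String objectId) {
            this.bucket = bucket;
            this.objectId = objectId;
        }
    }

    // Minimal abstraction over an object store; a real one would wrap an S3 client.
    interface ObjectStore {
        Optional<byte[]> get(Ref ref);
    }

    // In-memory stand-in, useful only for local experiments.
    static final class InMemoryStore implements ObjectStore {
        private final Map<String, byte[]> objects = new HashMap<>();

        void put(Ref ref, byte[] value) {
            objects.put(ref.bucket + "/" + ref.objectId, value);
        }

        @Override
        public Optional<byte[]> get(Ref ref) {
            return Optional.ofNullable(objects.get(ref.bucket + "/" + ref.objectId));
        }
    }

    // Option 1: resolve the reference so the sink can store the actual payload.
    static byte[] resolve(ObjectStore store, Ref ref) {
        return store.get(ref).orElseThrow(() ->
                new IllegalStateException("Missing object " + ref.bucket + "/" + ref.objectId));
    }

    // Option 2: store only a compact reference and leave retrieval to readers of the sink.
    static String asSinkValue(Ref ref) {
        return "s3://" + ref.bucket + "/" + ref.objectId;
    }
}
```

Keeping the store behind a small interface would also leave room for other object stores later without changing consumer code.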

Eventually, multiple object stores may be supported, but S3 will be a good candidate for an initial PoC.

Gunnar Morling added a comment - 2021/03/01 5:25 AM

anmohant, should you be interested in picking up this one, a good way for local-only testing would be this S3 mock library: https://github.com/adobe/S3Mock.

Assignee: Unassigned
Reporter: Gunnar Morling
Votes: 0
Watchers: 5
Created: 2019/10/09 3:36 AM
Updated: 2021/03/01 5:25 AM
