-
Enhancement
-
Resolution: Done
-
Major
Currently, ExtractChangedRecordState is applied only to events that contain both a before and an after field, which in practice means only updates.
On the surface this does not seem like an issue, but depending on which other transforms it is paired with in the pipeline, it can lead to inconsistent expectations. For example, given the following transform setup:
"transforms.changes.type": "io.debezium.transforms.ExtractChangedRecordState",
"transforms.changes.header.changed.name": "Changed",
"transforms.moveHeadersToValue.type": "io.debezium.transforms.HeaderToValue",
"transforms.moveHeadersToValue.headers": "Changed",
"transforms.moveHeadersToValue.fields": "changes",
"transforms.moveHeadersToValue.operation": "move",
"transforms.unwrap.type": "io.debezium.transforms.ExtractNewRecordState",
"transforms.unwrap.add.fields": "source.table:META_SRC_TABLENAME,source.ts_ms:META_SRC_TS_MS,ts_ms:META_TS_MS,ts_us:META_TS_US,op:META_SRC_OP,source.scn:META_SRC_SCN,source.snapshot:META_SRC_SNAPSHOT,source.user_name:META_SRC_USER,changes:META_SRC_CHANGES",
When a snapshot event is first observed, it travels through the transform chain and the resulting event contains no schema field for __META_SRC_CHANGES. This is because ExtractChangedRecordState leaves a snapshot event untouched and adds no headers. That creates a ripple effect: ExtractNewRecordState does not add the defined field, because the schema the mapping is sourced from does not have that field.
On an update, the Changed header is added even if the update changes no values. The header then propagates through the transform chain, so the __META_SRC_CHANGES field is present in the schema, even if its value is omitted because it is empty or null.
This creates an inconsistency in the Schema associated with the final event that we likely shouldn't observe in practice.
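The divergence described above can be reproduced with a small stand-alone model. This is only a sketch: it does not use the Debezium or Kafka Connect APIs, the class and method names (ChangedHeaderModel, currentHeaders, finalSchemaFields) are hypothetical, and the downstream HeaderToValue and ExtractNewRecordState steps are collapsed into a single assumption that the metadata field appears only when the Changed header exists.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Simplified model, NOT the Debezium API: rows are plain maps, headers are a map of name -> field list.
class ChangedHeaderModel {

    // Models the current behavior as described in this issue: a header is
    // produced only when both before and after are present (an update).
    static Map<String, List<String>> currentHeaders(Map<String, Object> before,
                                                    Map<String, Object> after) {
        Map<String, List<String>> headers = new LinkedHashMap<>();
        if (before == null || after == null) {
            return headers; // snapshot, create, delete: event left untouched
        }
        List<String> changed = new ArrayList<>();
        for (Map.Entry<String, Object> e : after.entrySet()) {
            if (!Objects.equals(e.getValue(), before.get(e.getKey()))) {
                changed.add(e.getKey());
            }
        }
        headers.put("Changed", changed); // added even when the list is empty
        return headers;
    }

    // Assumption: downstream transforms add __META_SRC_CHANGES to the final
    // schema only if the Changed header was present on the record.
    static List<String> finalSchemaFields(Map<String, Object> after,
                                          Map<String, List<String>> headers) {
        List<String> fields = new ArrayList<>(after.keySet());
        if (headers.containsKey("Changed")) {
            fields.add("__META_SRC_CHANGES");
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "a");
        // Snapshot: no before image, so no header and no metadata field.
        System.out.println(finalSchemaFields(row, currentHeaders(null, row)));
        // No-op update: header present (empty), so the metadata field appears.
        System.out.println(finalSchemaFields(row, currentHeaders(row, row)));
    }
}
```

In this model the two calls yield different field lists for the same row, which is the schema inconsistency reported here.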
I propose that ExtractChangedRecordState always add every configured header, using an empty array when there are no changed or unchanged fields to report. This keeps its behavior consistent across all operation types.
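A minimal sketch of the proposed behavior, again as a stand-alone model rather than the actual transform code: the class name ProposedChangedHeader and the method changedFields are hypothetical, and only the changed-fields header is modeled. The point is that the method always returns a list, so a caller can attach the header unconditionally.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Simplified model, NOT the Debezium API.
class ProposedChangedHeader {

    // Always returns a list (never null). It is empty when there is no
    // before/after pair to compare (snapshot, create, delete) or when an
    // update changed no values.
    static List<String> changedFields(Map<String, Object> before,
                                      Map<String, Object> after) {
        List<String> changed = new ArrayList<>();
        if (before != null && after != null) {
            for (Map.Entry<String, Object> e : after.entrySet()) {
                if (!Objects.equals(e.getValue(), before.get(e.getKey()))) {
                    changed.add(e.getKey());
                }
            }
        }
        return changed;
    }
}
```

With this shape, a snapshot event and a no-op update both carry an empty array, so under the assumptions above the header, and therefore the downstream schema field, would be present for every operation type.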
- links to
-
RHEA-2025:154266 Red Hat build of Debezium 3.2.4 release