-
Bug
-
Resolution: Done
-
Major
-
None
-
None
-
False
-
False
-
Undefined
-
I'm doing some testing to see how can we can aggregate some information from different topics. For experimenting with this, I have created a connector which turns transaction metadata on and also started listening to the transaction topic.
To test, I run the commands sequence on sqlplus for the demo database:
SQL> INSERT INTO customers VALUES(NULL, 'Alberto', 'Pena', 'berto@gmail.com'); 1 row created. SQL> INSERT INTO orders VALUES(null, '10-FEB-21', 1003, 5, 102); 1 row created. SQL> COMMIT; Commit complete.
As mentioned, I have debezium set up and I do also have a command line kafka consumer to debug the data that goes through the different topics. This is what I do get immediately after running the above sql script ( I am omitting schemas for the sake of verbosity ):
transaction topic
{"payload":{"status":"END","id":"0500190077020000","event_count":1,"data_collections":[{"data_collection":"ORCLPDB1.DEBEZIUM.CUSTOMERS","event_count":1}]} {"payload":{"status":"BEGIN","id":"04001E0088020000","event_count":null,"data_collections":null}
A couple of things to note here:
- Note that the messages arrive in exactly that order, i.e. END arrives first and then BEGIN. I'm not sure if that is how it is expected to be. Documentation certainly shows first BEGIN and finally END.
- Strangely, the id is different. The id of the BEGIN event matches the tx ids that I see on the table topics. However, the id of the END event, I have no idea where that comes from.
- event_count is 1 on the END event, but that is probably related to the fact that the event seems to come from a totally unrelated transaction.
customers topic
{"payload": {"before":null,"after":{"ID":1051,"FIRST_NAME":"Alberto","LAST_NAME":"Pena","EMAIL":"berto@gmail.com"},"source":{"version":"1.4.1.Final","connector":"oracle","name":"dbzserver","ts_ms":1612970195000,"snapshot":"false","db":"ORCLPDB1","schema":"DEBEZIUM","table":"CUSTOMERS","txId":"04001E0088020000","scn":2862750,"commit_scn":2858419,"lcr_position":null},"op":"c","ts_ms":1612970199318,"transaction":{"id":"04001E0088020000","total_order":1,"data_collection_order":1}}}
orders topic
{"payload": {"before":null,"after":{"ID":10024,"ORDER_DATE":1612915200000,"PURCHASER":1003,"QUANTITY":5,"PRODUCT_ID":102},"source":{"version":"1.4.1.Final","connector":"oracle","name":"dbzserver","ts_ms":1612970195000,"snapshot":"false","db":"ORCLPDB1","schema":"DEBEZIUM","table":"ORDERS","txId":"04001E0088020000","scn":2862857,"commit_scn":2862869,"lcr_position":null},"op":"c","ts_ms":1612970199321,"transaction":{"id":"04001E0088020000","total_order":2,"data_collection_order":1}}}
In the individual transactions, total_order looks good, data_collection_order I'm not sure as the documentation states: "data_collection_order - the per-data collection position of the event among all events that were emitted by the transaction". Note that in both the tx id is 04001E0088020000 which matches the BEGIN event in the transaction topic.