-
Bug
-
Resolution: Done
-
Major
-
None
-
None
What Debezium connector do you use and what version?
Debezium version: 2.4.0-SNAPSHOT (c748fc494b4d1bd6acaf4fe2d7d885cb5562d2f3)
Using debezium-server: 2.4.0-SNAPSHOT (e9365d39e76c45f1c42855675c72aab8b0a7ee1d)
What is the connector configuration?
debezium.source.topic.prefix=yb-voyager debezium.source.database.server.name=yb-voyager debezium.source.database.url=jdbc:oracle:thin:@(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=127.0.0.1)(PORT=1521))(CONNECT_DATA=(SERVICE_NAME=ORCLCDB))) debezium.source.connector.class=io.debezium.connector.oracle.OracleConnector debezium.source.database.dbname=PLACEHOLDER debezium.source.schema.include.list=TEST_SCHEMA debezium.source.hstore.handling.mode=map debezium.source.database.history=io.debezium.relational.history.FileDatabaseHistory debezium.source.database.history.file.filename=data/history.dat debezium.source.schema.history.internal=io.debezium.storage.file.history.FileSchemaHistory debezium.source.schema.history.internal.file.filename=data/schema_history.json debezium.source.schema.history.internal.skip.unparseable.ddl=true debezium.source.schema.history.internal.store.only.captured.tables.ddl=true debezium.source.schema.history.internal.store.only.captured.databases.ddl=true debezium.source.include.schema.changes=false debezium.source.log.mining.batch.size.min=10000 debezium.source.log.mining.batch.size.max=100000 debezium.source.log.mining.batch.size.default=10000 debezium.source.log.mining.query.filter.mode=in debezium.source.log.mining.sleep.time.default.ms=200 debezium.source.log.mining.sleep.time.max.ms=400 debezium.source.max.batch.size=10000 debezium.source.max.queue.size=50000 debezium.source.query.fetch.size=10000 debezium.sink.type=http debezium.sink.http.url=https://rbaskets.in/web/dbzm-oracle-reproducer debezium.source.internal.log.mining.transaction.snapshot.boundary.mode=all debezium.source.database.pdb.name=ORCLPDB1 debezium.source.column.include.list=TEST_SCHEMA.TEST_NCHAR.*
What is the captured database version and mode of depoyment?
on-prem oracle CDB (in docker)
What behaviour do you expect?
values don't have extra single quotes that were introduced for escaping
What behaviour do you see?
values have extra single quotes
Do you see the same behaviour using the latest relesead Debezium version?
yes
Do you have the connector logs, ideally from start till finish?
How to reproduce the issue using our tutorial deployment?
Table at snapshot:
Columns NAME DATA TYPE NULL DEFAULT COMMENTS *ID NUMBER(38,0) No VAL NCHAR(10) Yes SQL> select * from test_nchar; ID VAL _____ _____________ 1 a'b 2 a'b\c
Table after streaming:
SQL> select * from test_nchar; ID VAL _____ _____________ 1 a'b 2 a'b\c 3 a'b 4 a'b\c
event for id=2(snapshot):
"payload": { "before": null, "after": { "ID": "Ag==", "VAL": "a'b\\c " }, "source": { "version": "2.4.0-SNAPSHOT", "connector": "oracle", "name": "yb-voyager", "ts_ms": 1695374963000, "snapshot": "last", "db": "ORCLPDB1", "sequence": null, "schema": "TEST_SCHEMA", "table": "TEST_NCHAR", "txId": null, "scn": "16477528", "commit_scn": null, "lcr_position": null, "rs_id": null, "ssn": 0, "redo_thread": null, "user_name": null }, "op": "r", "ts_ms": 1695374973278, "transaction": null }
event for id=4(streaming):
"payload": { "before": null, "after": { "ID": "BA==", "VAL": "a''b\\c " }, "source": { "version": "2.4.0-SNAPSHOT", "connector": "oracle", "name": "yb-voyager", "ts_ms": 1695375002000, "snapshot": "false", "db": "ORCLPDB1", "sequence": null, "schema": "TEST_SCHEMA", "table": "TEST_NCHAR", "txId": "03000800ac040000", "scn": "16510785", "commit_scn": "16510789", "lcr_position": null, "rs_id": "0x0001a0.00043126.01b8", "ssn": 0, "redo_thread": 1, "user_name": "TEST_SCHEMA" }, "op": "c", "ts_ms": 1695375024365, "transaction": null }
As you can see, the VAL column has an extra quote in the event for id=4(streaming), i.e. the single quote is escaped. the web link for the requests payload - https://rbaskets.in/web/dbzm-oracle-reproducer (token: G-vV0mTOx2s0XfRUcUWLIAEnq7nt-Uatqh8KD0zqIoZ5)
Did a little debugging; I believe the issue is because oracle produces the dml statement with the function UNISTR(val) for NCHAR columns, it ends up in this block in the LogminerDmlParser, which basically doesn't use the collectedValue which is unescaped, but instead uses it as-is.
- links to
-
RHEA-2024:129636 Red Hat build of Debezium 2.5.4 release