-
Bug
-
Resolution: Done
-
Major
-
2.6.0.Alpha2
-
None
Bug report
What Debezium connector do you use and what version?
Oracle 2.6.0.alpha2
What is the connector configuration?
{ "connector.class": "io.debezium.connector.oracle.OracleConnector", "tasks.max": "1", "log.mining.strategy": "online_catalog", "schema.history.internal.store.only.captured.databases.ddl": "true", "schema.history.internal.store.only.captured.tables.ddl": "true", "topic.prefix":***, "decimal.handling.mode": "double", "schema.history.internal.kafka.topic":***, "lob.enabled": "true", "value.converter": "io.confluent.connect.avro.AvroConverter", "key.converter": "io.confluent.connect.avro.AvroConverter", "database.user":***, "database.dbname":***, "database.connection.adapter": "logminer", "time.precision.mode": "connect", "database.server.name":***, "schema.history.internal.kafka.bootstrap.servers":***, "database.port": "1521", "value.converter.schema.registry.url": ***, "database.hostname":***, "database.password":***, "name":***, "table.include.list": "<some tables>", "key.converter.schema.registry.url": *** }
What is the captured database version and mode of depoyment?
Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production
Version 19.19.0.0.0
What behaviour do you expect?
When lob.enabled=true is set in the Debezium Oracle Connector configuration, the connector is expected to correctly handle and retrieve XML data stored in XMLTYPE columns without any issues.
What behaviour do you see?
With `lob.enabled=true`, the Debezium Oracle Connector fails with a `io.debezium.text.ParsingException` when attempting to process some records that contain XMLTYPE data. This issue prevents the correct retrieval and propagation of XML data, leading to failures in data synchronization and impacting the reliability of the data pipeline.
Do you see the same behaviour using the latest relesead Debezium version?
Yes
Do you have the connector logs, ideally from start till finish?
(You might be asked later to provide DEBUG/TRACE level log)
Cannot provide full logs due to data privacy, here is the relevant snippet:
[2024-02-14 14:17:55,411] ERROR [oracle-cdc-source-slst-stage-011|task-0] Producer failure (io.debezium.pipeline.ErrorHandler:52) io.debezium.text.ParsingException: Parsing failed for SQL: 'XML DOC BEGIN: select "PROPERTIES" from "SCHEMA"."TABLE" where "COLUMN_A" = '314107' and "COLUMN_B" = '69265' and "COLUMN_D" = '74' and "TIME_A" = TO_TIMESTAMP_TZ('2024-02-14 10:58:02.202590 +01:00') and "TIME_B" = TO_TIMESTAMP_TZ('3000-01-01 00:00:00.000000 +00:00') and "MODIFICATIONTIME" IS NULL' at io.debezium.connector.oracle.logminer.parser.PreambleSingleColumnReconstructedSelectParser.parse(PreambleSingleColumnReconstructedSelectParser.java:79) at io.debezium.connector.oracle.logminer.processor.AbstractLogMinerEventProcessor.lambda$handleXmlBegin$7(AbstractLogMinerEventProcessor.java:908) at io.debezium.connector.oracle.logminer.processor.memory.MemoryLogMinerEventProcessor.addToTransaction(MemoryLogMinerEventProcessor.java:205) at io.debezium.connector.oracle.logminer.processor.AbstractLogMinerEventProcessor.handleXmlBegin(AbstractLogMinerEventProcessor.java:905) at io.debezium.connector.oracle.logminer.processor.AbstractLogMinerEventProcessor.processRow(AbstractLogMinerEventProcessor.java:372) at io.debezium.connector.oracle.logminer.processor.AbstractLogMinerEventProcessor.processResults(AbstractLogMinerEventProcessor.java:306) at io.debezium.connector.oracle.logminer.processor.AbstractLogMinerEventProcessor.process(AbstractLogMinerEventProcessor.java:234) at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:248) at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:62) at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:272) at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:195) at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:138) at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) at java.base/java.lang.Thread.run(Thread.java:829) Caused by: io.debezium.text.ParsingException: Failed to parse operator at index 310: XML DOC BEGIN: select "PROPERTIES" from "SCHEMA"."TABLE" where "COLUMN_A" = '314107' and "COLUMN_B" = '69265' and "COLUMN_D" = '74' and "TIME_A" = TO_TIMESTAMP_TZ('2024-02-14 10:58:02.202590 +01:00') and "TIME_B" = TO_TIMESTAMP_TZ('3000-01-01 00:00:00.000000 +00:00') and "MODIFICATIONTIME" IS NULL at io.debezium.connector.oracle.logminer.parser.PreambleSingleColumnReconstructedSelectParser.parseOperator(PreambleSingleColumnReconstructedSelectParser.java:205) at io.debezium.connector.oracle.logminer.parser.PreambleSingleColumnReconstructedSelectParser.parseWhereClause(PreambleSingleColumnReconstructedSelectParser.java:158) at io.debezium.connector.oracle.logminer.parser.PreambleSingleColumnReconstructedSelectParser.parse(PreambleSingleColumnReconstructedSelectParser.java:73)
Comments
Compiling and running the `parseWhereClause` logic from `PreambleSingleColumnReconstructedSelectParser.parse`, I could determine that this error gets thrown when the `where` clause ends with `IS NULL`, with no trailing whitespace. In this cases, the column value for the filter in question is parsed as `NUL` and not `NULL`.
It does NOT occur if:
- the where clause ends with a different filter e.g.
where "COLUMN_A" = '314107' and "COLUMN_B" = '69265' and "COLUMN_D" = '74' and "MODIFICATIONTIME" IS NULL and "TIME_A" = TO_TIMESTAMP_TZ('2024-02-14 10:58:02.202590 +01:00') and "TIME_B" = TO_TIMESTAMP_TZ('3000-01-01 00:00:00.000000 +00:00')
- or if the query ends with a single space
where "COLUMN_A" = '314107' and "COLUMN_B" = '69265' and "COLUMN_D" = '74' and "TIME_A" = TO_TIMESTAMP_TZ('2024-02-14 10:58:02.202590 +01:00') and "TIME_B" = TO_TIMESTAMP_TZ('3000-01-01 00:00:00.000000 +00:00') and "MODIFICATIONTIME" IS NULL
in these cases, the column value for the `MODIFICATIONTIME` filter is correctly parsed as `NULL` and the operator parsing exception is not thrown
- causes
-
DBZ-7598 Serialization of XML columns with NULL values fails using Infinispan Buffer
- Closed
- links to
-
RHEA-2024:129636 Red Hat build of Debezium 2.5.4 release