Uploaded image for project: 'Debezium'
  1. Debezium
  2. DBZ-7489

Debezium Oracle Connector ParsingException on XMLTYPE with lob.enabled=true

XMLWordPrintable

    • Important

      Bug report

      What Debezium connector do you use and what version?

      Oracle 2.6.0.alpha2

      What is the connector configuration?

      {
          "connector.class": "io.debezium.connector.oracle.OracleConnector",
          "tasks.max": "1",
          "log.mining.strategy": "online_catalog",
          "schema.history.internal.store.only.captured.databases.ddl": "true",
          "schema.history.internal.store.only.captured.tables.ddl": "true",
          "topic.prefix":***,
          "decimal.handling.mode": "double",
          "schema.history.internal.kafka.topic":***,
          "lob.enabled": "true",
          "value.converter": "io.confluent.connect.avro.AvroConverter",
          "key.converter": "io.confluent.connect.avro.AvroConverter",
          "database.user":***,
          "database.dbname":***,
          "database.connection.adapter": "logminer",
          "time.precision.mode": "connect",
          "database.server.name":***,
          "schema.history.internal.kafka.bootstrap.servers":***,
          "database.port": "1521",
          "value.converter.schema.registry.url": ***,
          "database.hostname":***,
          "database.password":***,
          "name":***,
          "table.include.list": "<some tables>",
          "key.converter.schema.registry.url":  ***
      }

      What is the captured database version and mode of depoyment?

      Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production
      Version 19.19.0.0.0

      What behaviour do you expect?

      When lob.enabled=true is set in the Debezium Oracle Connector configuration, the connector is expected to correctly handle and retrieve XML data stored in XMLTYPE columns without any issues.

      What behaviour do you see?

      With `lob.enabled=true`, the Debezium Oracle Connector fails with a `io.debezium.text.ParsingException` when attempting to process some records that contain XMLTYPE data. This issue prevents the correct retrieval and propagation of XML data, leading to failures in data synchronization and impacting the reliability of the data pipeline.

      Do you see the same behaviour using the latest relesead Debezium version?

      Yes

      Do you have the connector logs, ideally from start till finish?

      (You might be asked later to provide DEBUG/TRACE level log)

      Cannot provide full logs due to data privacy, here is the relevant snippet:
       

      [2024-02-14 14:17:55,411] ERROR [oracle-cdc-source-slst-stage-011|task-0] Producer failure (io.debezium.pipeline.ErrorHandler:52)
      io.debezium.text.ParsingException: Parsing failed for SQL: 'XML DOC BEGIN:  select "PROPERTIES" from "SCHEMA"."TABLE" where "COLUMN_A" = '314107' and "COLUMN_B" = '69265' and "COLUMN_D" = '74' and "TIME_A" = TO_TIMESTAMP_TZ('2024-02-14 10:58:02.202590 +01:00') and "TIME_B" = TO_TIMESTAMP_TZ('3000-01-01 00:00:00.000000 +00:00') and "MODIFICATIONTIME" IS NULL'
      at io.debezium.connector.oracle.logminer.parser.PreambleSingleColumnReconstructedSelectParser.parse(PreambleSingleColumnReconstructedSelectParser.java:79)
      at io.debezium.connector.oracle.logminer.processor.AbstractLogMinerEventProcessor.lambda$handleXmlBegin$7(AbstractLogMinerEventProcessor.java:908)
      at io.debezium.connector.oracle.logminer.processor.memory.MemoryLogMinerEventProcessor.addToTransaction(MemoryLogMinerEventProcessor.java:205)
      at io.debezium.connector.oracle.logminer.processor.AbstractLogMinerEventProcessor.handleXmlBegin(AbstractLogMinerEventProcessor.java:905)
      at io.debezium.connector.oracle.logminer.processor.AbstractLogMinerEventProcessor.processRow(AbstractLogMinerEventProcessor.java:372)
      at io.debezium.connector.oracle.logminer.processor.AbstractLogMinerEventProcessor.processResults(AbstractLogMinerEventProcessor.java:306)
      at io.debezium.connector.oracle.logminer.processor.AbstractLogMinerEventProcessor.process(AbstractLogMinerEventProcessor.java:234)
      at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:248)
      at io.debezium.connector.oracle.logminer.LogMinerStreamingChangeEventSource.execute(LogMinerStreamingChangeEventSource.java:62)
      at io.debezium.pipeline.ChangeEventSourceCoordinator.streamEvents(ChangeEventSourceCoordinator.java:272)
      at io.debezium.pipeline.ChangeEventSourceCoordinator.executeChangeEventSources(ChangeEventSourceCoordinator.java:195)
      at io.debezium.pipeline.ChangeEventSourceCoordinator.lambda$start$0(ChangeEventSourceCoordinator.java:138)
      at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
      at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
      at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
      at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
      at java.base/java.lang.Thread.run(Thread.java:829)
      Caused by: io.debezium.text.ParsingException: Failed to parse operator at index 310: XML DOC BEGIN:  select "PROPERTIES" from "SCHEMA"."TABLE" where "COLUMN_A" = '314107' and "COLUMN_B" = '69265' and "COLUMN_D" = '74' and "TIME_A" = TO_TIMESTAMP_TZ('2024-02-14 10:58:02.202590 +01:00') and "TIME_B" = TO_TIMESTAMP_TZ('3000-01-01 00:00:00.000000 +00:00') and "MODIFICATIONTIME" IS NULL
      at io.debezium.connector.oracle.logminer.parser.PreambleSingleColumnReconstructedSelectParser.parseOperator(PreambleSingleColumnReconstructedSelectParser.java:205)
      at io.debezium.connector.oracle.logminer.parser.PreambleSingleColumnReconstructedSelectParser.parseWhereClause(PreambleSingleColumnReconstructedSelectParser.java:158)
      at io.debezium.connector.oracle.logminer.parser.PreambleSingleColumnReconstructedSelectParser.parse(PreambleSingleColumnReconstructedSelectParser.java:73)
       

       

      Comments

      Compiling and running the `parseWhereClause` logic from `PreambleSingleColumnReconstructedSelectParser.parse`, I could determine that this error gets thrown when the `where` clause ends with `IS NULL`, with no trailing whitespace. In this cases, the column value for the filter in question is parsed as `NUL` and not `NULL`.
       
      It does NOT occur if:

      •  the where clause ends with a different filter e.g. 
      where "COLUMN_A" = '314107' and "COLUMN_B" = '69265' and "COLUMN_D" = '74' and "MODIFICATIONTIME" IS NULL and "TIME_A" = TO_TIMESTAMP_TZ('2024-02-14 10:58:02.202590 +01:00') and "TIME_B" = TO_TIMESTAMP_TZ('3000-01-01 00:00:00.000000 +00:00')
      • or if the query ends with a single space 
      where "COLUMN_A" = '314107' and "COLUMN_B" = '69265' and "COLUMN_D" = '74' and "TIME_A" = TO_TIMESTAMP_TZ('2024-02-14 10:58:02.202590 +01:00') and "TIME_B" = TO_TIMESTAMP_TZ('3000-01-01 00:00:00.000000 +00:00') and "MODIFICATIONTIME" IS NULL 

      in these cases, the column value for the `MODIFICATIONTIME` filter is correctly parsed as `NULL` and the operator parsing exception is not thrown

            ccranfor@redhat.com Chris Cranford
            naivedigit Theodore Evans
            Votes:
            0 Vote for this issue
            Watchers:
            4 Start watching this issue

              Created:
              Updated:
              Resolved: