DBZ-9392: Oracle connector does not support large CLOB and BLOB values


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version: 3.3.0.Beta1
    • Affects Version: 3.2.0.Final
    • Component: oracle-connector
    • Labels: None

      Bug report

      When trying to capture a large CLOB (and, by association, BLOB) value (2 GB in my case) with the Oracle connector, Debezium fails almost immediately with an I/O exception raised by the Oracle JDBC driver.

      A 1 GB value is captured normally.

      What Debezium connector do you use and what version?

      Oracle connector 3.2.0.Final

      What is the connector configuration?

      Default

      What is the captured database version and mode of deployment?

      (E.g. on-premises, with a specific cloud provider, etc.)

      RDS Oracle 19.0.0.0.ru-2024-04.rur-2024-04.r1 on db.t3.large node (2 vCPUs, 8 GB RAM)

      Debezium in embedded mode (non-Kafka).

      What behavior do you expect?

      The 2 GB CLOB column value is captured (loaded) into memory.

      What behavior do you see?

      An exception is thrown upon reaching the database row that contains the 2 GB CLOB value. A row with a 1 GB value is captured normally.

      2025-08-26T03:46:27.510Z ERROR [pool-10-thread-1] i.d.u.Loggings: Failed to properly convert data value for 'ORCL.BLOBS.EMPLOYEES_GB.PHOTO' of type CLOB
      io.debezium.DebeziumException: Couldn't convert value for column PHOTO
              at io.debezium.connector.oracle.OracleValueConverters.convertString(OracleValueConverters.java:289)
              at io.debezium.connector.oracle.OracleValueConverters.lambda$converter$0(OracleValueConverters.java:209)
              at io.debezium.relational.TableSchemaBuilder.lambda$createValueGenerator$5(TableSchemaBuilder.java:327)
              at io.debezium.relational.TableSchema.valueFromColumnData(TableSchema.java:142)
              at io.debezium.relational.RelationalChangeRecordEmitter.emitReadRecord(RelationalChangeRecordEmitter.java:87)
              at io.debezium.relational.RelationalChangeRecordEmitter.emitChangeRecords(RelationalChangeRecordEmitter.java:50)
              at io.debezium.pipeline.EventDispatcher.dispatchSnapshotEvent(EventDispatcher.java:227)
              at io.debezium.relational.RelationalSnapshotChangeEventSource.doCreateDataEventsForTable(RelationalSnapshotChangeEventSource.java:655)
              at io.debezium.connector.oracle.OracleSnapshotChangeEventSource.lambda$createDataEventsForTableCallable$1(OracleSnapshotChangeEventSource.java:295)
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
              at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
              at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
              at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
              at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
              at java.base/java.lang.Thread.run(Thread.java:840)
      Caused by: java.sql.SQLRecoverableException: ORA-17008: Closed connection
      https://docs.oracle.com/error-help/db/ora-17008/
              at oracle.jdbc.driver.OracleClob.getDBAccess(OracleClob.java:1799)
              at oracle.jdbc.driver.OracleClob.getSubString(OracleClob.java:465)
              at oracle.sql.CLOB.getSubString(CLOB.java:431)
              at io.debezium.connector.oracle.OracleValueConverters.convertString(OracleValueConverters.java:286)
              ... 14 common frames omitted
      

      Do you see the same behaviour using the latest released Debezium version?

      (Ideally, also verify with latest Alpha/Beta/CR version)

      yes

      Do you have the connector logs, ideally from start till finish?

      (You might be asked later to provide DEBUG/TRACE level log)

      no

      How to reproduce the issue using our tutorial deployment?

      It is as simple as creating a minimal table with two columns, ID and PHOTO (for example), and inserting a single row with 2 GB of data. The last step is the challenging part (it takes approx. 4 hours to generate the data on our RDS instance). Here is the setup SQL script:

      CREATE TABLE EMPLOYEES_GB (ID INT PRIMARY KEY NOT NULL, PHOTO CLOB);
      
      -- Assumes a row with ID = 1 already exists and holds roughly 1 GB of CLOB data.
      DECLARE
          vbuf CLOB;
          cbuf CLOB;
      BEGIN
          dbms_lob.createtemporary(cbuf, FALSE);
          SELECT PHOTO INTO vbuf FROM EMPLOYEES_GB WHERE ID = 1;
      
          -- Append the ~1 GB value twice to build a ~2 GB CLOB.
          FOR i IN 1..2 LOOP
              dbms_lob.append(cbuf, vbuf);
          END LOOP;
      
          INSERT INTO EMPLOYEES_GB (ID, PHOTO) VALUES (2, cbuf);
      END;
      /
      
      COMMIT;
      

      Engineering notes

      Upon closer investigation, I have noticed that the way CLOB/BLOB data is retrieved is flawed: the code that reads the data coerces the long LOB length down to an int:

      for CLOBs:

      Clob clob = (Clob) data;
      // Note that java.sql.Clob specifies that the first character starts at 1
      // and that length must be greater-than or equal to 0. So for an empty
      // clob field, a call to getSubString(1, 0) is perfectly valid.
      return clob.getSubString(1, (int) clob.length());
      

      for BLOBs:

      Blob blob = (Blob) data;
      data = blob.getBytes(1, Long.valueOf(blob.length()).intValue());
      

      Note the coercion, which results in data truncation:

      Long.valueOf(blob.length()).intValue()
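
      To illustrate the failure mode (a standalone snippet, not Debezium code): at exactly 2 GB the LOB length no longer fits into an int, so the narrowing conversion wraps around to a negative value, which Clob.getSubString / Blob.getBytes cannot accept.

      public class LobLengthOverflow {
          public static void main(String[] args) {
              long lobLength = 2L * 1024 * 1024 * 1024; // 2 GB, as in this report
              int truncated = (int) lobLength;          // same narrowing as the connector code above
              System.out.println(lobLength);            // prints 2147483648
              System.out.println(truncated);            // prints -2147483648 (Integer.MIN_VALUE)
          }
      }
      
      Any length above Integer.MAX_VALUE (2^31 - 1) is affected, which matches the observation that 1 GB works while 2 GB fails.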
      

      If a buffered/streamed reading approach is used instead, the data is read normally (see the sketch below).
      Going to raise a PR on GitHub.
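
      For illustration, a minimal sketch of the chunked approach, assuming the value still has to fit in JVM memory; LobChunkedReader and its method names are hypothetical, not the code from the PR:

      import java.io.ByteArrayOutputStream;
      import java.io.InputStream;
      import java.io.Reader;
      import java.sql.Blob;
      import java.sql.Clob;

      public class LobChunkedReader {

          // Reads a CLOB through its character stream in fixed-size chunks,
          // so the long LOB length is never narrowed to an int.
          static String readClobChunked(Clob clob) throws Exception {
              StringBuilder sb = new StringBuilder();
              try (Reader reader = clob.getCharacterStream()) {
                  char[] buffer = new char[8192];
                  int read;
                  while ((read = reader.read(buffer)) != -1) {
                      sb.append(buffer, 0, read);
                  }
              }
              return sb.toString();
          }

          // Same idea for BLOBs, via the binary stream.
          static byte[] readBlobChunked(Blob blob) throws Exception {
              ByteArrayOutputStream out = new ByteArrayOutputStream();
              try (InputStream in = blob.getBinaryStream()) {
                  byte[] buffer = new byte[8192];
                  int read;
                  while ((read = in.read(buffer)) != -1) {
                      out.write(buffer, 0, read);
                  }
              }
              return out.toByteArray();
          }
      }
      
      The stream API never needs the total length as an int, so the loop is driven by end-of-stream rather than by a (possibly truncated) length value. Note that an in-memory String or byte[] is still capped at roughly Integer.MAX_VALUE elements, so truly huge values would ultimately have to be emitted in another form.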

    • Assignee: Unassigned
    • Reporter: Artem Shubovych