Type: Bug
Resolution: Done
Priority: Major
Affects Version: 3.2.0.Final
Bug report
When trying to capture a large CLOB (and, by extension, BLOB) value (2GB in my case) with the Oracle connector, Debezium fails almost immediately with an IO exception raised by the Oracle JDBC driver.
A 1GB value is captured normally.
What Debezium connector do you use and what version?
Oracle connector 3.2.0.Final
What is the connector configuration?
Default
What is the captured database version and mode of deployment?
(E.g. on-premises, with a specific cloud provider, etc.)
RDS Oracle 19.0.0.0.ru-2024-04.rur-2024-04.r1 on db.t3.large node (2 vCPUs, 8 GB RAM)
Debezium in embedded mode (non-Kafka).
What behavior do you expect?
2GB CLOB column value is captured (loaded) into memory.
What behavior do you see?
An exception is thrown upon reaching the database row containing the 2GB CLOB value. A row with a 1GB value is captured normally.
2025-08-26T03:46:27.510Z ERROR [pool-10-thread-1] i.d.u.Loggings: Failed to properly convert data value for 'ORCL.BLOBS.EMPLOYEES_GB.PHOTO' of type CLOB
io.debezium.DebeziumException: Couldn't convert value for column PHOTO
    at io.debezium.connector.oracle.OracleValueConverters.convertString(OracleValueConverters.java:289)
    at io.debezium.connector.oracle.OracleValueConverters.lambda$converter$0(OracleValueConverters.java:209)
    at io.debezium.relational.TableSchemaBuilder.lambda$createValueGenerator$5(TableSchemaBuilder.java:327)
    at io.debezium.relational.TableSchema.valueFromColumnData(TableSchema.java:142)
    at io.debezium.relational.RelationalChangeRecordEmitter.emitReadRecord(RelationalChangeRecordEmitter.java:87)
    at io.debezium.relational.RelationalChangeRecordEmitter.emitChangeRecords(RelationalChangeRecordEmitter.java:50)
    at io.debezium.pipeline.EventDispatcher.dispatchSnapshotEvent(EventDispatcher.java:227)
    at io.debezium.relational.RelationalSnapshotChangeEventSource.doCreateDataEventsForTable(RelationalSnapshotChangeEventSource.java:655)
    at io.debezium.connector.oracle.OracleSnapshotChangeEventSource.lambda$createDataEventsForTableCallable$1(OracleSnapshotChangeEventSource.java:295)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
    at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
    at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: java.sql.SQLRecoverableException: ORA-17008: Closed connection
https://docs.oracle.com/error-help/db/ora-17008/
    at oracle.jdbc.driver.OracleClob.getDBAccess(OracleClob.java:1799)
    at oracle.jdbc.driver.OracleClob.getSubString(OracleClob.java:465)
    at oracle.sql.CLOB.getSubString(CLOB.java:431)
    at io.debezium.connector.oracle.OracleValueConverters.convertString(OracleValueConverters.java:286)
    ... 14 common frames omitted
Do you see the same behaviour using the latest released Debezium version?
(Ideally, also verify with latest Alpha/Beta/CR version)
Yes
Do you have the connector logs, ideally from start till finish?
(You might be asked later to provide DEBUG/TRACE level log)
No
How to reproduce the issue using our tutorial deployment?
It is as simple as creating a minimal table with two columns, ID and PHOTO (for example), and inserting a single row with 2GB of data. The last bit is challenging (it takes approx. 4 hours to generate the data on our RDS instance). Here is the setup SQL script:
CREATE TABLE EMPLOYEES_GB (ID INT PRIMARY KEY NOT NULL, PHOTO CLOB);

DECLARE
  vbuf CLOB;
  cbuf CLOB;
BEGIN
  dbms_lob.createtemporary(cbuf, FALSE);
  SELECT PHOTO INTO vbuf FROM EMPLOYEES_GB WHERE ID = 1;
  FOR i IN 1..2 LOOP
    dbms_lob.append(cbuf, vbuf);
  END LOOP;
  INSERT INTO EMPLOYEES_GB (ID, PHOTO) VALUES (2, cbuf);
END;

COMMIT;
Engineering notes
Upon closer investigation, I noticed that the way CLOB / BLOB data is retrieved might be flawed: the code that reads the bytes coerces the long length value to an int:
for CLOBs:
Clob clob = (Clob) data;
// Note that java.sql.Clob specifies that the first character starts at 1
// and that length must be greater-than or equal to 0. So for an empty
// clob field, a call to getSubString(1, 0) is perfectly valid.
return clob.getSubString(1, (int) clob.length());
for BLOBs:
Blob blob = (Blob) data;
data = blob.getBytes(1, Long.valueOf(blob.length()).intValue());
note the coercion resulting in data truncation:
Long.valueOf(blob.length()).intValue()
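The effect of that narrowing conversion can be seen in isolation. A minimal sketch (the class name is mine) showing why a 1GB value survives while a 2GB value does not:

```java
public class LobLengthTruncation {
    public static void main(String[] args) {
        long oneGb = 1L * 1024 * 1024 * 1024; // 2^30, fits in an int
        long twoGb = 2L * 1024 * 1024 * 1024; // 2^31, one past Integer.MAX_VALUE

        // The exact pattern used for BLOBs: box the long, then take intValue().
        // For 2GB this silently wraps to Integer.MIN_VALUE.
        System.out.println(Long.valueOf(oneGb).intValue()); // 1073741824
        System.out.println(Long.valueOf(twoGb).intValue()); // -2147483648

        // The CLOB path uses a plain cast, which performs the same
        // narrowing primitive conversion and truncates identically.
        System.out.println((int) twoGb); // -2147483648
    }
}
```

A negative length handed to getSubString(1, ...) or getBytes(1, ...) cannot describe the value, which is consistent with the failure appearing only on the 2GB row while the 1GB row is fine.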
If a buffered/streamed reading approach is used instead, the data is read normally.
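A minimal sketch of the streamed alternative mentioned above (readClob is a hypothetical helper, not Debezium code): the value is consumed through its character stream in fixed-size chunks, so the length never passes through an int. Note that a single Java String is itself capped at Integer.MAX_VALUE characters, so downstream handling of genuinely 2GB values needs care as well.

```java
import java.io.IOException;
import java.io.Reader;
import java.sql.Clob;
import java.sql.SQLException;

public class ClobStreamRead {
    // Reads the CLOB via its character stream in 8K chunks instead of a
    // single getSubString(1, (int) clob.length()) call, avoiding the
    // long-to-int coercion of the length entirely.
    public static String readClob(Clob clob) throws SQLException, IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[8192];
        try (Reader reader = clob.getCharacterStream()) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // SerialClob from the JDK's java.sql.rowset module stands in for
        // an Oracle CLOB here, just to exercise the helper.
        Clob clob = new javax.sql.rowset.serial.SerialClob("hello".toCharArray());
        System.out.println(readClob(clob));
    }
}
```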
Going to raise a PR on Github.