Type: Bug
Resolution: Done
Priority: Critical
Affects Version/s: 1.9.6.Final, 2.3.4.Final, 2.4.0.Final
Fix Version/s: None
Description:
For Oracle CLOB columns that contain more than 2,000 characters, single quotes are being duplicated, which causes characters at the end of the first kilobyte of data to be lost. This issue is similar to DBZ-4891.
Our DBA investigated the issue by examining the LogMiner logs. For CLOB columns with a large amount of data, the column value is split into 1 KB chunks, and in those chunks the single quote is duplicated. If there is a single quote within a chunk, the last character of that 1 KB chunk is lost when the data is put back together; one character is lost for every single quote that is duplicated.
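To illustrate the arithmetic, here is a minimal, self-contained Java sketch of the failure mode. It is a toy model of the observed behavior (fixed 1,024-character chunks, quote escaping by doubling), not Debezium's actual reassembly code:

public class ChunkQuoteDemo {
    static final int CHUNK = 1024;

    public static void main(String[] args) {
        // Build a 1,024-character chunk containing one single quote.
        StringBuilder sb = new StringBuilder("it'");
        while (sb.length() < CHUNK) sb.append('x');
        String original = sb.toString();

        // LogMiner-style escaping doubles the quote, growing the chunk to 1,025 characters.
        String escaped = original.replace("'", "''");

        // A reader that still takes only the first 1,024 characters and then
        // un-escapes drops the final character of the chunk.
        String reassembled = escaped.substring(0, CHUNK).replace("''", "'");

        System.out.println(original.length());    // 1024
        System.out.println(reassembled.length()); // 1023: one character lost per duplicated quote
    }
}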
To reproduce the issue, update a CLOB column to hold more than 2,000 characters of data, with a single quote within the first kilobyte. This can be done via an UPDATE or an INSERT.
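A hedged JDBC sketch of the reproduction follows; the table CLOB_TEST, its CLOB column DATA, the row with ID = 1, and the connection settings are all placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class ClobRepro {
    public static void main(String[] args) throws Exception {
        // More than 2,000 characters, with a single quote inside the first kilobyte.
        StringBuilder sb = new StringBuilder("a quote early on: don't lose me ");
        while (sb.length() < 2100) sb.append('a');

        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/orcl", "user", "password")) {
            conn.setAutoCommit(false);
            try (PreparedStatement ps = conn.prepareStatement(
                    "UPDATE CLOB_TEST SET DATA = ? WHERE ID = 1")) {
                ps.setString(1, sb.toString()); // bound as a CLOB value over 2,000 characters
                ps.executeUpdate();
            }
            conn.commit();
        }
        // The emitted change event should show a doubled quote and a character
        // missing at the 1 KB boundary.
    }
}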
Snapshots work correctly. If you run an incremental snapshot for the record with the duplicated single quote and the lost character, the message the snapshot creates contains the correct data.
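For reference, incremental snapshots are triggered by inserting an execute-snapshot signal row into the connector's signaling table. A minimal JDBC sketch follows, where the signal table name DEBEZIUM_SIGNAL (the table configured via signal.data.collection) and the collection MYSCHEMA.CLOB_TEST are placeholders:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class TriggerIncrementalSnapshot {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//localhost:1521/orcl", "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO DEBEZIUM_SIGNAL (id, type, data) VALUES (?, ?, ?)")) {
            ps.setString(1, "resend-clob-row-1");  // arbitrary unique signal id
            ps.setString(2, "execute-snapshot");   // Debezium ad hoc snapshot signal type
            // Restrict the snapshot to the affected table.
            ps.setString(3, "{\"data-collections\": [\"MYSCHEMA.CLOB_TEST\"]}");
            ps.executeUpdate();
        }
    }
}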
I tested with a column that had 1,900 characters and everything worked correctly. Everything over 2,000 characters encountered this issue.
What Debezium connector do you use and what version?
Encountered on 1.9.6. Upgraded to 2.3.4 and still an issue.
What is the connector configuration?
"connector.class": "io.debezium.connector.oracle.OracleConnector",
"message.key.columns": "",
"tasks.max": "1",
"schema.history.internal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";",
"schema.history.internal.kafka.topic": "",
"snapshot.delay.ms": "1",
"schema.history.internal.consumer.security.protocol": "SASL_SSL",
"schema.history.internal.kafka.recovery.attempts": "10",
"log.mining.strategy": "online_catalog",
"database.server.timezone": "GMT",
"tombstones.on.delete": "false",
"schema.history.internal.consumer.ssl.endpoint.identification.algorithm": "https",
"decimal.handling.mode": "precise",
"database.schema": "",
"log.mining.continuous.mine": "false",
"schema.history.internal.kafka.recovery.poll.interval.ms": "30000",
"poll.interval.ms": "10000",
"schema.history.internal.skip.unparseable.ddl": "true",
"lob.enabled": "true",
"database.history.store.only.captured.tables.ddl": "true",
"schema.history.internal.producer.sasl.mechanism": "PLAIN",
"schema.history.internal.producer.sasl.jaas.config": "org.apache.kafka.common.security.plain.PlainLoginModule required username=\"\" password=\"\";",
"database.user": "",
"database.dbname": "orcl",
"schema.history.internal.producer.ssl.endpoint.identification.algorithm": "https",
"log.mining.batch.size.max": "200000",
"schema.history.internal.producer.security.protocol": "SASL_SSL",
"database.connection.adapter": "logminer",
"schema.history.internal.kafka.bootstrap.servers": "",
"time.precision.mode": "connect",
"topic.prefix": "",
"query.fetch.size": "20000",
"internal.log.mining.dml.parser": "legacy",
"database.port": "1521",
"column.exclude.list": "",
"database.hostname": "",
"database.password": "",
"log.mining.batch.size.min": "10000",
"log.mining.batch.size.default": "50000",
"name": "",
"table.include.list": "",
"schema.history.internal.consumer.sasl.mechanism": "PLAIN"
What is the captured database version and mode of deployment?
AWS - Oracle Database 19c Standard Edition 2 Release 19.0.0.0.0 - Production
- links to: RHBA-2024:126962 Red Hat build of Debezium 2.3.7 release