- Bug
- Resolution: Duplicate
- Major
Issue with the Debezium Oracle CDC connector. Oracle 19c is running as a container DB (CDB) and contains multiple PDBs.
Bug report
While the connector is running, no issue is observed. However, when the pod on which the connector is running is deleted and recreated, a huge lag is reported. No data is sent to the Kafka topic, but the connector again tries to process already-processed SCNs received by querying the LogMiner package.
A similar huge lag is experienced by all the other connectors when a large table is loaded into a database in one PDB. The lag affects connectors running against tables in different PDBs.
What Debezium connector do you use and what version?
1.9.3
What is the connector configuration?
"name": "extract_raw_data_to_message",
"config": {
"database.server.name": "extract_raw_data_to_message",
"heartbeat.topics.prefix": "extract_raw_data_to_message.heartbeat",
"database.history.kafka.topic": "extract_raw_data_to_message.history-topic",
"transaction.topic": "extract_raw_data_to_message.transaction",
"transforms.datastreamtopic.regex" : "extract_raw_data_to_message.DBxxxx.(?!DBZ_HEARTBEAT$)(.+)$",
"transforms.datastreamtopic.replacement" : "cdc_datastream_topic.$1",
"transforms": "datastreamtopic",
"transforms.datastreamtopic.type" : "org.apache.kafka.connect.transforms.RegexRouter",
"log.mining.session.max.ms" : "14400000",
"connector.class": "io.debezium.connector.oracle.OracleConnector",
"database.dbname": "DBxxxx",
"database.user": "c##xxxx",
"database.password": "xxxx",
"database.connection.adapter": "logminer",
"database.url": "jdbc:oracle:thin:@(DESCRIPTION = (ADDRESS = (PROTOCOL = TCP)(HOST = xxxx)(PORT = xxxx)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = xxxx)))",
"database.pdb.name": "xxxx",
"database.tablename.case.insensitive": "true",
"snapshot.mode": "schema_only",
"snapshot.locking.mode": "none",
"snapshot.fetch.size": "20000",
"snapshot.delay.ms": "2000",
"topic.creation.default.replication.factor": "3",
"topic.creation.default.partitions": "3",
"topic.creation.enable": "true",
"event.processing.failure.handling.mode": "warn",
"tasks.max": "1",
"sanitize.field.names": "true",
"heartbeat.interval.ms": "10000",
"include.schema.changes": "true",
"max.batch.size": "20480",
"max.queue.size": "81290",
"decimal.handling.mode": "double",
"time.precision.mode": "connect",
"binary.handling.mode": "base64",
"log.mining.strategy": "online_catalog",
"log.mining.scn.gap.detection.gap.size.min": "100000",
"log.mining.scn.gap.detection.time.interval.max.ms": "3000",
"log.mining.view.fetch.size": "10000",
"log.mining.batch.size.default": "20000",
"log.mining.batch.size.max": "25000",
"log.mining.batch.size.min": "1000",
"log.mining.log.query.max.retries": "200",
"database.history.store.only.captured.tables.ddl": "true",
"database.history.kafka.recovery.poll.interval.ms": "100000",
"database.history.kafka.query.timeout.ms": "30000",
"database.history.kafka.bootstrap.servers": "${env:KAFKA_CONNECT_BOOTSTRAP_SERVERS}",
"database.history.consumer.security.protocol": "${env:APMT_KAFKA_CONNECT_SECURITY_PROTOCOL}",
"database.history.producer.security.protocol": "${env:APMT_KAFKA_CONNECT_SECURITY_PROTOCOL}",
"database.history.consumer.sasl.mechanism": "${env:APMT_KAFKA_SASL_MECHANISM}",
"database.history.producer.sasl.mechanism": "${env:APMT_KAFKA_SASL_MECHANISM}",
"database.history.consumer.sasl.jaas.config": "${env:APMT_KAFKA_JAAS_CONFIG}",
"database.history.producer.sasl.jaas.config": "${env:APMT_KAFKA_JAAS_CONFIG}",
"database.history.consumer.ssl.endpoint.identification.algorithm": "https",
"database.history.producer.ssl.endpoint.identification.algorithm": "https",
"table.include.list": "TABLE1,TABLE2,TABLE3"
}
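The RegexRouter transform in the config above reroutes captured topics to a single datastream topic while the negative lookahead excludes the heartbeat table. A minimal Python sketch of that routing logic (`re.fullmatch` mirrors RegexRouter's whole-topic-name matching; the topic names are illustrative, taken from the config above):

```python
import re

# Same pattern as transforms.datastreamtopic.regex; the (?!DBZ_HEARTBEAT$)
# negative lookahead stops the heartbeat table topic from being rerouted.
pattern = re.compile(r"extract_raw_data_to_message.DBxxxx.(?!DBZ_HEARTBEAT$)(.+)$")

def route(topic: str) -> str:
    """Return the rerouted topic name, or the original topic if the regex does not match."""
    m = pattern.fullmatch(topic)  # RegexRouter matches against the entire topic name
    return f"cdc_datastream_topic.{m.group(1)}" if m else topic

print(route("extract_raw_data_to_message.DBxxxx.TABLE1"))         # cdc_datastream_topic.TABLE1
print(route("extract_raw_data_to_message.DBxxxx.DBZ_HEARTBEAT"))  # unchanged
```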
What is the captured database version and mode of deployment?
Oracle 19c Installed on Azure VM.
It is a container DB (CDB) hosting multiple PDBs.
We are using the default in-memory cache configuration.
What behaviour do you expect?
We expect that when a pod is restarted, the connector resumes processing from the last processed SCN. It should not try to reprocess all previously processed data.
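One way to check what the connector will resume from is to inspect its record in Kafka Connect's offsets topic, whose value for the Oracle connector carries the last processed SCN. A hedged sketch (the JSON record and SCN values below are invented for illustration; field names such as `scn` and `commit_scn` are assumptions about Debezium Oracle's offset format):

```python
import json

# Hypothetical value of the connector's record in the Connect offsets topic
# (e.g. as read with a console consumer); the SCN values here are made up.
record_value = '{"scn": "2164210530112", "commit_scn": "2164210530115"}'

offset = json.loads(record_value)
# After a pod restart, streaming should resume from this SCN rather than
# re-mining and re-skipping transactions that were already processed.
print("resume from SCN:", offset["scn"])
```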
What behaviour do you see?
Log:
2022-08-01 15:41:52,739 INFO The connector is now using the maximum batch size 25000 when querying the LogMiner view. This could be indicative of large SCN gaps (io.debezium.connector.oracle.OracleStreamingChangeEventSourceMetrics) [debezium-oracleconnector-extract_raw_data_to_message-change-event-source-coordinator]
--------------------------------------
2022-08-01 14:04:56,913 WARN Event for transaction 170021003d180000 has already been processed, skipped. (io.debezium.connector.oracle.logminer.processor.memory.MemoryLogMinerEventProcessor) [debezium-oracleconnector-extract_raw_data_to_message-change-event-source-coordinator]
2022-08-01 14:04:56,913 WARN Event for transaction 170021003d180000 has already been processed, skipped. (io.debezium.connector.oracle.logminer.processor.memory.MemoryLogMinerEventProcessor) [debezium-oracleconnector-extract_raw_data_to_message-change-event-source-coordinator]
Do you see the same behaviour using the latest released Debezium version?
(Ideally, also verify with latest Alpha/Beta/CR version)
We are using version 1.9.3.
Do you have the connector logs, ideally from start till finish?
The log is attached above.
The connector is running at INFO log level.
How to reproduce the issue using our tutorial deployment?
<Your answer>
Feature request or enhancement
Issue Fix
Which use case/requirement will be addressed by the proposed feature?
Performance issue
Implementation ideas (optional)
<Your answer>
- duplicates: DBZ-5140 No raising of "WARN Event for transaction X has already been processed, skipped." (Closed)