
Type: Bug
Resolution: Done
Priority: Major
Fix Version/s: 2.6.0.CR1
Affects Version/s: 2.4.0.Final, 2.5.0.Final, 2.6.0.Alpha1
Component/s: oracle-connector
Labels:
None

Blocked:
False
Blocked Reason:
None
Ready:
False
Git Pull Request:
https://github.com/debezium/debezium/pull/5398

Severity:
Critical


What Debezium connector do you use and what version?

Debezium Server's Oracle Connector (nightly - after PR #5162 had been merged)

What is the connector configuration?

debezium.sink.type=pubsub
debezium.sink.pubsub.project.id=bee-data-ingestion
debezium.sink.pubsub.ordering.enabled=false

debezium.source.connector.class=io.debezium.connector.oracle.OracleConnector
debezium.source.snapshot.mode=schema_only
debezium.source.topic.prefix=oracle-planck-ingestion
debezium.source.tombstones.on.delete=false

debezium.source.log.mining.strategy=redo_log_catalog
debezium.source.log.mining.batch.size.min=20000
debezium.source.log.mining.batch.size.max=500000
debezium.source.log.mining.sleep.time.default.ms=600
debezium.source.log.mining.transaction.retention.ms=79200000
debezium.source.query.fetch.size=20000

debezium.source.offset.storage=io.debezium.storage.redis.offset.RedisOffsetBackingStore
debezium.source.offset.storage.redis.address=wl-data-integration-npf-redis.metaplane.cloud:6379
debezium.source.offset.storage.redis.password=${REDIS_PASSWORD}
debezium.source.offset.storage.redis.key=database-ingestion:oracle-planck-cdc:debezium-server:offset
debezium.source.offset.flush.interval.ms=30000

debezium.source.schema.history.internal.store.only.captured.tables.ddl=true
debezium.source.schema.history.internal=io.debezium.storage.redis.history.RedisSchemaHistory
debezium.source.schema.history.internal.redis.address=wl-data-integration-npf-redis.metaplane.cloud:6379
debezium.source.schema.history.internal.redis.password=${REDIS_PASSWORD}
debezium.source.schema.history.internal.redis.key=database-ingestion:oracle-planck-cdc:debezium-server:schema_history

debezium.source.decimal.handling.mode=string
debezium.source.key.converter.schemas.enable=false
debezium.source.value.converter.schemas.enable=false

debezium.transforms.Reroute.type=io.debezium.transforms.ByLogicalTableRouter
debezium.transforms.Reroute.topic.regex=.*
debezium.transforms.Reroute.topic.replacement=oracle-planck-ingestion
debezium.transforms=Reroute

quarkus.http.port=8080
quarkus.log.level=DEBUG

debezium.source.database.hostname=dbpkpr-scan.back.b2w
debezium.source.database.port=1521
debezium.source.database.dbname=SRV_OGG_PLK
debezium.source.table.include.list=# (list of 132 tables)
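
As a reading aid for the millisecond-valued settings above, here is a minimal Python sketch that converts them to readable units. The values are copied from the configuration; the `humanize_ms` helper is hypothetical and is not part of Debezium.

```python
# Millisecond settings copied from the configuration above
# (the "debezium.source." prefix is dropped for brevity).
MS_SETTINGS = {
    "log.mining.sleep.time.default.ms": 600,
    "log.mining.transaction.retention.ms": 79_200_000,
    "offset.flush.interval.ms": 30_000,
}


def humanize_ms(ms: int) -> str:
    """Render a millisecond value in the largest unit that divides it evenly."""
    for unit, size in (("h", 3_600_000), ("min", 60_000), ("s", 1_000)):
        if ms >= size and ms % size == 0:
            return f"{ms // size} {unit}"
    return f"{ms} ms"


readable = {name: humanize_ms(value) for name, value in MS_SETTINGS.items()}
for name, value in readable.items():
    print(f"{name} = {value}")
```

With these values, the transaction retention works out to 22 hours and the offset flush interval to 30 seconds.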

What is the captured database version and mode of deployment?

(E.g. on-premises, with a specific cloud provider, etc.)

Oracle RAC 19c

What behaviour do you expect?

When the offset contains an SCN value that is available in the archived logs (that is, in a file not yet deleted from the current primary instance), the connector should be able to locate it and start a mining session from that point.

What behaviour do you see?

In DBZ-7345 I mentioned a few scenarios that seemed to cause the Oracle connector to fail to locate the offset SCN. The logs for one of those cases (a database version upgrade) pointed to a problem with the removal of duplicate multi-thread sequences, which was promptly fixed by ccranfor@redhat.com's PR #5162.

However, while running a nightly image built after that fix was merged, we hit another instance of the same symptom: the "None of log files contains offset SCN" error. After an Exadata Cloud Infrastructure maintenance update affected a few of our databases, the Oracle connectors attached to them failed with this error, even though those databases still had the log files containing the respective SCNs and no duplicate sequences were involved.

Again, it was only possible to avoid the error by manually replacing the offset SCNs. For one of the databases it was even possible to restart reading from a previous SCN with no new interruptions. The logs from this specific case can be found in the appended files (before and after offset repositioning).
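
To make explicit what "duplicate sequences" means on a RAC database, where each redo thread has its own sequence numbering, here is a minimal Python sketch. It is an illustration under stated assumptions, not the connector's implementation: the `ArchiveLog` type and `duplicate_sequences` function are hypothetical, and every sample entry except thread 4 / sequence 65948 is invented for the example.

```python
from collections import Counter
from typing import NamedTuple


class ArchiveLog(NamedTuple):
    thread: int    # redo thread (one per RAC instance)
    sequence: int  # log sequence number within that thread
    name: str


def duplicate_sequences(logs):
    """Return the (thread, sequence) pairs that occur more than once."""
    counts = Counter((log.thread, log.sequence) for log in logs)
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical entries for illustration; only thread 4 / sequence 65948
# is taken from this report.
logs = [
    ArchiveLog(4, 65948, "thread_4_seq_65948"),
    ArchiveLog(3, 65948, "thread_3_seq_65948"),  # same sequence, other thread
    ArchiveLog(4, 65949, "thread_4_seq_65949"),
]

no_duplicates = duplicate_sequences(logs)
with_duplicate = duplicate_sequences(
    logs + [ArchiveLog(4, 65949, "thread_4_seq_65949_copy")]
)
print(no_duplicates)
print(with_duplicate)
```

Keying on the (thread, sequence) pair rather than on the sequence alone keeps equal sequence numbers from different threads from being reported as duplicates.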

Do you see the same behaviour using the latest released Debezium version?

(Ideally, also verify with latest Alpha/Beta/CR version)

Could not verify: the issue occurred between the merge of #5162 and the release of v2.6.0.Alpha1, and it is not easily reproducible. I believe none of the more recent PRs fix it, as they do not appear to be related.

Do you have the connector logs, ideally from start till finish?

(You might be asked later to provide DEBUG/TRACE level log)

Yes, please find them attached as
"planck_logs_nightly_scn_10423209952132_fail.txt", from when the error happened and
"planck_logs_nightly_scn_10423209745803_success.txt", after offset SCN was manually moved back.

In the first one, despite the error, the file that contains the offset SCN (10423209952132) was selected to be added, as the following line shows:

2024-01-21 12:54:10,539 DEBUG [io.deb.con.ora.log.LogMinerHelper] (debezium-oracleconnector-oracle-planck-ingestion-change-event-source-coordinator) Archive log +RECOC1/DB8PLKPR_65N_VCP/ARCHIVELOG/2024_01_21/thread_4_seq_65948.49404.1158802907 with SCN range 10423209919749 to 10423209995534 sequence 65948 to be added.

Unlike the previous issue with incorrect duplicate removals, sequence 65948 was not duplicated and no log files appear to have been removed.

However, after restarting from a previous SCN (10423209745803), the exact same file ended up being added and the error did not occur.
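
The range claim above can be checked mechanically. The following Python sketch parses the DEBUG line quoted earlier and tests whether each of the two offset SCNs falls inside that file's SCN range. The regular expression and helper names are hypothetical, and treating the upper bound as exclusive is an assumption about how the logged range should be read.

```python
import re

# The DEBUG message quoted above (timestamp and logger prefix omitted).
LOG_LINE = (
    "Archive log +RECOC1/DB8PLKPR_65N_VCP/ARCHIVELOG/2024_01_21/"
    "thread_4_seq_65948.49404.1158802907 with SCN range 10423209919749 "
    "to 10423209995534 sequence 65948 to be added."
)

PATTERN = re.compile(
    r"Archive log (?P<name>\S+) with SCN range (?P<first>\d+) to (?P<next>\d+) "
    r"sequence (?P<seq>\d+) to be added"
)


def parse_range(line):
    """Extract the (first SCN, next SCN) pair from an 'archive log added' line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not an 'Archive log ... to be added' line")
    return int(match["first"]), int(match["next"])


def contains_scn(first, next_scn, scn):
    # Assumption: the upper bound is the first SCN of the following log,
    # so it is treated as exclusive.
    return first <= scn < next_scn


first, next_scn = parse_range(LOG_LINE)
failed_offset_in_file = contains_scn(first, next_scn, 10423209952132)
earlier_offset_in_file = contains_scn(first, next_scn, 10423209745803)
print(failed_offset_in_file, earlier_offset_in_file)
```

The failing offset SCN lies inside this file's range, while the earlier SCN used for the successful restart lies before it, so that run must also have needed at least one older log.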

Attachments:
mastersaff_archivelogs.xlsx (37 kB, 2024/02/01 2:11 PM)
mastersaff_logs_fail.txt (1.89 MB, 2024/02/01 2:11 PM)
planck_logs_nightly_scn_10423209745803_success.txt (10.83 MB, 2024/01/24 11:18 PM)
planck_logs_nightly_scn_10423209952132_fail.txt (4.07 MB, 2024/01/24 11:18 PM)
umbrella_threads_view.csv (2 kB, 2024/02/06 3:11 PM)

is duplicated by
DBZ-7349 Oracle Connector Data Loss Issue (Closed)

is related to
DBZ-7158 Log sequence check should treat each redo thread independently (Closed)
DBZ-7345 Oracle connector is ocasionally unable to find SCN (Closed)
DBZ-7546 None of log files contains offset SCN Error (Closed)

links to
RHEA-2024:139598 Red Hat build of Debezium 2.5.4 release
