Debezium / DBZ-9370

Mining upper boundary is miscalculated when using archive log only mode


      When we mine data in near real-time from the online redo logs, we take the minimum flushed SCN across the redo threads as the enforced upper bound, to avoid advancing the read position too far for any one redo thread.

      However, when log.mining.archive.log.only.mode is enabled, we do not enforce such a calculation. Therefore, reading archive log changes for one redo thread could advance the read position beyond that of a less active thread, and changes could be lost.

      When we don't use archive log only mode, this is a non-issue. The Oracle RAC flush connections should periodically cause an intermediate write on each thread, which guarantees that the SCN on each node keeps increasing. The only disparity we normally see comes from batch jobs on an active node, and even then the flushed SCNs across threads generally stay relatively close.

      This does not hold for archive logs. In a system where one node is more active than the other, the more active node can generate significantly more archive logs than the other over the same time window. Therefore, the last flushed archive log SCN can differ massively across threads.

      The simplest, and likely only, solution here is to apply a MIN calculation similar to the one we use when we generate the upper bound from V$THREAD in non-archive log only mode, but run it over the archive log list data set and compute the minimum NEXT_SCN grouped by redo thread.
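
The calculation described above could be sketched roughly as follows. This is only an illustration, not the connector's implementation: the `ArchivedLog` record and the `upperBound` method are hypothetical names, and the sketch assumes one reading of "minimum NEXT_SCN grouped by redo thread", namely taking the highest NEXT_SCN archived so far for each thread and then the minimum of those per-thread values.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

class ArchiveUpperBoundSketch {

    /** Hypothetical holder for one row of the archive log list (not a Debezium type). */
    record ArchivedLog(int thread, long firstScn, long nextScn) {}

    /**
     * Returns the highest SCN that every listed redo thread has archived,
     * or empty when some thread has no archive log at all yet.
     */
    static OptionalLong upperBound(List<ArchivedLog> logs, List<Integer> threads) {
        if (threads.isEmpty()) {
            return OptionalLong.empty();
        }
        // Assumption: per thread, the relevant value is its highest archived NEXT_SCN.
        Map<Integer, Long> highestNextScn = new HashMap<>();
        for (ArchivedLog log : logs) {
            highestNextScn.merge(log.thread(), log.nextScn(), Long::max);
        }
        // The upper bound is the minimum of those per-thread values.
        long bound = Long.MAX_VALUE;
        for (int thread : threads) {
            Long next = highestNextScn.get(thread);
            if (next == null) {
                return OptionalLong.empty(); // this thread has not archived anything yet
            }
            bound = Math.min(bound, next);
        }
        return OptionalLong.of(bound);
    }

    public static void main(String[] args) {
        // Illustrative values only: thread 1 is far ahead of thread 2.
        List<ArchivedLog> logs = List.of(
                new ArchivedLog(1, 100, 900),
                new ArchivedLog(2, 50, 400));
        System.out.println(upperBound(logs, List.of(1, 2)));
    }
}
```

Returning an empty result when a thread has no archive log yet keeps the caller waiting instead of mining past a thread whose position is unknown.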

      The downside to this approach in archive log only mode is the potential disparity between more active and less active nodes.

      For example, let's assume we have two redo threads, T1 and T2, and we start the connector fresh. At first there will be no changes beyond the snapshot, which we generated at read position 2200, while we wait for the next log switch. When T1 rolls to the archive, its log range is 2000 to 5000. We report that the logs are inconsistent because we do not yet have a log for T2, so the connector continues to wait. Eventually T1 rolls again, generating a log with the range 5000 to 8000, and T2 rolls with a log range of 1500 to 4500. At this point we're reading from position 2200, both T1 and T2 have a log that satisfies it, and neither thread has a gap in its sequences.
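
The consistency check in this walkthrough can be sketched as below. It is a simplified illustration and not Debezium's actual logic: the `ArchivedLog` record, the `isConsistent` method and the sequence numbers are hypothetical, and the sketch assumes a log "satisfies" the read position when that position falls inside its SCN range.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ArchiveLogConsistencySketch {

    /** Hypothetical holder for one archive log entry (not a Debezium type). */
    record ArchivedLog(int thread, long sequence, long firstScn, long nextScn) {}

    /**
     * True when every thread has a log containing readScn and its logs
     * have no missing sequence numbers.
     */
    static boolean isConsistent(List<ArchivedLog> logs, List<Integer> threads, long readScn) {
        Map<Integer, List<ArchivedLog>> byThread = logs.stream()
                .collect(Collectors.groupingBy(ArchivedLog::thread));
        for (int thread : threads) {
            List<ArchivedLog> threadLogs =
                    new ArrayList<>(byThread.getOrDefault(thread, List.of()));
            threadLogs.sort(Comparator.comparingLong(ArchivedLog::sequence));
            // Assumption: the range check is firstScn <= readScn < nextScn.
            boolean covers = threadLogs.stream()
                    .anyMatch(l -> l.firstScn() <= readScn && readScn < l.nextScn());
            if (!covers) {
                return false; // e.g. T2 has not rolled to the archive yet
            }
            for (int i = 1; i < threadLogs.size(); i++) {
                if (threadLogs.get(i).sequence() != threadLogs.get(i - 1).sequence() + 1) {
                    return false; // gap in this thread's log sequences
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // SCN ranges from the example; sequence numbers 10, 11 and 7 are made up.
        List<ArchivedLog> afterFirstRoll = List.of(new ArchivedLog(1, 10, 2000, 5000));
        List<ArchivedLog> afterAllRolls = List.of(
                new ArchivedLog(1, 10, 2000, 5000),
                new ArchivedLog(1, 11, 5000, 8000),
                new ArchivedLog(2, 7, 1500, 4500));
        System.out.println(isConsistent(afterFirstRoll, List.of(1, 2), 2200));
        System.out.println(isConsistent(afterAllRolls, List.of(1, 2), 2200));
    }
}
```

With only the first T1 log the check fails because T2 has nothing covering 2200; once all three logs exist it passes.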

      The problem here is that if we add those 3 logs and read them, the generated next iteration SCN would be 8000; however, T2 ended at 4500. There are most likely changes on T2 between SCN 4500 and 8000 that have not been archived yet, so we can't emit the changes from T1 for that range because we need to preserve chronological commit ordering.

      Ergo, we can only mine up to 4500 in this case, and then we have to stop. If T2 remains dormant for an extended period before it rolls while T1 continues to generate more and more logs, the data in T1 is delayed by T2.

      One way to mitigate this on the database side is to enforce a hard roll of the redo logs on an interval, so that a redo thread that lags behind does not penalize the other streamed threads for longer than the roll interval. I don't see any solution from Debezium's side.

              ccranfor@redhat.com Chris Cranford