[DBZ-5812] Conflicting documentation for snapshot.mode property in MongoDB connector v2.0

      https://debezium.io/documentation/reference/stable/connectors/mongodb.html#mongodb-property-snapshot-mode
      says:

      The default is initial, and specifies that the connector reads a snapshot when either no offset is found or if the change stream no longer contains the previous offset.

      However, the Note section in https://debezium.io/documentation/reference/stable/connectors/mongodb.html#debezium-mongodb-connector-is-stopped-for-a-long-interval says:

      If the connector remains stopped for long enough, MongoDB might purge older oplog files and the connector's last position may be lost. In this case, when the connector configured with initial snapshot mode (the default) is finally restarted, the MongoDB server will no longer have the starting point and the connector will fail with an error.

      So it is not clear: when the previous offset is no longer available, does the connector start a snapshot or fail with an error?

            Errata Tool added a comment -

            Since the problem described in this issue should be resolved in a recent advisory, it has been closed.

            For information on the advisory (Red Hat build of Debezium 2.1.4 release), and where to find the updated files, follow the link below.

            If the solution does not work for you, open a new bug report.
            https://access.redhat.com/errata/RHEA-2023:1814



            Debezium Builder added a comment - Released

            Robert Roldan added a comment -

            jpechane The description for snapshot.mode in the MongoDB doc states that if the value of the property is set to never, instead of running a snapshot, the connector tails the oplog.
            I assume that the connector then emits a read event for each transaction that the tail operation returns?

            Is there a set number of transactions that the tail operation captures? Or is that number configurable?
            Does the connector record the offset of the last transaction that it receives via a tail as it would from a snapshot?

            Jiri Pechanec added a comment -

            Yes, I can imagine this being covered via event.processing.failure.handling.mode. Still, the issue is that without the snapshot, data would be lost.

            Alon Prantsipal (Inactive) added a comment -

            jpechane I'd also recommend considering an option to reset the offsets of a specific connector via a property.

            For example, see how this is done in the MongoDB Kafka Source Connector: https://www.mongodb.com/docs/kafka-connector/current/troubleshooting/recover-from-invalid-resume-token/#reset-stored-offsets

            Jiri Pechanec added a comment -

            broldan@redhat.com They should either create a connector with a new name, so that the old offsets do not apply, or remove the existing offsets. Generally, changing the offsets is a very brittle operation, and it is not something I'd promote in the standard documentation. The same applies to all connectors: if the transaction log, replication slot, etc. is no longer available, the connector will fail. Some connectors, such as MySQL, have a special snapshot mode, when_needed, that automatically re-executes the snapshot if the point of restart is no longer available. We intend to extend support for it to all connectors, but that is not done yet.
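
The first recovery option described above (registering the connector under a new name so that the old offsets do not apply) can be sketched roughly as follows. This is a minimal illustration, not an official procedure: the helper `renamed_connector_payload`, the connector names, and the `topic.prefix` value are hypothetical placeholders, and the payload shape assumes the usual Kafka Connect registration format of a `name` plus a `config` map.

```python
import json
from copy import deepcopy


def renamed_connector_payload(payload: dict, new_name: str) -> dict:
    """Copy a Kafka Connect registration payload under a new connector name.

    Hypothetical helper: offsets stored under the old name are not picked up
    by a connector registered under a different name.
    """
    renamed = deepcopy(payload)  # leave the caller's payload untouched
    renamed["name"] = new_name
    # If the config repeats the connector name, keep it in sync.
    if "name" in renamed["config"]:
        renamed["config"]["name"] = new_name
    # "initial" is the documented default; set it explicitly for clarity.
    renamed["config"].setdefault("snapshot.mode", "initial")
    return renamed


# Placeholder registration; every value here is an assumption for illustration.
old_payload = {
    "name": "inventory-connector",
    "config": {
        "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
        "topic.prefix": "inventory",
    },
}

new_payload = renamed_connector_payload(old_payload, "inventory-connector-v2")
print(json.dumps(new_payload, indent=2))
```

The resulting JSON would then be sent as the body of a POST request to the `/connectors` endpoint of the Kafka Connect REST API; the host and port depend on the deployment and are not specified in this issue.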

            Robert Roldan added a comment -

            jpechane So, following a restart, a connector with snapshot.mode set to initial fails because, while the connector was offline, the offset value was purged from the oplog.

            What steps can a user take to recover?

            Can they trigger a snapshot by restarting the connector again? Or do they have to resort to something like changing the offsets in the source?

            Jiri Pechanec added a comment -

            alon_rh Hi, yes that is expected behaviour.

            Alon Prantsipal (Inactive) added a comment -

            jpechane So if I understand your comment correctly, no automatic snapshot happens when the MongoDB connector loses its offsets with snapshot.mode = initial.

            Please confirm.

            Jiri Pechanec added a comment -

            broldan@redhat.com Could you please make the docs consistent? The former part describes quite an old behaviour that was replaced by the latter one. Thanks
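
As a reading aid, the startup behaviour that this thread settles on for snapshot.mode = initial can be written as a small decision function. This is only a model of what the comments state, not connector code; the names `StartupAction` and `startup_action` and the two boolean inputs are hypothetical.

```python
from enum import Enum


class StartupAction(Enum):
    SNAPSHOT = "snapshot"
    STREAM = "stream"
    FAIL = "fail"


def startup_action(has_stored_offset: bool, position_still_available: bool) -> StartupAction:
    """Model snapshot.mode=initial at startup, as described in this issue."""
    if not has_stored_offset:
        # No offset found (for example, a connector with a new name): take a snapshot.
        return StartupAction.SNAPSHOT
    if position_still_available:
        # The stored position can still be resumed: continue streaming.
        return StartupAction.STREAM
    # Offset exists but the server purged that position: no automatic snapshot.
    return StartupAction.FAIL


print(startup_action(has_stored_offset=True, position_still_available=False))
```

The last branch is the case the two documentation passages disagreed on; according to the comments, the connector fails rather than re-running the snapshot.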

              broldan@redhat.com Robert Roldan
              alon_rh Alon Prantsipal (Inactive)