Debezium / DBZ-1247

Ability to specify batch size during snapshot

Details

    Description

      The io.debezium.connector.mongodb.Replicator class does not set a batch size on the MongoDB cursor it opens during a snapshot, so the connector cannot limit how much memory each fetched batch allocates:

      try (MongoCursor<Document> cursor = docCollection.find().iterator()) {
          while (running.get() && cursor.hasNext()) {
              Document doc = cursor.next();
              logger.trace("Found existing doc in {}: {}", collectionId, doc);
              counter += factory.recordObject(collectionId, doc, timestamp);
          }
      }
      

      If no batch size is specified, the MongoDB server chooses one itself.
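To illustrate why an explicit batch size bounds client-side memory, here is a minimal sketch that simulates a batched cursor in plain Java. It does not use the MongoDB driver; the class `BatchedCursorSketch` and its counters are hypothetical names introduced only for this illustration, and a real cursor's buffering may differ in detail.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Hypothetical stand-in for a server-side cursor: documents are handed out
 * in batches, so the client buffers at most batchSize items at any time.
 */
class BatchedCursorSketch<T> implements Iterator<T> {
    private final List<T> source;                        // plays the role of the collection
    private final int batchSize;                         // maximum documents per simulated fetch
    private final Deque<T> buffer = new ArrayDeque<>();  // client-side buffer
    private int position = 0;                            // next index to fetch from the source
    private int roundTrips = 0;                          // number of simulated fetches so far
    private int maxBuffered = 0;                         // largest buffer size observed

    BatchedCursorSketch(List<T> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive in this sketch");
        }
        this.source = source;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        return !buffer.isEmpty() || position < source.size();
    }

    @Override
    public T next() {
        if (buffer.isEmpty()) {
            fetchNextBatch();
        }
        return buffer.removeFirst();
    }

    // Copies the next slice of the source into the buffer, like one fetch from the server.
    private void fetchNextBatch() {
        if (position >= source.size()) {
            throw new NoSuchElementException();
        }
        int end = Math.min(position + batchSize, source.size());
        buffer.addAll(source.subList(position, end));
        position = end;
        roundTrips++;
        maxBuffered = Math.max(maxBuffered, buffer.size());
    }

    int roundTrips() { return roundTrips; }

    int maxBuffered() { return maxBuffered; }
}
```

With a smaller batch size the buffer stays smaller at the cost of more round trips, which is the trade-off a configurable fetch size would expose.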

      I propose adding the following option:

      Property: documents.fetch.size
      Default: 0
      Description: Positive integer value that specifies the maximum number of documents to read in one batch from each collection while taking a snapshot. The connector reads the collection contents in multiple batches of this size. Defaults to 0, which means the server chooses an appropriate fetch size.
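One way the proposed option could be read and validated is sketched below. This is not the connector's actual implementation: the class `FetchSizeOption` and its `parse` method are hypothetical, and the configuration is modelled as a plain `Map<String, String>` rather than Debezium's own configuration types. The closing comment shows how the value might then be applied through the MongoDB Java driver's `FindIterable.batchSize(int)`.

```java
import java.util.Map;

/** Hypothetical helper that reads the proposed documents.fetch.size option. */
class FetchSizeOption {
    static final String PROPERTY = "documents.fetch.size";
    static final int DEFAULT = 0; // 0 = let the server choose the batch size

    /** Returns the configured fetch size, or DEFAULT when the property is absent or blank. */
    static int parse(Map<String, String> config) {
        String raw = config.get(PROPERTY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT;
        }
        int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(PROPERTY + " must be an integer: " + raw, e);
        }
        if (value < 0) {
            throw new IllegalArgumentException(PROPERTY + " must not be negative: " + value);
        }
        return value;
    }
}

// Assumed usage inside the snapshot loop (requires the MongoDB Java driver):
//
//   int fetchSize = FetchSizeOption.parse(config);
//   FindIterable<Document> query = docCollection.find();
//   if (fetchSize > 0) {
//       query = query.batchSize(fetchSize); // only override the server's choice when set
//   }
//   try (MongoCursor<Document> cursor = query.iterator()) { ... }
```

Skipping the `batchSize` call when the value is 0 keeps the current behaviour as the default, so existing deployments would be unaffected unless they set the property.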

          People

            andrey.pustovetov@gmail.com Andrey Pustovetov (Inactive)