  1. Debezium
  2. DBZ-7278

BSONObjectTooLarge even with the cursor.pipeline config

Details

    • False
    • None
    • False
    • Important

    Description

      The issue is marked as a bug, but it could be a case of misconfiguration on our part.

      Bug report

      We run a MongoDB connector whose captured collection contains some documents that cause the BSONObjectTooLarge error during normal operation of Debezium. Following the documentation for the most recent version, we added the required fields to the connector config to skip these messages, but the error persists. We may have misconfigured some fields, since the configuration appears to be ignored.

      What Debezium connector do you use and what version?

      Kubernetes Deployment based on the debezium/connect:2.4.1.Final image

      What is the connector configuration?

       
       

      {
        "name": "cdc_cdc_rds_mongodb_endpoint_n_0",
        "config": {
          "connector.class": "io.debezium.connector.mongodb.MongoDbConnector",
          "collection.include.list": "^rds\\.item_data$",
          "max.queue.size": "8192",
          "signal.consumer.sasl.mechanism": "SCRAM-SHA-512",
          "mongodb.connection.string": "<redacted-valid-connection-string>",
          "mongodb.password": "<redacted>",
          "tasks.max": "1",
          "transforms": "route",
          "next.restart": "-1",
          "topic.heartbeat.prefix": "cdc-heartbeat",
          "cursor.oversize.skip.threshold": "16777216",
          "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
          "topic.prefix": "cdc_cdc_rds_mongodb_endpoint_n_0",
          "transforms.route.regex": "([^\\.]*)\\.([^\\.]*)\\.(.*)",
          "mongodb.authsource": "rds",
          "transforms.route.replacement": "cdc-data-$2-$3",
          "signal.consumer.group.id": "cdc-cdc-rds-mongodb-endpoint-n-0",
          "cursor.pipeline": "[{ '$match': { '$and': [{'$expr': { '$lte': [{'$bsonSize': '$fullDocument'}, 16777216]}}, {'$expr': { '$lte': [{'$bsonSize': '$fullDocumentBeforeChange'}, 16777216]}}]} }]",
          "signal.kafka.bootstrap.servers": "<redacted>",
          "cursor.pipeline.order": "user_first",
          "signal.kafka.topic": "cdc-signal.cdc-cdc-rds-mongodb-endpoint-n-0",
          "mongodb.user": "navarch",
          "heartbeat.interval.ms": "2000",
          "mongodb.name": "cdc_cdc_rds_mongodb_endpoint_n_0",
          "version": "eae7bab8d0e7efa05bad682b78be50cdd7b2b8731b580382ffeff0ba4583d28f2b178ae0bd9393672c937cb730c0d747d24dfe5fc9dbfd496f99afb95784040b",
          "signal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"cdc\" password=\"<redacted>\";",
          "cursor.oversize.handling.mode": "skip",
          "name": "cdc_cdc_rds_mongodb_endpoint_n_0",
          "max.batch.size": "2048",
          "signal.consumer.security.protocol": "SASL_SSL",
          "snapshot.mode": "never",
          "connect.timeout.ms": "120000",
          "signal.kafka.poll.timeout.ms": "5000"
        },
        "tasks": [
          { "connector": "cdc_cdc_rds_mongodb_endpoint_n_0", "task": 0 }
        ],
        "type": "source"
      }
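To make the `cursor.pipeline` value in the configuration easier to read, here is a minimal Python sketch that unpacks it and compares the size limits in the pipeline with `cursor.oversize.skip.threshold`. It is for inspection only. The quote swap is an assumption made so that Python's `json` module can read the string; it is not a statement about how Debezium parses the value. The helper names `parse_pipeline` and `size_limits` are hypothetical.

```python
import json

# The cursor.pipeline value copied from the connector config.
CURSOR_PIPELINE = (
    "[{ '$match': { '$and': ["
    "{'$expr': { '$lte': [{'$bsonSize': '$fullDocument'}, 16777216]}}, "
    "{'$expr': { '$lte': [{'$bsonSize': '$fullDocumentBeforeChange'}, 16777216]}}"
    "]} }]"
)

# cursor.oversize.skip.threshold from the same config.
SKIP_THRESHOLD = 16777216


def parse_pipeline(text):
    """Parse the single-quoted pipeline string as JSON.

    Assumption: no key or value contains a quote character,
    so swapping ' for " yields strict JSON.
    """
    return json.loads(text.replace("'", '"'))


def size_limits(pipeline):
    """Collect (field, limit) pairs from each $expr/$lte/$bsonSize clause."""
    limits = []
    for stage in pipeline:
        for clause in stage.get("$match", {}).get("$and", []):
            size_expr, limit = clause["$expr"]["$lte"]
            limits.append((size_expr["$bsonSize"], limit))
    return limits


pipeline = parse_pipeline(CURSOR_PIPELINE)
limits = size_limits(pipeline)
for field, limit in limits:
    # Print each filtered field, its limit, and whether it matches the threshold.
    print(field, limit, limit == SKIP_THRESHOLD)
```

Under these assumptions the pipeline is a single `$match` stage that filters on the BSON size of `fullDocument` and `fullDocumentBeforeChange`, both with the same 16777216-byte limit as the skip threshold.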

       

      What is the captured database version and mode of deployment?

      On-premise MongoDB 6.0

      What behaviour do you expect?

      The oversized messages should simply be skipped.

      What behaviour do you see?

      The connector throws an exception with the following messages:
       

      Caused by: com.mongodb.MongoCommandException: Command failed with error 10334 (BSONObjectTooLarge): 'PlanExecutor error during aggregation :: caused by :: BSONObj size: 21648261 (0x14A5385) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: "8265807427000000992B022C0100296E5A1004C640ECE10E3449F68DEB7B0962474E6246645F69640064655CB1169B71D38AFF5D6C650004" }' on server mongo-rds-node-1.c.prod-persistence-real.internal:27017. The full response is {"ok": 0.0, "errmsg": "PlanExecutor error during aggregation :: caused by :: BSONObj size: 21648261 (0x14A5385) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: \"8265807427000000992B022C0100296E5A1004C640ECE10E3449F68DEB7B0962474E6246645F69640064655CB1169B71D38AFF5D6C650004\" }", "code": 10334, "codeName": "BSONObjectTooLarge", "$clusterTime": {"clusterTime": {"$timestamp": {"t": 1702924868, "i": 276}}, "signature": {"hash": {"$binary": {"base64": "KmlhjKxAaFXl8eeYHohBcTx+qMw=", "subType": "00"}}, "keyId": 7257082665252159808}}, "operationTime": {"$timestamp": {"t": 1702924868, "i": 275}}} 
      com.mongodb.MongoCommandException: Command failed with error 10334 (BSONObjectTooLarge): 'PlanExecutor error during aggregation :: caused by :: BSONObj size: 21648261 (0x14A5385) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: "8265807427000000992B022C0100296E5A1004C640ECE10E3449F68DEB7B0962474E6246645F69640064655CB1169B71D38AFF5D6C650004" }' on server <redacted>. The full response is {"ok": 0.0, "errmsg": "PlanExecutor error during aggregation :: caused by :: BSONObj size: 21648261 (0x14A5385) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: \"8265807427000000992B022C0100296E5A1004C640ECE10E3449F68DEB7B0962474E6246645F69640064655CB1169B71D38AFF5D6C650004\" }", "code": 10334, "codeName": "BSONObjectTooLarge", "$clusterTime": {"clusterTime": {"$timestamp": {"t": 1702924868, "i": 276}}, "signature": {"hash": {"$binary": {"base64": "KmlhjKxAaFXl8eeYHohBcTx+qMw=", "subType": "00"}}, "keyId": 7257082665252159808}}, "operationTime": {"$timestamp": {"t": 1702924868, "i": 275}}}
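The numbers in the error message can be cross-checked against the configured threshold. The following is a small Python sketch of that arithmetic. All values are copied from the error message and the connector config; nothing here queries MongoDB, and the variable names are chosen for illustration.

```python
# Values copied from the error message and the connector config.
REPORTED_SIZE = 21648261   # "BSONObj size" in the error message
REPORTED_HEX = 0x14A5385   # hex form printed next to it
SERVER_LIMIT = 16793600    # upper bound quoted in the error message
SKIP_THRESHOLD = 16777216  # cursor.oversize.skip.threshold in the config

KIB = 1024
MIB = 1024 * KIB

# The decimal and hex sizes in the message describe the same number.
sizes_agree = REPORTED_SIZE == REPORTED_HEX

# The configured threshold is exactly 16 MiB.
threshold_is_16_mib = SKIP_THRESHOLD == 16 * MIB

# The limit quoted by the server is 16 MiB plus some headroom.
headroom_bytes = SERVER_LIMIT - 16 * MIB

# How far the rejected object exceeds each limit.
over_threshold = REPORTED_SIZE - SKIP_THRESHOLD
over_server_limit = REPORTED_SIZE - SERVER_LIMIT

print("sizes agree:", sizes_agree)
print("threshold is 16 MiB:", threshold_is_16_mib)
print("server headroom (bytes):", headroom_bytes)
print("bytes over threshold:", over_threshold)
print("bytes over server limit:", over_server_limit)
print("reported size (MiB):", REPORTED_SIZE / MIB)
```

The rejected object is roughly 4.6 MiB over both the configured threshold and the limit quoted by the server. Its first element is an `_id` with a `_data` field, which looks like a change event's resume token, so the size in the message appears to refer to the change event as a whole rather than to `fullDocument` alone.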

       

      Do you see the same behaviour using the latest released Debezium version?

      (Ideally, also verify with latest Alpha/Beta/CR version)

      Yes.

      Do you have the connector logs, ideally from start till finish?

      As this is a production service, we cannot trigger the issue at will. The oversized messages appear intermittently. Our current workaround is to clear the worker topics, which forces the connector to start from the current position and skip the offending message. We have the logs from the previous incidents, but they are at the INFO level, not DEBUG/TRACE.

      How to reproduce the issue using our tutorial deployment?

      We think that the connector setup itself contains no special elements.

       

          People

            Unassigned Unassigned
            ognjen-j Ognjen Joldzic (Inactive)