Type: Bug
Resolution: Obsolete
Priority: Major
Affects Version: 2.4.1.Final
Severity: Important
The issue is marked as a bug, but it could be a case of misconfiguration on our part.
Bug report
We have a MongoDB connector whose captured collection contains some documents that cause the BSONObjectTooLarge error during normal operation of Debezium. As per the documentation for the most recent version, we added the fields required to skip these messages to the config; however, the error still persists. It is possible that we have misconfigured some fields in the connector config, as it looks like this configuration is ignored.
What Debezium connector do you use and what version?
Kubernetes Deployment based on the debezium/connect:2.4.1.Final image
What is the connector configuration?
{ "name": "cdc_cdc_rds_mongodb_endpoint_n_0", "config": { "connector.class": "io.debezium.connector.mongodb.MongoDbConnector", "collection.include.list": "^rds\\.item_data$", "max.queue.size": "8192", "signal.consumer.sasl.mechanism": "SCRAM-SHA-512", "mongodb.connection.string": "<redacted-valid-connection-string>", "mongodb.password": "<redacted>", "tasks.max": "1", "transforms": "route", "next.restart": "-1", "topic.heartbeat.prefix": "cdc-heartbeat", "cursor.oversize.skip.threshold": "16777216", "transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter", "topic.prefix": "cdc_cdc_rds_mongodb_endpoint_n_0", "transforms.route.regex": "([^\\.]*)\\.([^\\.]*)\\.(.*)", "mongodb.authsource": "rds", "transforms.route.replacement": "cdc-data-$2-$3", "signal.consumer.group.id": "cdc-cdc-rds-mongodb-endpoint-n-0", "cursor.pipeline": "[{ '$match': { '$and': [{'$expr': { '$lte': [{'$bsonSize': '$fullDocument'}, 16777216]}}, {'$expr': { '$lte': [{'$bsonSize': '$fullDocumentBeforeChange'}, 16777216]}}]} }]", "signal.kafka.bootstrap.servers": "<redacted>", "cursor.pipeline.order": "user_first", "signal.kafka.topic": "cdc-signal.cdc-cdc-rds-mongodb-endpoint-n-0", "mongodb.user": "navarch", "heartbeat.interval.ms": "2000", "mongodb.name": "cdc_cdc_rds_mongodb_endpoint_n_0", "version": "eae7bab8d0e7efa05bad682b78be50cdd7b2b8731b580382ffeff0ba4583d28f2b178ae0bd9393672c937cb730c0d747d24dfe5fc9dbfd496f99afb95784040b", "signal.consumer.sasl.jaas.config": "org.apache.kafka.common.security.scram.ScramLoginModule required username=\"cdc\" password=\"<redacted>\";", "cursor.oversize.handling.mode": "skip", "name": "cdc_cdc_rds_mongodb_endpoint_n_0", "max.batch.size": "2048", "signal.consumer.security.protocol": "SASL_SSL", "snapshot.mode": "never", "connect.timeout.ms": "120000", "signal.kafka.poll.timeout.ms": "5000" }, "tasks": [ { "connector": "cdc_cdc_rds_mongodb_endpoint_n_0", "task": 0 } ], "type": "source" }
What is the captured database version and mode of deployment?
On-premise MongoDB 6.0
What behaviour do you expect?
The oversized messages should simply be skipped.
What behaviour do you see?
The connector throws an exception with the following message:
Caused by: com.mongodb.MongoCommandException: Command failed with error 10334 (BSONObjectTooLarge): 'PlanExecutor error during aggregation :: caused by :: BSONObj size: 21648261 (0x14A5385) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: "8265807427000000992B022C0100296E5A1004C640ECE10E3449F68DEB7B0962474E6246645F69640064655CB1169B71D38AFF5D6C650004" }' on server mongo-rds-node-1.c.prod-persistence-real.internal:27017. The full response is {"ok": 0.0, "errmsg": "PlanExecutor error during aggregation :: caused by :: BSONObj size: 21648261 (0x14A5385) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: \"8265807427000000992B022C0100296E5A1004C640ECE10E3449F68DEB7B0962474E6246645F69640064655CB1169B71D38AFF5D6C650004\" }", "code": 10334, "codeName": "BSONObjectTooLarge", "$clusterTime": {"clusterTime": {"$timestamp": {"t": 1702924868, "i": 276}}, "signature": {"hash": {"$binary": {"base64": "KmlhjKxAaFXl8eeYHohBcTx+qMw=", "subType": "00"}}, "keyId": 7257082665252159808}}, "operationTime": {"$timestamp": {"t": 1702924868, "i": 275}}}
com.mongodb.MongoCommandException: Command failed with error 10334 (BSONObjectTooLarge): 'PlanExecutor error during aggregation :: caused by :: BSONObj size: 21648261 (0x14A5385) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: "8265807427000000992B022C0100296E5A1004C640ECE10E3449F68DEB7B0962474E6246645F69640064655CB1169B71D38AFF5D6C650004" }' on server <redacted>. The full response is {"ok": 0.0, "errmsg": "PlanExecutor error during aggregation :: caused by :: BSONObj size: 21648261 (0x14A5385) is invalid. Size must be between 0 and 16793600(16MB) First element: _id: { _data: \"8265807427000000992B022C0100296E5A1004C640ECE10E3449F68DEB7B0962474E6246645F69640064655CB1169B71D38AFF5D6C650004\" }", "code": 10334, "codeName": "BSONObjectTooLarge", "$clusterTime": {"clusterTime": {"$timestamp": {"t": 1702924868, "i": 276}}, "signature": {"hash": {"$binary": {"base64": "KmlhjKxAaFXl8eeYHohBcTx+qMw=", "subType": "00"}}, "keyId": 7257082665252159808}}, "operationTime": {"$timestamp": {"t": 1702924868, "i": 275}}}
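As a side note, a rough sketch of the kind of diagnostic that can list documents large enough that a change event carrying both pre- and post-images would exceed the 16 MiB limit (MongoDB Java driver; the connection details, class name and the 8 MiB cut-off are placeholders of ours, not part of the actual setup):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import org.bson.Document;

import java.util.List;

public class FindOversizedDocs {
    public static void main(String[] args) {
        // Placeholder connection string; the real one is redacted above.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            // Project each document's BSON size and keep those over 8 MiB, i.e. documents
            // whose update event with pre- and post-image could exceed the 16 MiB limit.
            List<Document> pipeline = List.of(
                    Document.parse("{ '$project': { 'size': { '$bsonSize': '$$ROOT' } } }"),
                    Document.parse("{ '$match': { 'size': { '$gt': 8388608 } } }"));

            for (Document doc : client.getDatabase("rds")
                                      .getCollection("item_data")
                                      .aggregate(pipeline)) {
                System.out.println(doc.get("_id") + " -> " + doc.get("size") + " bytes");
            }
        }
    }
}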
Do you see the same behaviour using the latest released Debezium version?
(Ideally, also verify with latest Alpha/Beta/CR version)
Yes.
Do you have the connector logs, ideally from start till finish?
As this is a production service, we cannot trigger the issue at will; the oversized messages appear intermittently. The current course of action is to clear the worker topics, thereby forcing the connector to resume from the current position and skip the offending message. We have the logs from the previous incidents, but they are at INFO level, not DEBUG/TRACE.
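The clearing step itself is nothing elaborate; roughly the following, assuming the worker stores source offsets in the default connect-offsets topic (the topic name, bootstrap address and class name here are placeholders, and the worker is restarted afterwards so that the topic is recreated):

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;

import java.util.List;
import java.util.Properties;

public class ClearConnectOffsets {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Placeholder bootstrap servers; the real ones are redacted above.
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        try (Admin admin = Admin.create(props)) {
            // Deleting the worker's offsets topic means that, after a restart, the
            // connector has no stored offset and resumes from the current oplog
            // position, skipping past the offending change event.
            admin.deleteTopics(List.of("connect-offsets")).all().get();
        }
    }
}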
How to reproduce the issue using our tutorial deployment?
We believe the connector setup itself contains nothing special.
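Our assumption (not verified against the tutorial deployment) is that the error can be provoked with any captured collection that has pre-/post-images enabled and documents large enough that a single change event exceeds 16 MiB; roughly along these lines (MongoDB Java driver, placeholder connection details and class name, and it assumes rds.item_data already exists):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.MongoDatabase;
import com.mongodb.client.model.Updates;
import org.bson.Document;

public class ReproduceOversizedEvent {
    public static void main(String[] args) {
        // Placeholder connection string; point it at the tutorial MongoDB instance.
        try (MongoClient client = MongoClients.create("mongodb://localhost:27017")) {
            MongoDatabase db = client.getDatabase("rds");

            // Enable pre-/post-images on the existing collection so that update events
            // also carry fullDocumentBeforeChange, as our cursor.pipeline expects.
            db.runCommand(new Document("collMod", "item_data")
                    .append("changeStreamPreAndPostImages", new Document("enabled", true)));

            MongoCollection<Document> coll = db.getCollection("item_data");

            // Insert a document of roughly 12 MiB, well under the 16 MiB document limit.
            String payload = "x".repeat(12 * 1024 * 1024);
            coll.insertOne(new Document("_id", "oversize-repro").append("payload", payload));

            // Then update it. The resulting change event, carrying both the pre-image and
            // the looked-up post-image, should exceed 16 MiB and trigger BSONObjectTooLarge
            // on any change stream that requests both images, as the connector does.
            coll.updateOne(new Document("_id", "oversize-repro"),
                    Updates.set("marker", System.currentTimeMillis()));
        }
    }
}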