Debezium / DBZ-6871

ChangeStream aggregation pipeline fails on large documents which should be excluded

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 2.4.0.Beta2
    • Affects Version/s: 2.3.3.Final, 2.4.0.Beta1
    • Component/s: mongodb-connector
    • Important

      The first stage of our internal aggregation pipeline wraps the entire change event document in order to create the namespace field (our collection include/exclude lists operate on the fully qualified collection name):

       { "$replaceRoot": { "newRoot": { "event": "$$ROOT", "namespace": { "$concat": [ "$ns.db", ".", "$ns.coll" ] }}}},
      

      Because our change stream is deployment / database scoped, this has the unfortunate side effect that the pipeline fails whenever the full change event exceeds the maximum BSON document size (16 MB), regardless of whether the document would have been filtered out by the include / exclude lists.
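
To illustrate what the `$replaceRoot` stage above does to each event, here is a rough Python emulation. It is only a sketch for reasoning about the problem, not Debezium code: the function name `add_namespace` and the sample event are hypothetical.

```python
def add_namespace(change_event: dict) -> dict:
    """Emulate the $replaceRoot stage: wrap the event and add a namespace.

    Hypothetical helper for illustration; the real work happens server-side
    in MongoDB's aggregation pipeline.
    """
    ns = change_event["ns"]
    return {
        # "$$ROOT": the whole original event is embedded unchanged
        "event": change_event,
        # "$concat": ["$ns.db", ".", "$ns.coll"]
        "namespace": f'{ns["db"]}.{ns["coll"]}',
    }


# Made-up sample event with the fields the stage relies on
sample = {
    "operationType": "insert",
    "ns": {"db": "inventory", "coll": "customers"},
    "fullDocument": {"_id": 1, "name": "Anne"},
}

wrapped = add_namespace(sample)
print(wrapped["namespace"])
```

Since the wrapped result embeds the complete event, it is always slightly larger than the input. An event that is already close to the BSON limit therefore fails in this stage, before any include / exclude filtering on `namespace` can run.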

      A seemingly robust solution is to add the following stage as the first stage of the effective aggregation pipeline:

      [
          {
              "$match": {
                  "$and": [
                      {
                          "$expr": {
                              "$lte": [
                                  {
                                      "$bsonSize": "$fullDocument"
                                  },
                                  8000
                              ]
                          }
                      },
                      {
                          "$expr": {
                              "$lte": [
                                  {
                                      "$bsonSize": "$fullDocumentBeforeChange"
                                  },
                                  8000
                              ]
                          }
                      }
                  ]
              }
          }
      ]
      

      where 8000 is an arbitrary, configurable threshold in bytes.
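
A minimal sketch of how the proposed stage could be generated from a configurable threshold and prepended to an existing pipeline. The helper names `size_guard_stage` and `with_size_guard` are hypothetical and are not taken from the connector's code base; the sketch only builds the pipeline document and does not talk to MongoDB.

```python
import json


def size_guard_stage(max_bytes: int) -> dict:
    """Build a $match stage that keeps only events whose document images
    are at most max_bytes in size (hypothetical helper)."""

    def within_limit(field: str) -> dict:
        return {"$expr": {"$lte": [{"$bsonSize": field}, max_bytes]}}

    return {
        "$match": {
            "$and": [
                within_limit("$fullDocument"),
                within_limit("$fullDocumentBeforeChange"),
            ]
        }
    }


def with_size_guard(pipeline: list, max_bytes: int) -> list:
    """Return a new pipeline with the size guard as its first stage."""
    return [size_guard_stage(max_bytes), *pipeline]


# The namespace stage quoted in this issue
namespace_stage = {
    "$replaceRoot": {
        "newRoot": {
            "event": "$$ROOT",
            "namespace": {"$concat": ["$ns.db", ".", "$ns.coll"]},
        }
    }
}

effective = with_size_guard([namespace_stage], 8000)
print(json.dumps(effective[0], indent=2))
```

Placing the guard first means oversized events are dropped before `$replaceRoot` has to copy them. Note that this skips such events entirely, so the threshold should be chosen with that trade-off in mind.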

              jcechace@redhat.com Jakub Čecháček