  1. Debezium
  2. DBZ-6871

ChangeStream aggregation pipeline fails on large documents which should be excluded


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 2.4.0.Beta2
    • 2.3.3.Final, 2.4.0.Beta1
    • mongodb-connector
    • None
    • False
    • None
    • False
    • Important

      The first stage of our internal aggregation pipeline manipulates the entire change event document in order to create the namespace field (since our collection include/exclude lists operate on fully qualified collection names):

       { "$replaceRoot": { "newRoot": { "event": "$$ROOT", "namespace": { "$concat": [ "$ns.db", ".", "$ns.coll" ] }}}},
      

      Combined with the fact that our change stream is deployment- or database-scoped, this has the unfortunate side effect of failing when the full change event exceeds the maximum BSON size (16 MB), regardless of whether the document should be filtered out based on the include/exclude list properties.
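
      To make the ordering problem concrete, here is a minimal local emulation in Python of what the `$replaceRoot` stage quoted above does to a change event. It is only a sketch: the function names, the sample event, and the exact-match include list are hypothetical simplifications, not the connector's actual code.

```python
def replace_root_with_namespace(event: dict) -> dict:
    """Emulate the $replaceRoot stage locally (illustrative only)."""
    ns = event["ns"]
    return {
        # "$$ROOT": the whole change event, including any large fullDocument
        "event": event,
        # "$concat": ["$ns.db", ".", "$ns.coll"]
        "namespace": f'{ns["db"]}.{ns["coll"]}',
    }


def is_included(namespace: str, include_list: list) -> bool:
    """Hypothetical include check; simplified here to exact matching."""
    return namespace in include_list


# Hypothetical change event from a collection that is NOT on the include list.
sample_event = {
    "operationType": "insert",
    "ns": {"db": "inventory", "coll": "orders"},
    "fullDocument": {"_id": 1, "payload": "x" * 32},
}

wrapped = replace_root_with_namespace(sample_event)
print(wrapped["namespace"])
print(is_included(wrapped["namespace"], ["inventory.customers"]))
```

      The namespace only becomes available after the whole event has been wrapped, so the include/exclude decision cannot protect the pipeline from an oversized event in an excluded collection.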

      A seemingly robust solution is to add the following stage as the first stage of the effective aggregation pipeline:

      [
          {
              "$match": {
                  "$and": [
                      {
                          "$expr": {
                              "$lte": [
                                  {
                                      "$bsonSize": "$fullDocument"
                                  },
                                  8000
                              ]
                          }
                      },
                      {
                          "$expr": {
                              "$lte": [
                                  {
                                      "$bsonSize": "$fullDocumentBeforeChange"
                                  },
                                  8000
                              ]
                          }
                      }
                  ]
              }
          }
      ]
      

      where 8000 is an arbitrary, configurable value (a size in bytes).
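
      As a rough sketch of how the proposed stage could be generated from a configurable threshold and prepended to an existing pipeline, consider the following Python snippet. The helper names `size_guard_stage` and `with_size_guard` and the `max_bytes` parameter are assumptions made for illustration; they do not correspond to any existing Debezium option or API.

```python
import json


def size_guard_stage(max_bytes: int = 8000) -> dict:
    """Build the $match stage from the proposal for a given size limit in bytes."""

    def within_limit(field: str) -> dict:
        # {"$expr": {"$lte": [{"$bsonSize": <field>}, <max_bytes>]}}
        return {"$expr": {"$lte": [{"$bsonSize": field}, max_bytes]}}

    return {
        "$match": {
            "$and": [
                within_limit("$fullDocument"),
                within_limit("$fullDocumentBeforeChange"),
            ]
        }
    }


def with_size_guard(pipeline: list, max_bytes: int = 8000) -> list:
    """Return a new pipeline with the size guard as its first stage."""
    return [size_guard_stage(max_bytes), *pipeline]


# The namespace stage quoted earlier in this issue.
namespace_stage = {
    "$replaceRoot": {
        "newRoot": {
            "event": "$$ROOT",
            "namespace": {"$concat": ["$ns.db", ".", "$ns.coll"]},
        }
    }
}

effective_pipeline = with_size_guard([namespace_stage], max_bytes=8000)
print(json.dumps(effective_pipeline, indent=2))
```

      Placing the guard first means oversized events are dropped before `$replaceRoot` ever touches them; the original pipeline list is left unmodified.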

            jcechace@redhat.com Jakub Čecháček