  1. Debezium
  2. DBZ-3522

Incorrectly identifies primary member of replica set

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version: 1.6.0.Beta2
    • Affects Version: 1.4.2.Final
    • Component: mongodb-connector

      Debezium incorrectly identifies the primary member of a replica set. As a result, Debezium does not always connect to and replicate from the primary's oplog (luck of the draw). Furthermore, if the incorrectly selected replica set host is out of the shard for maintenance or other reasons, Debezium errors out.

      https://github.com/debezium/debezium/blob/1.4/debezium-connector-mongodb/src/main/java/io/debezium/connector/mongodb/MongoUtil.java#L336-L345

          ServerAddress serverAddress = serverDescriptions.get(0).getAddress();

      Instead of assuming that the primary's address is at position zero of the list of servers in the replica set (obtained via the MongoClient created with the replica set members), there should be a loop that traverses all of the serverDescriptions looking for isPrimary == true.
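
      A minimal sketch of the loop suggested above, not the actual Debezium fix. To stay self-contained it does not use the MongoDB driver: Member is a hypothetical stand-in for an entry of serverDescriptions, and PrimaryLookup and findPrimaryAddress are illustrative names. A real change would presumably call isPrimary() and getAddress() on the driver's own description objects.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class PrimaryLookup {

    // Hypothetical stand-in for one entry of serverDescriptions; only the
    // address and the primary flag mentioned in this issue are modelled.
    static final class Member {
        private final String address;
        private final boolean primary;

        Member(String address, boolean primary) {
            this.address = address;
            this.primary = primary;
        }

        String getAddress() { return address; }
        boolean isPrimary() { return primary; }
    }

    // Scans every member instead of trusting position 0 of the list.
    static Optional<String> findPrimaryAddress(List<Member> members) {
        for (Member member : members) {
            if (member.isPrimary()) {
                return Optional.of(member.getAddress());
            }
        }
        // No primary visible (for example during an election): caller decides.
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Assumed example hosts; the primary is deliberately not first.
        List<Member> members = Arrays.asList(
                new Member("mongo-1:27017", false),
                new Member("mongo-2:27017", true),
                new Member("mongo-3:27017", false));
        System.out.println(findPrimaryAddress(members).orElse("no primary found"));
    }
}
```

      Returning an empty Optional when no member reports itself as primary leaves the retry-or-fail decision to the caller, rather than silently falling back to whichever host happens to be first in the list.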

      This issue has led to significant downtime, preventing us from being able to perform maintenance on our sharded cluster. Debezium randomly picks from a list of replica set hosts provided during discovery against our Mongo configuration replica set.

      For example, the maintenance we are currently trying to perform on a replica set member requires a re-sync (up-sizing the disk), so the member must remain in the replica set, but as neither a primary nor a secondary node (it is in start-up mode). The Mongo configuration replica set still reports this node as a member of the replica set to Debezium, as it should. Debezium does not check for the primary node correctly and sometimes selects this node even though it is in start-up mode. Since we have 3 nodes in a replica set, there is a 1 in 3 chance that Debezium will fail during maintenance.

      This issue potentially affects 1.4 and up.

              Assignee: Unassigned
              Reporter: Chris Collingwood (ccollingwood-spireon) (Inactive)