Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: 5.2.7.Final
    • Fix Version/s: None
    • Component/s: State Transfer
    • Labels:
      None
    • Steps to Reproduce:
      In TestingUtil#waitForRehashToComplete, add the two lines marked "+" after the existing trace statement:

      log.trace("Node " + cacheAddress + " finished state transfer.");
      + CacheTopology cacheTopology = stateTransferManager.getCacheTopology();
      + log.warn("Topology: " + cacheTopology.getCurrentCH() + " – " + cacheTopology.getCurrentCH().getRoutingTableAsString());

      Then run RehashAfterPartitionMergeTest and observe the logs after the merge.


      Description

      After a cluster split and merge, the consistent hash is not balanced between the members.

      For example, in a 2-member cluster, one node is the primary owner of every segment after the merge. In a larger cluster, some nodes do not own any data.

      DefaultConsistentHash{numSegments=60, numOwners=2, members=[RehashAfterPartitionMergeTest-NodeB-49100, RehashAfterPartitionMergeTest-NodeA-11552]}

      – 0: 0 1, 1: 0 1, 2: 0 1, 3: 0 1, 4: 0 1, 5: 0 1, 6: 0 1, 7: 0 1, 8: 0 1, 9: 0 1, 10: 0 1, 11: 0 1, 12: 0 1, 13: 0 1, 14: 0 1, 15: 0 1, 16: 0 1, 17: 0 1, 18: 0 1, 19: 0 1, 20: 0 1, 21: 0 1, 22: 0 1, 23: 0 1, 24: 0 1, 25: 0 1, 26: 0 1, 27: 0 1, 28: 0 1, 29: 0 1, 30: 0 1, 31: 0 1, 32: 0 1, 33: 0 1, 34: 0 1, 35: 0 1, 36: 0 1, 37: 0 1, 38: 0 1, 39: 0 1, 40: 0 1, 41: 0 1, 42: 0 1, 43: 0 1, 44: 0 1, 45: 0 1, 46: 0 1, 47: 0 1, 48: 0 1, 49: 0 1, 50: 0 1, 51: 0 1, 52: 0 1, 53: 0 1, 54: 0 1, 55: 0 1, 56: 0 1, 57: 0 1, 58: 0 1, 59: 0 1

      The RehashAfterPartitionMergeTest test case triggers this consistently, but the test does not catch it because it does not check the consistent hash thoroughly enough: it checks RebalancePolicy.isBalanced, which only verifies that each segment has the correct number of owners, not that ownership is evenly distributed across the members.
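
A stricter check could look at how primary ownership is spread, not just at the owner count per segment. The sketch below is a rough illustration only: `RoutingTableBalance`, its methods, and the `maxSpread` threshold are hypothetical names and not part of Infinispan. It assumes the routing table text has the "segment: owner owner, ..." shape shown in the log above, where the first index after the colon is the primary owner.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not an Infinispan class): inspects a routing table
// printed as "segment: owner owner, ..." and reports primary-owner balance.
final class RoutingTableBalance {

    // Counts, per member index, how many segments list that member first.
    static Map<Integer, Integer> primaryCounts(String routingTable, int numMembers) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (int m = 0; m < numMembers; m++) {
            counts.put(m, 0); // members that own nothing must still appear
        }
        for (String entry : routingTable.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length < 2) {
                continue; // skip empty or malformed entries
            }
            String[] owners = parts[1].trim().split("\\s+");
            int primary = Integer.parseInt(owners[0]);
            counts.merge(primary, 1, Integer::sum);
        }
        return counts;
    }

    // Assumed criterion: the busiest and idlest member differ by at most maxSpread primaries.
    static boolean primariesEvenlyDistributed(String routingTable, int numMembers, int maxSpread) {
        Map<Integer, Integer> counts = primaryCounts(routingTable, numMembers);
        int min = Collections.min(counts.values());
        int max = Collections.max(counts.values());
        return max - min <= maxSpread;
    }

    public static void main(String[] args) {
        // Shortened version of the post-merge table: member 0 is primary everywhere.
        String afterMerge = "0: 0 1, 1: 0 1, 2: 0 1, 3: 0 1";
        System.out.println(primaryCounts(afterMerge, 2));                 // {0=4, 1=0}
        System.out.println(primariesEvenlyDistributed(afterMerge, 2, 1)); // false
    }
}
```

Applied to the 60-segment table above, such a check would report 60 primaries for member 0 and none for member 1, which an owner-count check alone does not reveal. The actual balance criteria in Infinispan may differ (for example, they may weigh backup owners as well), so the threshold here is only a placeholder.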

              People

              • Assignee: Petr Jurak (pjurak)
              • Reporter: Dennis Reed (dereed)
              • Votes: 0
              • Watchers: 2
