Infinispan / ISPN-925

race condition in DistributionManagerImpl


      This is causing StateTransferLargeObjectTest to fail intermittently (about 1 in 500 runs).

      Nasty race condition in DistributionManagerImpl:
      1. If a node leaves, the new consistent hash is set first ( consistentHash = ConsistentHashHelper.removeAddress(consistentHash, leaver, configuration, topologyInfo) ).
      2. Then an InvertedLeaveTask is triggered if needed.
      3. This adds the leaver to DMI.leavers and sets the rehashInProgress flag of DistributionManagerImpl to true (RehashTask.call).

      Now if a get call happens between steps 1 and 3, the system does not go remote, although the consistent hash has already changed. The "go remote if there's a rehash going on" condition is in DistInterceptor.visitGetKeyValueCommand:
      boolean isRehashInProgress = !dm.isJoinComplete() || dm.isRehashInProgress();
      If isRehashInProgress is true, we go remote even if the key is mapped to the local node. Between steps 1 and 3 the flag is still false, so a key that the new consistent hash already maps to the local node is only looked up locally. Nasty!
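
      The window can be illustrated with a small single-threaded model. This is only a sketch and not Infinispan code: ToyDistributionManager, its fields, the modulo-based ownership rule and goesRemote are all hypothetical stand-ins, and the remote/local decision is assumed to be "rehash in progress, or key not local" based on the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of the race window; not the real DistributionManagerImpl. */
public class RehashRaceSketch {

    /** Toy stand-in for the state held by the distribution manager (names are assumptions). */
    static class ToyDistributionManager {
        List<String> members = new ArrayList<>(Arrays.asList("A", "B", "C"));
        boolean joinComplete = true;
        boolean rehashInProgress = false;

        /** Step 1: the leaver is dropped from the consistent hash right away. */
        void removeFromConsistentHash(String leaver) {
            List<String> copy = new ArrayList<>(members);
            copy.remove(leaver);
            members = copy;
        }

        /** Step 3: only later does the rehash task raise the flag. */
        void startRehash() {
            rehashInProgress = true;
        }

        /** Toy ownership rule: the owner is the member at index hash(key) mod size. */
        boolean isLocal(String self, String key) {
            int idx = Math.floorMod(key.hashCode(), members.size());
            return members.get(idx).equals(self);
        }
    }

    /** Assumed decision: go remote during a rehash, or when the key is not local. */
    static boolean goesRemote(ToyDistributionManager dm, String self, String key) {
        boolean isRehashInProgress = !dm.joinComplete || dm.rehashInProgress;
        return isRehashInProgress || !dm.isLocal(self, key);
    }

    public static void main(String[] args) {
        ToyDistributionManager dm = new ToyDistributionManager();
        // Before the leave: "k" is owned by C, so node B goes remote.
        System.out.println("before leave:    remote=" + goesRemote(dm, "B", "k"));
        dm.removeFromConsistentHash("C");
        // Between steps 1 and 3: "k" now maps to B and the flag is still false.
        System.out.println("between 1 and 3: remote=" + goesRemote(dm, "B", "k"));
        dm.startRehash();
        // After step 3: the flag forces the remote lookup again.
        System.out.println("after step 3:    remote=" + goesRemote(dm, "B", "k"));
    }
}
```

      In this model the get issued by node B between steps 1 and 3 is the only one that stays local, which matches the window described above. Closing it would mean making the consistent hash update and the flag change visible together, or raising the flag first; which of these fits the real code is left open here.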

            mircea.markus Mircea Markus (Inactive)