- Bug
- Resolution: Done
- Critical
- 5.2.0.Final
BaseDistributionInterceptor.shouldFetchFromRemote() and TxDistributionInterceptor.remoteGetAndStoreInL1() can mistakenly decide not to fetch a key remotely because they check whether the key is present in the data container. If state transfer is in progress for that key, the key may be in the container at the time of the check even though it was absent when the command executed locally. In that case the key should be re-fetched rather than returning the null result of the local read.
This makes StateTransferLargeObjectTest.testForFailure fail randomly.
The failure occurs because state transfer has not finished, yet the distribution interceptor does not fetch the key remotely:
14:06:45,872 TRACE (asyncTransportThread-1,NodeA:___defaultcache) [StateTransferManagerImpl] Installing new cache topology CacheTopology{id=7, currentCH=DefaultConsistentHash{numSegments=60, numOwners=3, members=[NodeA-52814, NodeB-62397, NodeC-63995], owners={0: 0 1 2, 1: 0 1 2, 2: 0 1 2, 3: 0 1 2, 4: 0 1 2, 5: 0 1 2, 6: 0 1 2, 7: 0 1 2, 8: 0 1 2, 9: 0 1 2, 10: 0 1 2, 11: 0 1 2, 12: 0 2, 13: 0 2, 14: 0 2, 15: 0 1, 16: 0 1, 17: 0 1, 18: 0 1, 19: 0 1, 20: 2 0, 21: 2 0, 22: 2 0, 23: 2 0, 24: 2 0, 25: 2 0, 26: 2 0, 27: 2 0, 28: 2 0, 29: 2 0, 30: 1 0 2, 31: 1 0 2, 32: 1 0 2, 33: 1 2, 34: 1 0, 35: 1 2, 36: 1 0, 37: 1 2, 38: 1 0, 39: 1 2, 40: 1 0, 41: 1 2, 42: 1 0, 43: 1 2, 44: 1 0, 45: 1 0, 46: 1 0, 47: 1 0, 48: 1 0, 49: 1 2, 50: 2 1, 51: 2 1, 52: 2 1, 53: 2 1, 54: 2 1, 55: 2 1, 56: 2 1, 57: 2 1, 58: 2 0, 59: 2 0}, pendingCH=DefaultConsistentHash{numSegments=60, numOwners=3, members=[NodeA-52814, NodeB-62397, NodeC-63995], owners={0: 0 1 2, 1: 0 1 2, 2: 0 1 2, 3: 0 1 2, 4: 0 1 2, 5: 0 1 2, 6: 0 1 2, 7: 0 1 2, 8: 0 1 2, 9: 0 1 2, 10: 0 1 2, 11: 0 1 2, 12: 0 2 1, 13: 0 2 1, 14: 0 2 1, 15: 0 1 2, 16: 0 1 2, 17: 0 1 2, 18: 0 1 2, 19: 0 1 2, 20: 2 0 1, 21: 2 0 1, 22: 2 0 1, 23: 2 0 1, 24: 2 0 1, 25: 2 0 1, 26: 2 0 1, 27: 2 0 1, 28: 2 0 1, 29: 2 0 1, 30: 1 0 2, 31: 1 0 2, 32: 1 0 2, 33: 1 2 0, 34: 1 0 2, 35: 1 2 0, 36: 1 0 2, 37: 1 2 0, 38: 1 0 2, 39: 1 2 0, 40: 1 0 2, 41: 1 2 0, 42: 1 0 2, 43: 1 2 0, 44: 1 0 2, 45: 1 0 2, 46: 1 0 2, 47: 1 0 2, 48: 1 0 2, 49: 1 2 0, 50: 2 1 0, 51: 2 1 0, 52: 2 1 0, 53: 2 1 0, 54: 2 1 0, 55: 2 1 0, 56: 2 1 0, 57: 2 1 0, 58: 2 0 1, 59: 2 0 1}} on cache ___defaultcache
14:06:46,287 TRACE (OOB-9,ISPN,NodeA-52814:___defaultcache) [StateConsumerImpl] Received keys [0, 2, 10, 94, 103, 117, 187, 189, 288, 305, 307, 376, 481, 487, 502, 729, 771, 994] for segment 50 of cache ___defaultcache from node NodeB-62397
14:06:46,338 INFO (testng-StateTransferLargeObjectTest:) [StateTransferLargeObjectTest] ----Running a get on 10
14:06:46,351 TRACE (OOB-9,ISPN,NodeA-52814:___defaultcache ___defaultcache) [InvocationContextInterceptor] Invoked with command PutKeyValueCommand{key=10, value=org.infinispan.statetransfer.BigObject@6ed3126d, flags=[CACHE_MODE_LOCAL, SKIP_REMOTE_LOOKUP, PUT_FOR_STATE_TRANSFER, SKIP_SHARED_CACHE_STORE, SKIP_OWNERSHIP_CHECK, IGNORE_RETURN_VALUES, SKIP_XSITE_BACKUP], putIfAbsent=false, lifespanMillis=-1, maxIdleTimeMillis=-1, successful=true} and InvocationContext [org.infinispan.context.impl.LocalTxInvocationContext@2c6dd013]
14:06:46,358 TRACE (testng-StateTransferLargeObjectTest:___defaultcache) [GetKeyValueCommand] Entry not found
14:06:46,358 TRACE (testng-StateTransferLargeObjectTest:___defaultcache) [BaseDistributionInterceptor] Not doing a remote get for key 10 since entry is mapped to current node (NodeA-52814), or is in L1. Owners are [NodeC-63995, NodeB-62397, NodeA-52814]
14:06:46,359 ERROR (testng-StateTransferLargeObjectTest:) [UnitTestTestNGListener] Test testForFailure(org.infinispan.statetransfer.StateTransferLargeObjectTest) failed.
java.lang.AssertionError: expected object to not be null
    at org.testng.Assert.fail(Assert.java:89)
    at org.testng.Assert.assertNotNull(Assert.java:399)
    at org.testng.Assert.assertNotNull(Assert.java:384)
    at org.infinispan.statetransfer.StateTransferLargeObjectTest.assertValue(StateTransferLargeObjectTest.java:145)
    at org.infinispan.statetransfer.StateTransferLargeObjectTest.testForFailure(StateTransferLargeObjectTest.java:115)
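
The interleaving described above (local read misses, state transfer then stores the key, the presence check then suppresses the remote get) can be reproduced in a small single-threaded simulation. This is only a sketch and not Infinispan code: the class RemoteFetchSketch, its two maps, and both method names are hypothetical stand-ins, and the second method illustrates one possible direction (deciding from the result of the local read) rather than the fix that was actually applied.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the interceptor logic; none of these names exist in Infinispan.
class RemoteFetchSketch {
    // Stand-in for the local data container on the reading node.
    final Map<String, String> dataContainer = new HashMap<>();
    // Stand-in for the copy of the data held by a remote owner.
    final Map<String, String> remoteOwner = new HashMap<>();

    // Flawed decision: after a local miss, re-check the container and skip the
    // remote get if the key is present now. The hook simulates state transfer
    // storing the key between the local read and the check.
    String getWithContainerCheck(String key, Runnable stateTransferHook) {
        String local = dataContainer.get(key);   // local execution: entry not found
        stateTransferHook.run();                 // state transfer applies the key here
        if (local == null && !dataContainer.containsKey(key)) {
            return remoteOwner.get(key);
        }
        return local;                            // still null: the stale local result leaks out
    }

    // Alternative decision (an assumption, for illustration): base the remote get
    // on what the local read returned, not on the container's current contents.
    String getFromReadResult(String key, Runnable stateTransferHook) {
        String local = dataContainer.get(key);
        stateTransferHook.run();
        if (local == null) {
            return remoteOwner.get(key);
        }
        return local;
    }

    public static void main(String[] args) {
        RemoteFetchSketch node = new RemoteFetchSketch();
        node.remoteOwner.put("10", "big-object");
        Runnable transfer = () -> node.dataContainer.put("10", "big-object");

        System.out.println("container check: " + node.getWithContainerCheck("10", transfer));
        node.dataContainer.clear();              // reset to the pre-transfer state
        System.out.println("read result:     " + node.getFromReadResult("10", transfer));
    }
}
```

With the container check, the read returns null even though a remote owner holds the value, which matches the "Entry not found" followed by "Not doing a remote get" sequence in the log. A real interceptor would also have to account for keys that are legitimately absent everywhere, which this sketch ignores.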