Resolution: Cannot Reproduce
4.0.0.ALPHA1, 5.1.6.FINAL
I encountered many TimeoutExceptions while running a load test against an Infinispan cluster.
I tracked down the cause and found that the code in org.infinispan.util.concurrent.locks.containers.AbstractPerEntryLockContainer#releaseLock() causes these superfluous TimeoutExceptions.
A small test case prints timeouts and too-late timeouts and "paints" a lot of dots to the console; more dots per second on the console means better throughput.
In a short test I extended the class ReentrantPerEntryLockContainer and changed the implementation of releaseLock() as follows:
    public void releaseLock(Object lockOwner, Object key) {
        ReentrantLock l = locks.get(key);
        if (l != null) {
            if (!l.isHeldByCurrentThread())
                throw new IllegalStateException("Lock for [" + key + "] not held by current thread " + Thread.currentThread());
            while (l.isHeldByCurrentThread())
                unlock(l, lockOwner);
            if (!l.hasQueuedThreads())
                locks.remove(key);
        } else
            throw new IllegalStateException("No lock for [" + key + ']');
    }
The main improvement is that locks are not removed from the concurrent map as long as other threads are waiting on that lock.
If the lock is removed from the map while other threads are waiting for it, they may run into timeouts, and TimeoutExceptions are then thrown to the client.
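To illustrate the idea outside of Infinispan, here is a minimal, self-contained sketch. PerKeyLockSketch and all of its methods are hypothetical stand-ins invented for this example, not Infinispan classes. One deliberate difference from the method above: the sketch checks for queued threads before the final unlock rather than after it, so that the outcome of the small demo is deterministic.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a per-entry lock container; not Infinispan's actual class.
class PerKeyLockSketch {
    private final ConcurrentMap<Object, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Acquires the lock for key, creating it on first use. Returns false on timeout or interrupt.
    boolean acquire(Object key, long timeout, TimeUnit unit) {
        ReentrantLock fresh = new ReentrantLock();
        ReentrantLock existing = locks.putIfAbsent(key, fresh);
        ReentrantLock l = (existing != null) ? existing : fresh;
        try {
            return l.tryLock(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Releases every hold of the current thread; drops the map entry only if nobody is queued.
    void release(Object key) {
        ReentrantLock l = locks.get(key);
        if (l == null) throw new IllegalStateException("No lock for [" + key + "]");
        if (!l.isHeldByCurrentThread())
            throw new IllegalStateException("Lock for [" + key + "] not held by " + Thread.currentThread());
        // Checked while still holding the lock: queued threads cannot acquire it in the meantime.
        if (!l.hasQueuedThreads()) locks.remove(key, l);
        while (l.isHeldByCurrentThread()) l.unlock();
    }

    boolean isTracked(Object key) { return locks.containsKey(key); }

    boolean hasWaiters(Object key) {
        ReentrantLock l = locks.get(key);
        return l != null && l.hasQueuedThreads();
    }

    // Returns {entry kept while a waiter was queued, waiter acquired the lock, entry removed at the end}.
    static boolean[] runDemo() {
        PerKeyLockSketch c = new PerKeyLockSketch();
        CountDownLatch acquired = new CountDownLatch(1);
        CountDownLatch done = new CountDownLatch(1);
        try {
            if (!c.acquire("k", 1, TimeUnit.SECONDS)) throw new IllegalStateException("unexpected timeout");
            Thread waiter = new Thread(() -> {
                if (!c.acquire("k", 10, TimeUnit.SECONDS)) return;
                acquired.countDown();
                try {
                    done.await(); // hold the lock until the main thread has inspected the map
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    c.release("k");
                }
            });
            waiter.start();
            while (!c.hasWaiters("k")) Thread.sleep(1); // wait until the waiter is queued
            c.release("k");
            boolean keptForWaiter = c.isTracked("k");
            boolean waiterGotLock = acquired.await(10, TimeUnit.SECONDS);
            done.countDown();
            waiter.join();
            return new boolean[] { keptForWaiter, waiterGotLock, !c.isTracked("k") };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = runDemo();
        System.out.println("kept=" + r[0] + " acquired=" + r[1] + " removed=" + r[2]);
    }
}
```

In the demo, the map entry survives the first release because a waiter is queued, the waiter then obtains the same lock instance, and the entry disappears once the last holder releases it. Note that hasQueuedThreads() does not count a thread that has already fetched the lock from the map but has not yet queued on it, so this approach narrows the window rather than closing it completely.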
The above method "paints more dots per second", meaning it gives better throughput for concurrent accesses to the same key.
The re-implemented method should also fix some replication timeout exceptions.
Please, please add this to 5.1.7, if possible.
- blocks
ISPN-2244 Transparently hold serialized representations of keys and values
- To Do