-
Bug
-
Resolution: Done
-
Blocker
-
4.3.0.Final
-
None
ModeShape uses several background GC tasks to cleanup repository data. One such task is the JCR lock garbage collector which cleans up orphaned locks from the repository. These tasks run periodically at either a preset or configured time interval.
When running in a cluster however, it can very well happen that these GC jobs fire all at the same time (or very close together) so that essentially all compete for the same ISPN data. In the case of the Lock GC collector, all of these tasks will attempt to lock the mode:locks root parent key first. If this happens, depending on how long a job takes, ISPN may fire TimeoutExceptions for the competing locks:
00:05:47,157 ERROR [org.modeshape.jcr.RepositoryLockManager] (modeshape-gc-8-thread-1) Error while cleaning up locks for the "dv" repository: org.modeshape.jcr.cache.WrappedException: org.modeshape.jcr.TimeoutException: Unable to acquire storage locks: [d7d2e6b317f1e7mode:locks] at org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:529) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3] at org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:421) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3] at org.modeshape.jcr.RepositoryLockManager.cleanupLocks(RepositoryLockManager.java:179) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3] at org.modeshape.jcr.JcrRepository$RunningState.cleanUpLocks(JcrRepository.java:1797) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3] at org.modeshape.jcr.JcrRepository$LockGarbageCollectionTask.doRun(JcrRepository.java:2132) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3] at org.modeshape.jcr.JcrRepository$BackgroundRepositoryTask.run(JcrRepository.java:2102) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_75] at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [rt.jar:1.7.0_75] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [rt.jar:1.7.0_75] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_75] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_75] at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_75] Caused by: org.modeshape.jcr.TimeoutException: Unable to acquire storage locks: [d7d2e6b317f1e7mode:locks] at org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:488) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3] ... 12 more Caused by: org.infinispan.util.concurrent.TimeoutException: Unable to acquire storage locks: [d7d2e6b317f1e7mode:locks] at org.modeshape.jcr.cache.document.WritableSessionCache.lockNodes(WritableSessionCache.java:1449) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3] at org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:451) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3] ... 12 more
We should make sure that for the GC jobs, whenever a TimeoutException occurs due to ISPN not being able to lock an entry, the exception is only logged and does not break the underlying node. The GC job is then re-ran later on, based on the configured timing.