Uploaded image for project: 'ModeShape'
  1. ModeShape
  2. MODE-2485

Infinispan lock timeout exceptions should not break repository GC jobs

    XMLWordPrintable

Details

    Description

      ModeShape uses several background GC tasks to cleanup repository data. One such task is the JCR lock garbage collector which cleans up orphaned locks from the repository. These tasks run periodically at either a preset or configured time interval.

      When running in a cluster however, it can very well happen that these GC jobs fire all at the same time (or very close together) so that essentially all compete for the same ISPN data. In the case of the Lock GC collector, all of these tasks will attempt to lock the mode:locks root parent key first. If this happens, depending on how long a job takes, ISPN may fire TimeoutExceptions for the competing locks:

      00:05:47,157 ERROR [org.modeshape.jcr.RepositoryLockManager] (modeshape-gc-8-thread-1) Error while cleaning up locks for the "dv" repository: org.modeshape.jcr.cache.WrappedException: org.modeshape.jcr.TimeoutException: Unable to acquire storage locks: [d7d2e6b317f1e7mode:locks]
      	at org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:529) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3]
      	at org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:421) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3]
      	at org.modeshape.jcr.RepositoryLockManager.cleanupLocks(RepositoryLockManager.java:179) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3]
      	at org.modeshape.jcr.JcrRepository$RunningState.cleanUpLocks(JcrRepository.java:1797) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3]
      	at org.modeshape.jcr.JcrRepository$LockGarbageCollectionTask.doRun(JcrRepository.java:2132) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3]
      	at org.modeshape.jcr.JcrRepository$BackgroundRepositoryTask.run(JcrRepository.java:2102) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3]
      	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [rt.jar:1.7.0_75]
      	at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [rt.jar:1.7.0_75]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [rt.jar:1.7.0_75]
      	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [rt.jar:1.7.0_75]
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [rt.jar:1.7.0_75]
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [rt.jar:1.7.0_75]
      	at java.lang.Thread.run(Thread.java:745) [rt.jar:1.7.0_75]
      Caused by: org.modeshape.jcr.TimeoutException: Unable to acquire storage locks: [d7d2e6b317f1e7mode:locks]
      	at org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:488) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3]
      	... 12 more
      Caused by: org.infinispan.util.concurrent.TimeoutException: Unable to acquire storage locks: [d7d2e6b317f1e7mode:locks]
      	at org.modeshape.jcr.cache.document.WritableSessionCache.lockNodes(WritableSessionCache.java:1449) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3]
      	at org.modeshape.jcr.cache.document.WritableSessionCache.save(WritableSessionCache.java:451) [modeshape-jcr-3.8.3.GA-redhat-3.jar:3.8.3.GA-redhat-3]
      	... 12 more
      

      We should make sure that for the GC jobs, whenever a TimeoutException occurs due to ISPN not being able to lock an entry, the exception is only logged and does not break the underlying node. The GC job is then re-ran later on, based on the configured timing.

      Attachments

        Activity

          People

            hchiorean Horia Chiorean (Inactive)
            hchiorean Horia Chiorean (Inactive)
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: