ModeShape / MODE-1844

Repository restart fails if original start procedure was interrupted/failed


    • Type: Bug
    • Resolution: Done
    • Priority: Blocker
    • Fix Version/s: 3.2.0.Final
    • Affects Version/s: 3.1.3.Final
    • Component/s: Server
    • Labels: None

      Inside RepositoryCache we currently have a flag named initializingRepository which was introduced so that multiple repository nodes can be synchronized within a cluster.

      This flag is set to true only the first time a repository starts up (i.e., on a fresh start).

      However, the following scenario may occur (in a local, non-clustered setup):

      • a repository starts up for the first time. The logic within RepositoryCache#init writes the document containing the initialization data directly to the persistent store (outside of any transaction)
      • between RepositoryCache#init and the completeInitialization() method (which writes the REPOSITORY_INITIALIZED_AT_FIELD_NAME field) something unexpected occurs, so the latter method is never invoked and the field is never written
      • the repository is restarted, but
        a) it sees the REPOSITORY_INFO_KEY document (which was written directly to the persistent store the 1st time)
        b) it cannot find the REPOSITORY_INITIALIZED_AT_FIELD_NAME field
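
      The scenario above can be sketched as a small state check. This is a minimal illustration, not ModeShape's actual code: the class StartupStateSketch, the StartupState enum, the helper methods, and the string values of the two constants are all hypothetical, and a plain in-memory map stands in for the persistent store. Only the names REPOSITORY_INFO_KEY and REPOSITORY_INITIALIZED_AT_FIELD_NAME come from the description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the startup decision described in this issue.
final class StartupStateSketch {
    // Names taken from the issue text; the string values are placeholders.
    static final String REPOSITORY_INFO_KEY = "repository:info";
    static final String REPOSITORY_INITIALIZED_AT_FIELD_NAME = "initializedAt";

    enum StartupState { FRESH_START, WAIT_FOR_OTHER_NODE, ALREADY_INITIALIZED }

    // Decide what a starting node concludes from the contents of the store.
    static StartupState classify(Map<String, Map<String, Object>> store) {
        Map<String, Object> info = store.get(REPOSITORY_INFO_KEY);
        if (info == null) {
            return StartupState.FRESH_START; // nothing written yet
        }
        if (!info.containsKey(REPOSITORY_INITIALIZED_AT_FIELD_NAME)) {
            // Info document exists but was never marked complete:
            // read as "another node is still initializing".
            return StartupState.WAIT_FOR_OTHER_NODE;
        }
        return StartupState.ALREADY_INITIALIZED;
    }

    // First half of startup: the info document is written directly to the store.
    static void writeInfoDocument(Map<String, Map<String, Object>> store) {
        store.put(REPOSITORY_INFO_KEY, new HashMap<>());
    }

    // Second half of startup: mark the initialization as finished.
    static void completeInitialization(Map<String, Map<String, Object>> store, long now) {
        store.get(REPOSITORY_INFO_KEY).put(REPOSITORY_INITIALIZED_AT_FIELD_NAME, now);
    }
}
```

      If writeInfoDocument runs but completeInitialization does not, a later classify call on the same store returns WAIT_FOR_OTHER_NODE even though there is only one node.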

      In other words, it interprets this state as a clustered setup in which another node is performing the initialization. It therefore enters a waiting state (10 minutes), at the end of which it crashes because nothing ever writes the REPOSITORY_INITIALIZED_AT_FIELD_NAME field.
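
      The waiting behaviour can be illustrated with a bounded polling loop. This is only a sketch under assumptions: the class InitializationWaitSketch, the method awaitInitialized, and the polling approach itself are hypothetical, and the issue does not say how ModeShape actually implements the wait. The supplier stands in for "is the REPOSITORY_INITIALIZED_AT_FIELD_NAME field present?".

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical bounded wait for another node to finish initialization.
final class InitializationWaitSketch {
    // Returns true if the field appears before the timeout, false otherwise.
    static boolean awaitInitialized(BooleanSupplier initializedFieldPresent,
                                    long timeoutMillis,
                                    long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            if (initializedFieldPresent.getAsBoolean()) {
                return true; // some node completed the initialization
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // nobody wrote the field in time
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for callers
                return false;
            }
        }
    }
}
```

      In the single-node case described here, no other writer exists, so a call such as awaitInitialized(() -> false, 600_000, 1_000) can only return false after the full timeout has elapsed.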

      I've seen this occur when working with the AS7 kit, when an integration test fails during repository initialization: unless I run "mvn clean" to remove the stored documents from the file system, the repository hangs and eventually crashes on restart.

              Assignee: hchiorean Horia Chiorean (Inactive)
              Reporter: hchiorean Horia Chiorean (Inactive)
              Votes: 0
              Watchers: 2
