  1. ModeShape
  2. MODE-2662

Stale cache results in the inability to lock the node

Details

    • Bug
    • Resolution: Won't Do
    • Major
    • 5.4.0.Final
    • 5.3.0.Final
    • JCR
    • None
    • Refer to the "Steps to Reproduce" section in the description of the JIRA.

    Description

      Steps to Reproduce

      Check out project [1] and run:

      mvn clean verify -Dtest=LockingBehaviorTest#addNodesInOrderNoTransaction
      

      Background

      Let us assume that we have a cluster of 10 members. Using a single thread of execution, consider adding 100 child nodes to the appRoot, so that the repository structure looks like this:

      - jcrRepositoryRoot
        -- appRoot
             --- childNode1
             ...
             --- childNode100
      

      Where:

      • The appRoot and childNodeN are versioned.
      • Every time a new child is about to be added:
        • Lock parent node.
        • Add child.
        • Unlock parent node.

      To simulate load balancing (e.g. round-robin), each request to add a new node is handled by the next available member of the cluster, picked by a circular iterator over the members. For instance, with 3 cluster members and 4 nodes to add:

      childNode1 -> member1
      childNode2 -> member2
      childNode3 -> member3
      childNode4 -> member1
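
The circular selection described above can be sketched in plain Java. This is only an illustration of the round-robin idea, not the code from the test project [1]; the class name RoundRobinMembers and the member names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hands out cluster members in a fixed circular order (hypothetical helper). */
final class RoundRobinMembers {
    private final List<String> members;
    private int next = 0;

    RoundRobinMembers(List<String> members) {
        if (members.isEmpty()) {
            throw new IllegalArgumentException("at least one member is required");
        }
        this.members = new ArrayList<>(members);
    }

    /** Returns the next member, wrapping around after the last one. */
    String nextMember() {
        String member = members.get(next);
        next = (next + 1) % members.size();
        return member;
    }

    public static void main(String[] args) {
        RoundRobinMembers cluster =
                new RoundRobinMembers(Arrays.asList("member1", "member2", "member3"));
        // Assign four child nodes to three members, as in the example mapping.
        for (int i = 1; i <= 4; i++) {
            System.out.println("childNode" + i + " -> " + cluster.nextMember());
        }
    }
}
```

With three members, the fourth request wraps around to the first member again, which is the mapping listed above.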
      

      Problem

      At some point during the creation of the nodes, a LockException is thrown. The exception indicates that the parent node is locked, so a new node cannot be added. It can happen on any node, i.e. the failing node is not deterministic, but the exception occurs consistently. In my understanding, this should not happen unless there is a bug in ModeShape or some misconfiguration of JGroups.

      Questions

      1. With a single thread of execution and sequential successful lock/unlock operations, how is it possible for the parent node to remain locked on the next attempt to add a child node? Could it be that message delivery from one member of the cluster to others is too slow? To exemplify:

      First member of the cluster:

      1. Lock parent node, send notifications.
      2. Add new child node, save, send notifications.
      3. Unlock parent node, send notifications.

      Second member of the cluster:

      1. Attempt to lock parent node. However, the notification about node unlocking from the first member of the cluster has not arrived yet and the local cache indicates a locked status, therefore throw a LockException.

      2. Is this a problem with ModeShape or with the custom JGroups file [2], which may simply be misconfigured in one way or another?
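
The scenario suspected in question 1 can be modelled with a small, self-contained Java sketch. It is a toy model under stated assumptions, not ModeShape's actual locking or replication code: each member keeps a local set of locked paths, lock and unlock notifications reach peers through a queue that is drained later, and IllegalStateException stands in for javax.jcr.lock.LockException. The names Member, StaleLockCacheDemo, join and deliver are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Toy cluster member with a local lock cache that is updated asynchronously. */
final class Member {
    final String name;
    private final Set<String> lockedPaths = new HashSet<>();   // local view of locks
    private final Queue<Runnable> inbox = new ArrayDeque<>();  // notifications not yet applied
    private final List<Member> peers = new ArrayList<>();

    Member(String name) {
        this.name = name;
    }

    /** Connects two members so that each notifies the other. */
    void join(Member peer) {
        peers.add(peer);
        peer.peers.add(this);
    }

    /** Locks the path in the local view and queues a notification for every peer. */
    void lock(String path) {
        if (lockedPaths.contains(path)) {
            // Stand-in for javax.jcr.lock.LockException (assumption of this sketch).
            throw new IllegalStateException(name + ": " + path + " is locked");
        }
        lockedPaths.add(path);
        for (Member p : peers) {
            p.inbox.add(() -> p.lockedPaths.add(path));
        }
    }

    /** Unlocks the path in the local view and queues a notification for every peer. */
    void unlock(String path) {
        lockedPaths.remove(path);
        for (Member p : peers) {
            p.inbox.add(() -> p.lockedPaths.remove(path));
        }
    }

    /** Applies at most max queued notifications, in arrival order. */
    void deliver(int max) {
        for (int i = 0; i < max && !inbox.isEmpty(); i++) {
            inbox.poll().run();
        }
    }
}

final class StaleLockCacheDemo {
    public static void main(String[] args) {
        Member m1 = new Member("member1");
        Member m2 = new Member("member2");
        m1.join(m2);

        m1.lock("/appRoot");    // "locked" notification queued for member2
        // adding the child node and saving would happen here
        m1.unlock("/appRoot");  // "unlocked" notification queued for member2

        m2.deliver(1);          // only the "locked" notification has been applied
        try {
            m2.lock("/appRoot");
        } catch (IllegalStateException e) {
            System.out.println("stale view: " + e.getMessage());
        }

        m2.deliver(1);          // the "unlocked" notification is applied
        m2.lock("/appRoot");    // member2's view has caught up, so this succeeds
        System.out.println("member2 locked /appRoot after catching up");
    }
}
```

In this model the second member fails only while the unlock notification is still pending, which matches the hypothesis but does not by itself show that ModeShape or JGroups behaves this way.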

      [1] https://github.com/dnillia/modeshape-cluster-test
      [2] https://github.com/dnillia/modeshape-cluster-test/blob/master/src/test/resources/test-jgroups.xml

          People

            hchiorean Horia Chiorean (Inactive)
            illia.khokholkov Illia Khokholkov (Inactive)