Red Hat Data Grid / JDG-5087

Deadlock when putAll writes to expired entries with optimistic locking

Details

    • Bug
    • Resolution: Unresolved
    • Major
    • None
    • RHDG 8.2.1 GA
    • API and Configuration
    • None

    Description

      When an entry expires, ClusterExpirationManager invokes cache.removeLifespanExpired() or cache.removeMaxIdleExpired() to remove the entry and to confirm that it has been removed (or that it is still alive). The resulting remove command may or may not try to acquire a lock, depending on whether the expiration was triggered by a read or by a write.

      In transactional caches with optimistic locking, TxClusterExpirationManager goes further: the remove command acquires locks even when the expiration was triggered by a write. This is correct when the write runs on the originator node, but not when the write is replayed on the key owners during the execution of (Versioned)PrepareCommand.

      Most of the time this does not cause problems, because the originator executes the write locally before sending it to the primary owners. If the key is not local, the originator sends a ClusteredGetCommand to the owners, and the entry is expired at that point. As a result, RemoveExpiredCommand only runs during the prepare (and deadlocks) if the entry expires in the short window between the read and the prepare.

      There is an exception, though: PutMapCommand does not read the previous value on the originator unless the originator is an owner of the key (and then only so that it can notify listeners). This means that even if the entry expired long ago, RemoveExpiredCommand is only invoked when the owner replays the PutMapCommand during the execution of PrepareCommand, which causes a deadlock.
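
      The lock interaction described above can be illustrated with a small standalone model. This is only a sketch and not Infinispan code: KeyLockModel and all of its methods are hypothetical names. It assumes that key locks are owned by a transaction or command rather than by a thread, and that the expiration-triggered remove acquires the key lock under an owner id different from the transaction's. The issue does not spell out the second assumption; it is inferred from the reported deadlock. A failed non-blocking acquire stands in for the point where a blocking acquire would wait.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of per-key locks owned by a transaction or command, not by a thread. */
class KeyLockModel {
    // key -> id of the current lock owner
    private final Map<String, String> lockOwners = new HashMap<>();

    /** Non-blocking acquire: succeeds if the key is free or already held by the same owner. */
    boolean tryLock(String key, String owner) {
        String current = lockOwners.putIfAbsent(key, owner);
        return current == null || current.equals(owner);
    }

    /** Releases the key only if it is held by the given owner. */
    void unlock(String key, String owner) {
        lockOwners.remove(key, owner);
    }

    /** Expiration-triggered remove that locks the key under its own owner id (assumption). */
    boolean removeExpired(String key) {
        String owner = "remove-expired:" + key;
        if (!tryLock(key, owner)) {
            return false; // a blocking acquire would wait here for the current owner
        }
        unlock(key, owner);
        return true;
    }

    /** Write on the originator: with optimistic locking no key lock is held yet. */
    boolean writeOnOriginator(String key, boolean expired) {
        return !expired || removeExpired(key);
    }

    /** Prepare on the owner: the transaction locks the key, then replays the write. */
    boolean prepareOnOwner(String tx, String key, boolean expired) {
        if (!tryLock(key, tx)) {
            return false;
        }
        try {
            // The nested remove needs a lock that the enclosing transaction still holds.
            return !expired || removeExpired(key);
        } finally {
            unlock(key, tx);
        }
    }

    public static void main(String[] args) {
        KeyLockModel model = new KeyLockModel();
        System.out.println("originator, expired entry:    " + model.writeOnOriginator("k", true));
        System.out.println("owner prepare, live entry:    " + model.prepareOnOwner("tx1", "k", false));
        System.out.println("owner prepare, expired entry: " + model.prepareOnOwner("tx1", "k", true));
    }
}
```

      In this model only the third call returns false: the transaction holds the key lock while the nested remove asks for it under another owner. With a blocking acquire the remove would wait on the transaction while the transaction waits on the remove, which matches the putAll case where the expired entry is first seen during the prepare.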

            People

              wburns@redhat.com Will Burns
              rhn-support-wfink Wolf Fink