Infinispan / ISPN-2316

Distributed deadlock in StateTransferInterceptor


Details

    Description

      When transactions are in use, a distributed deadlock can occur while a node is joining, under these circumstances:

      1) The new node requests transactions using GET_TRANSACTIONS.
      2) The old node tries to commit a transaction and broadcasts a PrepareCommand; in StateTransferInterceptor it acquires the transactionLock in shared mode.
      3) The GET_TRANSACTIONS request arrives on the old node and waits for the transactionLock, which it needs exclusively.
      4) The transaction commit on the new node waits for the commandsLock (which it needs in shared mode), but that lock is held exclusively by onTopologyUpdate -> addTransfer -> requestTransactions (= the synchronous GET_TRANSACTIONS).
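
      The lock cycle above can be reproduced in miniature with two plain read/write locks. The following is only a sketch, not Infinispan code: the Node class, the run method and both timeout parameters are hypothetical, and only the lock names are taken from this report. It assumes the GET_TRANSACTIONS request is served by the old node, and it gives the prepare a shorter timeout than the GET_TRANSACTIONS call so that the cycle is broken the way the traces describe.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the lock cycle; none of these classes are Infinispan code.
final class StateTransferDeadlockSketch {

    /** Stand-in for one node's two locks; only the lock names come from the report. */
    static final class Node {
        final ReentrantReadWriteLock transactionLock = new ReentrantReadWriteLock();
        final ReentrantReadWriteLock commandsLock = new ReentrantReadWriteLock();
    }

    static final class Outcome {
        final boolean prepareTimedOut;
        final boolean getTransactionsCompleted;

        Outcome(boolean prepareTimedOut, boolean getTransactionsCompleted) {
            this.prepareTimedOut = prepareTimedOut;
            this.getTransactionsCompleted = getTransactionsCompleted;
        }
    }

    static Outcome run(long prepareTimeoutMs, long getTransactionsTimeoutMs) {
        Node oldNode = new Node();
        Node newNode = new Node();
        CountDownLatch sharedHeld = new CountDownLatch(1);
        CountDownLatch exclusiveHeld = new CountDownLatch(1);
        AtomicBoolean prepareTimedOut = new AtomicBoolean(false);

        // Steps 2 and 4: the commit holds the old node's transactionLock (shared)
        // while its prepare waits for the new node's commandsLock (shared).
        Thread prepare = new Thread(() -> {
            oldNode.transactionLock.readLock().lock();
            try {
                sharedHeld.countDown();
                exclusiveHeld.await();
                if (newNode.commandsLock.readLock().tryLock(prepareTimeoutMs, TimeUnit.MILLISECONDS)) {
                    newNode.commandsLock.readLock().unlock();
                } else {
                    prepareTimedOut.set(true);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                oldNode.transactionLock.readLock().unlock();
            }
        });
        prepare.start();

        boolean completed = false;
        try {
            // Steps 1 and 3: the topology update holds the new node's commandsLock
            // (exclusive) while the synchronous GET_TRANSACTIONS waits for the old
            // node's transactionLock (exclusive).
            newNode.commandsLock.writeLock().lock();
            try {
                exclusiveHeld.countDown();
                sharedHeld.await();
                if (oldNode.transactionLock.writeLock().tryLock(getTransactionsTimeoutMs, TimeUnit.MILLISECONDS)) {
                    completed = true;
                    oldNode.transactionLock.writeLock().unlock();
                }
            } finally {
                newNode.commandsLock.writeLock().unlock();
            }
            prepare.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new Outcome(prepareTimedOut.get(), completed);
    }

    public static void main(String[] args) {
        Outcome o = run(200, 5000);
        System.out.println("prepare timed out: " + o.prepareTimedOut
                + ", GET_TRANSACTIONS completed: " + o.getTransactionsCompleted);
    }
}
```

      Without the timeouts, both threads would wait on each other indefinitely. In this model neither side makes progress until one of the two waits expires.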

      Also seen in some traces, though not required for the deadlock:
      After the transaction commit times out on the old node and releases the lock, the GET_TRANSACTIONS request may continue, but the state transfer itself can also time out unless its timeout is configured to be long enough.
      The transaction commit continues on the new node after the state transfer times out, until the transaction is found to be invalid (rolled back).

          People

            dberinde@redhat.com Dan Berindei (Inactive)
            rvansa1@redhat.com Radim Vansa (Inactive)