  1. JBoss Enterprise Application Platform
  2. JBEAP-4620

StateTransferManager should be the first component to stop


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 7.0.1.CR1, 7.0.1.GA
    • 7.0.0.GA
    • None
    • None
    • EAP 7.0.1

      When a cache stops, it first removes the component registry from the GlobalComponentsRegistry's namedComponents map, which means the node (let's call it A) will reply with a CacheNotFoundResponse to any remote command.

      Another node B trying to execute a write/transactional command will receive the CacheNotFoundResponse, assume that a new cache topology with id (current topology id + 1) is coming soon, and wait for that new topology before retrying.

      Normally this is not a problem, because StateTransferManagerImpl.stop() sends a CacheTopologyControlCommand(LEAVE) to the coordinator quickly enough; B then receives the topology with id (current topology id + 1) and retries the command.
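
      The wait-then-retry rule on node B can be sketched roughly as below. This is only an illustration of the behaviour described above, not Infinispan code: the class, enum, and method names are hypothetical, and real topology updates arrive asynchronously.

```java
public class TopologyRetrySketch {
    // Hypothetical stand-in for the reply B gets from A.
    enum Response { OK, CACHE_NOT_FOUND }

    /**
     * Topology id that B waits for before retrying.
     * On CACHE_NOT_FOUND, B assumes topology (current + 1) is on its way.
     */
    static int topologyToWaitFor(int currentTopologyId, Response response) {
        return response == Response.CACHE_NOT_FOUND
                ? currentTopologyId + 1
                : currentTopologyId;
    }

    /** B may retry only once the awaited topology (or a later one) is installed. */
    static boolean canRetry(int installedTopologyId, int awaitedTopologyId) {
        return installedTopologyId >= awaitedTopologyId;
    }

    public static void main(String[] args) {
        int current = 5; // arbitrary example value
        int awaited = topologyToWaitFor(current, Response.CACHE_NOT_FOUND);

        // Until A's LEAVE reaches the coordinator, no new topology is installed
        // and B stays blocked.
        System.out.println("awaited=" + awaited
                + " canRetryBeforeLeave=" + canRetry(current, awaited));

        // After the coordinator installs the next topology, B retries.
        System.out.println("canRetryAfterLeave=" + canRetry(current + 1, awaited));
    }
}
```

      In this model B stays blocked for as long as A delays its LEAVE, which is why the time at which StateTransferManagerImpl.stop() runs matters.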

      But in some cases, the cache components that stop before StateTransferManagerImpl can take a long time to do so. In particular, because of ISPN-5507, TransactionTable can block for cacheStopTimeout if there are remote transactions in progress, even though the cache can no longer process remote commands.

      We should give StateTransferManagerImpl.stop() a priority of 0, making it the first component to stop, so that the CacheTopologyControlCommand(LEAVE) is sent as soon as possible.
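
      A minimal sketch of the intended effect, assuming components are stopped in ascending order of a numeric stop priority. The Component class and the priority value given to TransactionTable are hypothetical, chosen only for illustration; this is not the actual component registry code.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class StopOrderSketch {
    /** Hypothetical descriptor: a component name plus its stop priority. */
    static final class Component {
        final String name;
        final int stopPriority;

        Component(String name, int stopPriority) {
            this.name = name;
            this.stopPriority = stopPriority;
        }
    }

    /** Returns component names in the order they would be stopped (lowest priority value first). */
    static List<String> stopOrder(List<Component> components) {
        List<Component> sorted = new ArrayList<>(components);
        sorted.sort(Comparator.comparingInt((Component c) -> c.stopPriority));
        List<String> names = new ArrayList<>();
        for (Component c : sorted) {
            names.add(c.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<Component> components = new ArrayList<>();
        // Assumed priority for TransactionTable; the real value may differ.
        components.add(new Component("TransactionTable", 10));
        // Proposed priority from this issue.
        components.add(new Component("StateTransferManagerImpl", 0));

        System.out.println(stopOrder(components));
    }
}
```

      With priority 0, StateTransferManagerImpl comes first in the stop order, so the LEAVE is sent before any slower component (such as a TransactionTable waiting on cacheStopTimeout) gets a chance to block.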

              dberinde@redhat.com Dan Berindei (Inactive)
              rhn-support-bmaxwell Brad Maxwell