  1. Infinispan
  2. ISPN-1182

Failure after TimeoutException during the restart of HotRod Server


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 5.0.0.CR6
    • 5.0.0.CR4
    • Remote Protocols
    • None

      Sometimes, when restarting 3 or more HotRod nodes in a 25-node cluster, I get a replication timeout exception, after which the node is unusable.
      The timeout comes from replacing the view in HotRodServer.addSelfToTopologyView. If 3 nodes try to replace the same cache entry at the same time, it is no surprise that they run into some kind of deadlock, which is correctly detected and broken by the timeout. Unfortunately, the resulting exception is not handled, so it aborts the HotRodServer start procedure. I suggest catching it in addSelfToTopologyView like this:
      var updated = false
      try {
        updated = topologyCache.replace("view", currentView, newView)
      } catch {
        case e: TimeoutException => logUnableToReplaceView
      }

      With this change the exception no longer escapes the enclosing closure, so the updateTopologyView method gets a chance to try replacing the view again.
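
The suggested handling can be illustrated with a small self-contained sketch. It is not the actual HotRodServer code: the `ViewCache` trait, the `FlakyCache` test double, the `addSelfWithRetry` helper and its `maxAttempts` parameter are all hypothetical, and `java.util.concurrent.TimeoutException` stands in for the replication timeout exception mentioned above. The sketch only shows the idea that a timeout during `replace` is treated as a failed attempt, so the caller can retry instead of aborting startup.

```scala
import java.util.concurrent.TimeoutException

// Hypothetical stand-in for the topology cache; only get/replace are modeled.
trait ViewCache[V] {
  def get(key: String): V
  def replace(key: String, oldValue: V, newValue: V): Boolean
}

object TopologyRetrySketch {
  // Tries to add `self` to the view. A lost race (replace returns false) or a
  // timeout both count as a failed attempt and lead to another try, up to
  // `maxAttempts` (an assumed bound, not taken from the report).
  def addSelfWithRetry(cache: ViewCache[Set[String]], self: String, maxAttempts: Int): Boolean = {
    var attempts = 0
    var updated = false
    while (!updated && attempts < maxAttempts) {
      attempts += 1
      val currentView = cache.get("view")
      val newView = currentView + self
      try {
        updated = cache.replace("view", currentView, newView)
      } catch {
        // Swallow the timeout so it does not escape and stop the startup.
        case e: TimeoutException =>
          println(s"Unable to replace view on attempt $attempts: ${e.getMessage}")
      }
    }
    updated
  }
}

// Test double: throws a timeout on the first `failures` replace calls,
// then behaves like a plain compare-and-replace.
class FlakyCache(initial: Set[String], failures: Int) extends ViewCache[Set[String]] {
  private var view = initial
  private var remaining = failures
  def get(key: String): Set[String] = view
  def replace(key: String, oldValue: Set[String], newValue: Set[String]): Boolean = {
    if (remaining > 0) {
      remaining -= 1
      throw new TimeoutException("replication timeout")
    }
    if (view == oldValue) { view = newValue; true } else false
  }
}
```

For example, with `new FlakyCache(Set("nodeA", "nodeB"), failures = 2)` and `maxAttempts = 5`, the first two attempts time out and the third one adds the node to the view. If the timeouts outlast `maxAttempts`, the helper returns `false` and the caller can decide how to proceed.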

              rh-ee-galder Galder Zamarreño
              gerbszt_jira Jacek Gerbszt (Inactive)
              Votes: 0
              Watchers: 2
