JGroups / JGRP-2266

RouterStubManager.run() endless reconnect loop burning a CPU


    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 4.0.12
    • 4.0.11
    • None

      RouterStubManager.run() loops, trying to reconnect every stub that is not currently connected. When one of these stubs cannot be connected, for whatever reason, the method spins in an endless loop and burns a CPU.

      E.g. sometimes the VPN tunnel is down or one of the TCPGOSSIP hosts is down.

      No idea if it is really required to loop here, but at the very least it should call Thread.yield() or sleep() between attempts. As this run() method is called periodically, an endless loop should not be required here, should it? Maybe only loop e.g. three times and then give up?
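
      The "loop three times, sleep in between, then give up" idea could look roughly like the sketch below. This is not the JGroups code: the class BoundedReconnect, the method reconnect() and the use of BooleanSupplier to stand in for "try to connect this stub" are all made up for illustration. It only shows a bounded retry with a pause, leaving the rest to the next periodic run().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of JGroups: bounded reconnect with a pause between rounds.
final class BoundedReconnect {

    /**
     * Each element of 'pending' is an assumed stand-in for "try to connect one stub";
     * it returns true once the connection succeeded.
     * Returns the attempts that still failed after maxAttempts rounds.
     */
    static List<BooleanSupplier> reconnect(List<BooleanSupplier> pending,
                                           int maxAttempts, long sleepMillis) {
        List<BooleanSupplier> remaining = new ArrayList<>(pending);
        for (int attempt = 1; attempt <= maxAttempts && !remaining.isEmpty(); attempt++) {
            // Drop every stub whose connect attempt succeeded in this round.
            remaining.removeIf(BooleanSupplier::getAsBoolean);
            if (!remaining.isEmpty() && attempt < maxAttempts) {
                try {
                    // Give the CPU back instead of spinning.
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt and stop retrying.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }
        // Whatever is left is retried by the next periodic invocation.
        return remaining;
    }

    public static void main(String[] args) {
        List<BooleanSupplier> stubs = List.of(() -> true, () -> false);
        System.out.println("still disconnected: " + reconnect(stubs, 3, 100L).size());
    }
}
```

      With three attempts and a short sleep, an unreachable host costs a few connect calls per scheduled run instead of a busy loop.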

      As all nodes in the cluster are iMac workstations or special Linux render slaves, burning a CPU is very annoying. The CPU should rather be spent on the Blender render jobs or on the interactive work people are doing on their iMacs. (JGroups is used here to distribute render jobs within the cluster.)

        Attachments: cs_stack.xml (3 kB, Emmeran Seehuber)

              rhn-engineering-bban Bela Ban
              emmeran.seehuber Emmeran Seehuber (Inactive)