JGroups / JGRP-603

FLUSH: problems with TCP and concurrent startup/shutdowns


    • Type: Bug
    • Resolution: Done
    • Priority: Critical
    • 2.6
    • None
    • None

      The attached ZIP file has code that reproduces this.

      To test the different stacks, change pf.cluster.transport.protocol in props.props from "udp" to "tcp" or "tcp-nio".
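
      For illustration, the edited line in props.props would look roughly like this when switching to the TCP stack. The property name and values are taken from the description above; everything else in the file is assumed to stay unchanged.

```
# props.props: transport used by the test (udp, tcp or tcp-nio)
pf.cluster.transport.protocol=tcp
```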

      To reproduce:

      • Start a number of instances (e.g. 5) concurrently. This almost never works, even with "udp": the joiners' JOIN requests time out and they have to retry (possibly because the coordinator is busy with the FLUSH protocol). They become singleton members and never merge!
      • This works fine without FLUSH.
      • With "udp" it works almost always, with "tcp" it works 50% of the time, and with "tcp-nio" it almost never works.
      • Randomly kill and restart instances
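
      The "start concurrently" step above can be sketched as a small harness. This is not the code from the attached ZIP; it is a minimal, stdlib-only sketch in which ConcurrentStartHarness, startConcurrently and countSingletons are hypothetical names. The member body is a stub: in a real run it would presumably create and connect a JGroups channel and return the size of the view it ended up in.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper, not part of JGroups or of the attached reproducer.
final class ConcurrentStartHarness {

    /**
     * Runs n copies of the member body at (nearly) the same instant and
     * returns the value each one reported, e.g. the view size it observed.
     */
    static List<Integer> startConcurrently(int n, Callable<Integer> member) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(n);
        CountDownLatch ready = new CountDownLatch(n); // counts workers that are parked
        CountDownLatch go = new CountDownLatch(1);    // single start signal
        List<Future<Integer>> futures = new ArrayList<>();
        try {
            for (int i = 0; i < n; i++) {
                futures.add(pool.submit(() -> {
                    ready.countDown();
                    go.await();          // wait until every worker is ready
                    return member.call();
                }));
            }
            ready.await();
            go.countDown();              // release all workers together
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    /** Counts members that ended up alone, i.e. reported a view of size 1. */
    static long countSingletons(List<Integer> viewSizes) {
        return viewSizes.stream().filter(size -> size == 1).count();
    }

    public static void main(String[] args) throws Exception {
        // Stub member: pretends every instance joined a 5-member view.
        List<Integer> sizes = startConcurrently(5, () -> 5);
        System.out.println("view sizes: " + sizes + ", singletons: " + countSingletons(sizes));
    }
}
```

      With a real member body, a non-zero singleton count after the join timeout has passed would correspond to the failure described above.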

        1. 105.txt
          162 kB
          Bela Ban
        2. 104.txt
          70 kB
          Bela Ban
        3. 103.txt
          340 kB
          Bela Ban
        4. 3.txt
          60 kB
          Bela Ban
        5. 2.txt
          85 kB
          Bela Ban
        6. 1.txt
          85 kB
          Bela Ban

            vblagoje Vladimir Blagojevic (Inactive)
            rhn-engineering-bban Bela Ban
