JGroups / JGRP-701

Investigate making FD_ALL the default failure detection protocol


    • Type: Task
    • Resolution: Done
    • Priority: Minor
    • 2.7

      FD_ALL would replace FD, so the new default combination would be FD_SOCK and FD_ALL. The advantage of FD_ALL is that it detects multiple concurrent failures much faster than FD.

      For example, if we have members A, B, C, D and E, and B and C fail, with FD.timeout=1000, FD.max_tries=3, FD_ALL.timeout=3000 and FD_ALL.interval=1000, then:

      FD will take timeout * max_tries ms (3 seconds) to detect the death of B and the same again for C, so a total of 6 seconds.
      FD_ALL will take timeout ms to detect both failures at once, so a total of 3 seconds.
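
      The arithmetic above can be checked with a small sketch. It models only the numbers quoted in this issue and is not the JGroups implementation: the class and method names are hypothetical, and it assumes (as the example states) that FD detects each failed member one after another while FD_ALL suspects every silent member after a single timeout.

```java
// Hypothetical helper for illustration only; not part of JGroups.
public final class DetectionTime {

    // FD-style estimate: each failed member costs timeout * max_tries,
    // and failures are detected one after another (assumption from the issue text).
    public static long fdMillis(long timeoutMs, int maxTries, int failedMembers) {
        return timeoutMs * maxTries * failedMembers;
    }

    // FD_ALL-style estimate: every member that has been silent for timeoutMs
    // is suspected together, so the cost does not grow with the failure count.
    public static long fdAllMillis(long timeoutMs, int failedMembers) {
        return failedMembers > 0 ? timeoutMs : 0L;
    }

    public static void main(String[] args) {
        // Values from the example: B and C fail out of A,B,C,D,E.
        System.out.println("FD:     " + fdMillis(1000, 3, 2) + " ms");
        System.out.println("FD_ALL: " + fdAllMillis(3000, 2) + " ms");
    }
}
```

      With the values from the example this gives 6000 ms for FD and 3000 ms for FD_ALL. Under this model the gap widens with every additional concurrent failure, since only the FD estimate scales with the number of failed members.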

      FD_ALL is also much simpler in its implementation, and therefore easier to verify for correctness.

              vblagoje Vladimir Blagojevic (Inactive)
              rhn-engineering-bban Bela Ban