JGRP-659: Merge and UNICAST sequencing problem

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 2.8
    • Affects Version/s: 2.4, 2.5, 2.6
    • None

      The problem is related to the trashing of the connection table in UNICAST during a merge. Consider the following scenario:

      There are 4 nodes in a cluster: A, B, C, and D. After a network split we have two islands, A,B and C,D. When the network heals, a MergeView is eventually installed in both islands. Installing the MergeView trashes the UNICAST connection table [1].

      However, suppose the MergeView is installed in island A,B at time T and in island C,D at time T+N msec, and a node in island A,B sends a unicast message within this N msec window. We then run into problems with unicast sequencing at C and D. Why? Because the next message arriving at C,D from island A,B will carry a sequence number > 1, whereas UNICAST at C,D, having trashed its connection table during the merge, expects the sequence to start at 1. UNICAST at C and/or D therefore waits forever for the missing messages. The final outcome is that no further unicast messages from A and/or B will ever be delivered at C and/or D!
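The race described above can be reproduced with a toy model. The following is only a sketch: the class and method names (UnicastMergeRace, SenderEntry, ReceiverEntry, simulate) are hypothetical and are not JGroups API, and retransmission and acknowledgements are deliberately left out. It also assumes that the message sent inside the window is dropped as a duplicate by the stale receiver entry at C; whatever actually happens to that message, it is consumed against the old entry, which is all the scenario needs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the merge race. All names are hypothetical, not JGroups API. */
public class UnicastMergeRace {

    /** Sender side of one unicast connection: hands out seqnos starting at 1. */
    static final class SenderEntry {
        private long nextSeqno = 1;
        long next() { return nextSeqno++; }
    }

    /** Receiver side: delivers strictly in order and buffers anything past a gap. */
    static final class ReceiverEntry {
        private long expected = 1;
        private final Map<Long, String> buffered = new HashMap<>();
        final List<String> delivered = new ArrayList<>();

        void receive(long seqno, String payload) {
            if (seqno < expected) return;            // assumed: old seqnos are dropped
            buffered.put(seqno, payload);
            while (buffered.containsKey(expected)) { // deliver only a gap-free prefix
                delivered.add(buffered.remove(expected));
                expected++;
            }
        }

        int pending() { return buffered.size(); }
    }

    /** Plays the scenario for the A -> C connection and returns C's entry after the merge. */
    static ReceiverEntry simulate() {
        // Before the merge: A has sent 5 messages to C, so C expects seqno 6.
        SenderEntry oldSenderAtA = new SenderEntry();
        ReceiverEntry oldReceiverAtC = new ReceiverEntry();
        for (int i = 1; i <= 5; i++) oldReceiverAtC.receive(oldSenderAtA.next(), "old" + i);

        // Time T: the MergeView is installed at A, which trashes its connection table.
        SenderEntry senderAtA = new SenderEntry();

        // Inside the N msec window: A sends seqno 1, C still holds its old entry.
        oldReceiverAtC.receive(senderAtA.next(), "m1");

        // Time T+N: the MergeView is installed at C, which trashes its connection table.
        ReceiverEntry receiverAtC = new ReceiverEntry();

        // After the window: A continues with seqnos 2 and 3, but C waits for 1.
        receiverAtC.receive(senderAtA.next(), "m2");
        receiverAtC.receive(senderAtA.next(), "m3");
        return receiverAtC;
    }

    public static void main(String[] args) {
        ReceiverEntry c = simulate();
        System.out.println("delivered=" + c.delivered + " pending=" + c.pending());
    }
}
```

In this model nothing sent by A after the window is ever delivered at C: every later message lands in the buffer behind the missing seqno 1, which matches the "wait forever" behavior reported here.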

      [1] http://jira.jboss.com/jira/browse/JGRP-348

              rhn-engineering-bban Bela Ban
              vblagoje Vladimir Blagojevic (Inactive)