Type: Bug
Resolution: Done
Priority: Major
A merge involves consolidating the digests of all members. However, we don't get every digest directly from the member it describes; we also include 'hearsay'. E.g. with partitions A: {A} and B: {B,C}, B adds the digest information for C. This is what I call 'hearsay': the digest information about C isn't provided by C directly, but by B.
If B hasn't been able to communicate with C for a while, e.g. because B's link to C was asymmetrical, then B's information about C's digest might be incorrect.
E.g. if B returns #3 as C's highest seqno, but C's highest seqno is actually #20, then members other than C will not be able to get messages C#3 - C#20 from C. They would therefore queue all subsequent messages from C!
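The problem can be illustrated with a minimal sketch. It is not JGroups code: the class `HearsayDemo`, the method `naiveConsolidate` and the reduction of a digest to a "member name -> highest seqno" map are assumptions made for illustration only, and the seqno #10 for A is an invented sample value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration, not JGroups code: a digest is reduced to
// "member name -> highest seqno".
class HearsayDemo {

    // Naive consolidation: accept every entry of every reported digest,
    // including entries about third parties (hearsay).
    static Map<String, Long> naiveConsolidate(Map<String, Map<String, Long>> reports) {
        Map<String, Long> merged = new LinkedHashMap<>();
        for (Map<String, Long> digest : reports.values()) {
            for (Map.Entry<String, Long> entry : digest.entrySet()) {
                // Keep the highest seqno seen for each member.
                merged.merge(entry.getKey(), entry.getValue(), Long::max);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Key: the member that sent the digest. Value: the digest it sent.
        Map<String, Map<String, Long>> reports = new LinkedHashMap<>();
        reports.put("A", Map.of("A", 10L));
        // B lost contact with C a while ago, so its entry for C is stale.
        reports.put("B", Map.of("B", 24L, "C", 3L));

        long actualHighestOfC = 20L; // known only to C itself
        long mergedHighestOfC = naiveConsolidate(reports).get("C");
        System.out.println("merged C=" + mergedHighestOfC
                + ", actual C=" + actualHighestOfC);
    }
}
```

Because C never reports for itself here, the stale value supplied by B is the only information about C that ends up in the consolidated digest.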
SOLUTION:
- On merging, partition coordinators cannot just return their digests, but have to fetch the digests for the partition from all members
- The digest consolidation only accepts digests from direct members, e.g. when B returns a digest of (B:24,C:7,D:15) we only use the B part of it
- Simplification of the above: have members only return their lowest, highest received and highest delivered seqnos, but not seqnos they 'know' (hearsay) from other members
- When a merge coordinator gets digest information from {A,B,D} in a merge between {A,B} and {C,D}, then it has to fail the merge because it is missing information from C!
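The rules above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual fix: `DirectDigestMerge` and `consolidate` are hypothetical names, a digest is again simplified to a "member name -> highest seqno" map, and the seqnos for A and C are invented sample values.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch, not JGroups code.
class DirectDigestMerge {

    // expectedMembers: all members of the partitions being merged.
    // reports: key = member that sent the digest, value = the digest it sent.
    static Map<String, Long> consolidate(Set<String> expectedMembers,
                                         Map<String, Map<String, Long>> reports) {
        // Fail the merge if any member did not report directly.
        Set<String> missing = new TreeSet<>(expectedMembers);
        missing.removeAll(reports.keySet());
        if (!missing.isEmpty()) {
            throw new IllegalStateException("merge failed, no digest from " + missing);
        }

        Map<String, Long> merged = new LinkedHashMap<>();
        for (String member : expectedMembers) {
            // Use only what the member says about itself; ignore hearsay.
            Long own = reports.get(member).get(member);
            if (own == null) {
                throw new IllegalStateException("merge failed, " + member
                        + " did not report its own seqno");
            }
            merged.put(member, own);
        }
        return merged;
    }

    public static void main(String[] args) {
        Set<String> members = new TreeSet<>(List.of("A", "B", "C", "D"));
        Map<String, Map<String, Long>> reports = new LinkedHashMap<>();
        reports.put("A", Map.of("A", 10L));
        reports.put("B", Map.of("B", 24L, "C", 7L, "D", 15L)); // only B:24 is used
        reports.put("C", Map.of("C", 20L));
        reports.put("D", Map.of("D", 15L));
        System.out.println(consolidate(members, reports));

        reports.remove("C"); // digests from {A,B,D} only
        try {
            consolidate(members, reports);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Throwing an exception is just one way to signal a failed merge in this sketch; a real implementation might instead abort and retry the merge later.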