JGroups / JGRP-699

NAKACK: merging of digests is incorrect

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version/s: 2.6.3, 2.7
    • Affects Version/s: 2.4, 2.4.1, 2.4.1 SP1, 2.4.1 SP2, 2.4.1 SP3, 2.4.1 SP4, 2.4.2
    • None

      The merge view is:

      2008-02-20 23:49:30,872 DEBUG [org.jgroups.protocols.pbcast.GMS]
      view=MergeView::[172.16.172.233:19382|3] [172.16.172.233:19382, 172.16.172.234:19382],
      subgroups=[[172.16.172.233:19382|1] [172.16.172.233:19382, 172.16.172.234:19382],
      [172.16.172.234:19382|2] [172.16.172.234:19382]],
      digest=172.16.172.233:19382: [557 : 564(564)], 172.16.172.234:19382: [901 : 923 (923)]

      So, the digest for 234:19382 is 901-923, which means 234 should be able to satisfy the retransmit request for message 919 shown below. However, when the retransmit request (xmit req) arrives:

      2008-02-20 23:49:32,004 ERROR [org.jgroups.protocols.pbcast.NAKACK]
      (requester=172.16.172.233:19382, local_addr=172.16.172.234:19382)
      message 172.16.172.234:19382::919 not found in retransmission table of
      172.16.172.233:19382:[557 : 565 (565)]
      172.16.172.234:19382: [921 : 924 (924) (size=3, missing=0, highest stability=921)]

      , we can see that the digest for 234:19382 is now 921-924, so message 919 is no longer covered.

      In 2.4, the merge algorithm for digests (NAKACK.mergeDigest()) takes the max of the low seqnos, MAX(901,921) = 921, so message 919 falls below the range that 234 can retransmit. In 2.5 and higher we don't touch an existing entry, so the digest for 234:19382 would still be 901-923, and we could therefore satisfy the retransmit request.

      SOLUTION: use the solution implemented in 2.5 and higher: don't set digests for existing entries.
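
      The difference between the two merge strategies can be sketched as follows. This is a simplified model, not the JGroups code: the class DigestMergeSketch, the Range type and both method names are hypothetical, and the handling of the high seqnos in the 2.4-style merge (taking the max) is an assumption, since the description above only specifies what happens to the low seqno. The incoming entry [921 : 924 (924)] is likewise made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of merging digests; not the JGroups implementation.
class DigestMergeSketch {

    /** One digest entry per sender, printed as [low : highDelivered (highReceived)]. */
    static final class Range {
        final long low, highDelivered, highReceived;

        Range(long low, long highDelivered, long highReceived) {
            this.low = low;
            this.highDelivered = highDelivered;
            this.highReceived = highReceived;
        }

        /** A retransmit request can only be served if the seqno is still inside the range. */
        boolean covers(long seqno) {
            return seqno >= low && seqno <= highReceived;
        }

        @Override
        public String toString() {
            return "[" + low + " : " + highDelivered + " (" + highReceived + ")]";
        }
    }

    /** 2.4-style merge as described above: an existing entry's low seqno becomes MAX(old, incoming). */
    static Map<String, Range> mergeTakingMaxLow(Map<String, Range> existing, Map<String, Range> incoming) {
        Map<String, Range> result = new HashMap<>(existing);
        for (Map.Entry<String, Range> e : incoming.entrySet()) {
            result.merge(e.getKey(), e.getValue(), (old, in) -> new Range(
                    Math.max(old.low, in.low),                      // the behaviour reported in this issue
                    Math.max(old.highDelivered, in.highDelivered),  // assumption, not stated in the issue
                    Math.max(old.highReceived, in.highReceived)));  // assumption, not stated in the issue
        }
        return result;
    }

    /** 2.5-style merge: entries for senders we already know are left untouched. */
    static Map<String, Range> mergeKeepingExisting(Map<String, Range> existing, Map<String, Range> incoming) {
        Map<String, Range> result = new HashMap<>(existing);
        for (Map.Entry<String, Range> e : incoming.entrySet()) {
            result.putIfAbsent(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String sender = "172.16.172.234:19382";
        Map<String, Range> existing = new HashMap<>();
        existing.put(sender, new Range(901, 923, 923));
        Map<String, Range> incoming = new HashMap<>();
        incoming.put(sender, new Range(921, 924, 924)); // illustrative values only

        Range v24 = mergeTakingMaxLow(existing, incoming).get(sender);
        Range v25 = mergeKeepingExisting(existing, incoming).get(sender);
        System.out.println("2.4-style: " + v24 + " covers 919? " + v24.covers(919));
        System.out.println("2.5-style: " + v25 + " covers 919? " + v25.covers(919));
    }
}
```

      With these inputs the 2.4-style merge moves the low seqno past 919, while the 2.5-style merge leaves the existing entry alone, so 919 stays inside the range.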

              rhn-engineering-bban Bela Ban