JGroups / JGRP-1179

Incoming PingRsp is ignored despite being sent by a Coordinator.

    Details

      Description

      I launch 5 nodes (A, B, C, D, E) in quick succession, nearly simultaneously. All of them use the same protocol stack and communicate over a single channel:

      UDP(mcast_addr=231.8.8.8;mcast_port=45578):PING(num_initial_members=4):MERGE2:FD:VERIFY_SUSPECT:pbcast.NAKACK:pbcast.STABLE:FRAG2:pbcast.GMS(shun=true):pbcast.FLUSH

      Depending on the timing between node launches, I often end up with 2 views, i.e. {D} and {A B C E}.

      A merge occurs later, but by then it is too late for my application, and I don't think I should have to handle a merge except in the case of a real power or network failure.

      I noticed that D was timing out (3000 ms) during the discovery process despite having received the 4 GET_MBRS_RSP responses from the other nodes. D would then decide that there was no coordinator and become coordinator itself.

      What seems to happen is that D sends two GET_MBRS_REQ requests and A replies to both. At the time of the first reply, A is not yet coordinator. By the time D receives the second response, A has become coordinator, but D ignores that response and doesn't add it to its list of responses.
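      The sequence described above can be replayed in isolation. The following is only a minimal sketch: SketchRsp is a hypothetical stand-in for PingRsp, and it assumes that equality is based on the sender alone, which is what the observed behaviour suggests but is not confirmed here. With that assumption, a contains()-based add silently drops the second response from A, so no coordinator is ever recorded.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in for PingRsp (not the JGroups class).
// Assumption: two responses are equal when they come from the same sender,
// regardless of the coordinator flag.
final class SketchRsp {
    final String sender;
    final boolean coord;

    SketchRsp(String sender, boolean coord) {
        this.sender = sender;
        this.coord = coord;
    }

    boolean isCoord() { return coord; }

    @Override public boolean equals(Object o) {
        return o instanceof SketchRsp && sender.equals(((SketchRsp) o).sender);
    }

    @Override public int hashCode() { return Objects.hash(sender); }
}

class DroppedCoordinatorDemo {
    // Same shape as the original logic: add only if not already contained.
    static void addIfAbsent(List<SketchRsp> rsps, SketchRsp rsp) {
        if (rsp != null && !rsps.contains(rsp))
            rsps.add(rsp);
    }

    // True if any stored response claims to come from a coordinator.
    static boolean hasCoordinator(List<SketchRsp> rsps) {
        for (SketchRsp r : rsps)
            if (r.isCoord()) return true;
        return false;
    }

    // Replays the two replies from A: first as a plain member, then as coordinator.
    static List<SketchRsp> replay() {
        List<SketchRsp> rsps = new ArrayList<>();
        addIfAbsent(rsps, new SketchRsp("A", false));
        addIfAbsent(rsps, new SketchRsp("A", true)); // dropped: "already contained"
        return rsps;
    }

    public static void main(String[] args) {
        List<SketchRsp> rsps = replay();
        System.out.println("responses=" + rsps.size()
                + " coordinatorSeen=" + hasCoordinator(rsps));
        // prints "responses=1 coordinatorSeen=false"
    }
}
```

      In this sketch the list ends up holding only the stale, non-coordinator reply from A, which matches D concluding that no coordinator exists.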

      I have written a workaround in the addResponse method of Discovery.Responses. It seems to work for my case, but I am afraid it might break something else I am not aware of.

      public void addResponse(PingRsp rsp) {
          if(rsp == null)
              return;
          promise.getLock().lock();
          try {
              // Workaround 29/03/2010
              int index = ping_rsps.indexOf(rsp);

              // equivalent to does not contain.
              if(index == -1) {
                  ping_rsps.add(rsp);
                  promise.getCond().signalAll();
              }
              else if(rsp.isCoord()) {
                  PingRsp pr = ping_rsps.get(index);

                  // Check if the already existing element is not server
                  if(!pr.isCoord()) {
                      ping_rsps.set(index, rsp);
                      promise.getCond().signalAll();
                  }
              }

              /* Old JGroups code:
              if(!ping_rsps.contains(rsp)) {
                  ping_rsps.add(rsp);
                  promise.getCond().signalAll();
              }
              */
          }
          finally {
              promise.getLock().unlock();
          }
      }
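      To check the replacement rule on its own, the workaround can be reduced to a standalone sketch. DemoRsp and addOrUpgrade are hypothetical names, not JGroups API, and the sketch again assumes that responses compare equal by sender only. The locking is left out, and the boolean return value merely stands in for the points where the real method would call signalAll().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for PingRsp; equality by sender only (assumption).
final class DemoRsp {
    final String sender;
    final boolean coord;

    DemoRsp(String sender, boolean coord) {
        this.sender = sender;
        this.coord = coord;
    }

    boolean isCoord() { return coord; }

    @Override public boolean equals(Object o) {
        return o instanceof DemoRsp && sender.equals(((DemoRsp) o).sender);
    }

    @Override public int hashCode() { return sender.hashCode(); }
}

class UpgradeOnCoordDemo {
    // Returns true when the list changed, i.e. where waiters would be signalled.
    static boolean addOrUpgrade(List<DemoRsp> rsps, DemoRsp rsp) {
        if (rsp == null)
            return false;
        int index = rsps.indexOf(rsp);
        if (index == -1) {               // first response from this sender
            rsps.add(rsp);
            return true;
        }
        if (rsp.isCoord() && !rsps.get(index).isCoord()) {
            rsps.set(index, rsp);        // sender has since become coordinator
            return true;
        }
        return false;                    // duplicate or stale response: ignore
    }

    public static void main(String[] args) {
        List<DemoRsp> rsps = new ArrayList<>();
        addOrUpgrade(rsps, new DemoRsp("A", false));
        addOrUpgrade(rsps, new DemoRsp("A", true));
        System.out.println("responses=" + rsps.size()
                + " coordinatorSeen=" + rsps.get(0).isCoord());
        // prints "responses=1 coordinatorSeen=true"
    }
}
```

      Note that under this rule a later non-coordinator reply never overwrites a stored coordinator reply, so a response can only be upgraded, never downgraded. Whether that is the desired behaviour for every discovery scenario is an open question.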

      Regards
      Renaud

              People

              • Assignee: Bela Ban (belaban)
              • Reporter: Renaud Devarieux (rddx)