JGroups / JGRP-217

unplugging of network cable makes jgroups behave weirdly

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • Fix Version: 2.4
    • Affects Versions: 2.2.8, 2.2.9, 2.3
    • None
    • Workaround Exists
      The only workaround currently available is to use UDP rather than TCP.

      Hi,
      I am using JGroups 2.2.9 for communication in my cluster.
      I was checking how JGroups behaves under network fluctuation.
      To do that, I manually unplug the network cable and plug it back in after some time.
      But JGroups keeps sending suspect messages and never brings the unplugged machine back into the cluster.

      I have two machines in my cluster: A (10.200.1.92) and B (10.200.1.74).
      I unplugged the cable on machine B and plugged it back in, but B never rejoined the cluster.
      Please find attached the log files from both machines.

      I am using TCP-based communication. I set up my protocol stack using MPING, MERGE2, FD, VERIFY_SUSPECT, pbcast.NAKACK, pbcast.STABLE, VIEW_SYNC and pbcast.GMS.

      The following is the configuration on each machine.

      Configuration settings on Machine A
      notificationbus.properties:
      TCP(bind_addr=boca;recv_buf_size=200000;send_buf_size=100000;loopback=true;
      start_port=7880):
      MPING(timeout=4000;bind_to_all_interfaces=true;mcast_addr=228.8.8.8;mcast_port=7500;
      ip_ttl=8;num_initial_members=4;num_ping_requests=1):
      MERGE2(max_interval=10000;min_interval=5000):
      FD(timeout=2000;max_tries=3):
      VERIFY_SUSPECT(timeout=1500;down_thread=false;up_thread=false):
      pbcast.NAKACK(max_xmit_size=60000;gc_lag=50;
      retransmit_timeout=100,200,300,600,1200,2400,4800):
      pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;down_thread=false;
      up_thread=false;max_bytes=0):
      VIEW_SYNC(avg_send_interval=60000;down_thread=false;up_thread=false):
      pbcast.GMS(print_local_addr=true;join_timeout=5000;join_retry_timeout=2000;shun=true)
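
      For orientation, the FD and VERIFY_SUSPECT values above suggest roughly how long a silent member takes to be reported as suspected. The sketch below is not JGroups code. It assumes that FD suspects a member once max_tries consecutive heartbeats have each gone unanswered for timeout ms, and that VERIFY_SUSPECT then waits its own timeout before confirming. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SuspectTiming {
    // Parses "k1=v1;k2=v2" (the text between one protocol's parentheses) into a map.
    static Map<String, String> parseParams(String params) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String pair : params.split(";")) {
            int eq = pair.indexOf('=');
            if (eq > 0) {
                out.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return out;
    }

    // Rough estimate in ms (an assumption, not measured behaviour):
    // FD timeout * FD max_tries, plus the VERIFY_SUSPECT timeout.
    static long estimateSuspectMillis(String fdParams, String verifyParams) {
        Map<String, String> fd = parseParams(fdParams);
        Map<String, String> verify = parseParams(verifyParams);
        long fdTimeout = Long.parseLong(fd.get("timeout"));
        long maxTries = Long.parseLong(fd.get("max_tries"));
        long verifyTimeout = Long.parseLong(verify.get("timeout"));
        return fdTimeout * maxTries + verifyTimeout;
    }

    public static void main(String[] args) {
        // Parameter strings copied from the FD and VERIFY_SUSPECT entries in the report.
        long ms = estimateSuspectMillis(
                "timeout=2000;max_tries=3",
                "timeout=1500;down_thread=false;up_thread=false");
        System.out.println("Estimated time to confirmed suspicion: " + ms + " ms");
    }
}
```

      Under these assumptions the estimate is 2000 * 3 + 1500 = 7500 ms, so an unplugged cable that stays out for more than a few seconds would be expected to trigger suspicion of B.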

      -------------------------------

      Configuration settings on Machine B
      notificationbus.properties:
      TCP(bind_addr=rizal;recv_buf_size=200000;send_buf_size=100000;
      loopback=true;start_port=7880):
      MPING(timeout=4000;bind_to_all_interfaces=true;mcast_addr=228.8.8.8;
      mcast_port=7500;ip_ttl=8;num_initial_members=4;num_ping_requests=1):
      MERGE2(max_interval=10000;min_interval=5000):FD(timeout=2000;max_tries=3):
      VERIFY_SUSPECT(timeout=1500;down_thread=false;up_thread=false):
      pbcast.NAKACK(max_xmit_size=60000;gc_lag=50;
      retransmit_timeout=100,200,300,600,1200,2400,4800):
      pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;down_thread=false;
      up_thread=false;max_bytes=0):
      VIEW_SYNC(avg_send_interval=60000;down_thread=false;up_thread=false):
      pbcast.GMS(print_local_addr=true;join_timeout=5000;join_retry_timeout=2000;shun=true)
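
      When two members fail to merge, one quick check is whether their stacks really match. The following is a minimal sketch, not part of JGroups, that splits each property string into protocols and parameters and lists every difference. The class, method and constant names are hypothetical, and the strings are the two configurations from this report with the line wraps removed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

final class StackDiff {
    // Everything after the TCP entry, as given for both machines in the report.
    static final String TAIL =
        ":MPING(timeout=4000;bind_to_all_interfaces=true;mcast_addr=228.8.8.8;mcast_port=7500;"
        + "ip_ttl=8;num_initial_members=4;num_ping_requests=1)"
        + ":MERGE2(max_interval=10000;min_interval=5000):FD(timeout=2000;max_tries=3)"
        + ":VERIFY_SUSPECT(timeout=1500;down_thread=false;up_thread=false)"
        + ":pbcast.NAKACK(max_xmit_size=60000;gc_lag=50;"
        + "retransmit_timeout=100,200,300,600,1200,2400,4800)"
        + ":pbcast.STABLE(stability_delay=1000;desired_avg_gossip=20000;down_thread=false;"
        + "up_thread=false;max_bytes=0)"
        + ":VIEW_SYNC(avg_send_interval=60000;down_thread=false;up_thread=false)"
        + ":pbcast.GMS(print_local_addr=true;join_timeout=5000;join_retry_timeout=2000;shun=true)";
    static final String TCP_REST = ";recv_buf_size=200000;send_buf_size=100000;loopback=true;start_port=7880)";
    static final String STACK_A = "TCP(bind_addr=boca" + TCP_REST + TAIL;
    static final String STACK_B = "TCP(bind_addr=rizal" + TCP_REST + TAIL;

    // Splits a stack string into "NAME(params)" chunks at ':' separators outside parentheses.
    static List<String> splitProtocols(String stack) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        int depth = 0;
        for (char c : stack.toCharArray()) {
            if (c == '(') depth++;
            if (c == ')') depth--;
            if (c == ':' && depth == 0) {
                out.add(cur.toString().trim());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        String last = cur.toString().trim();
        if (!last.isEmpty()) out.add(last);
        return out;
    }

    // Flattens a stack into "PROTOCOL" -> "present" and "PROTOCOL.param" -> value entries.
    static Map<String, String> flatten(String stack) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String proto : splitProtocols(stack)) {
            int open = proto.indexOf('(');
            String name = open < 0 ? proto : proto.substring(0, open);
            out.put(name, "present");
            if (open < 0) continue;
            int close = proto.lastIndexOf(')');
            if (close < open) close = proto.length();
            for (String pair : proto.substring(open + 1, close).split(";")) {
                int eq = pair.indexOf('=');
                if (eq > 0) {
                    out.put(name + "." + pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
                }
            }
        }
        return out;
    }

    // Returns one line per key whose value differs (or is missing) between the two stacks.
    static List<String> diff(String a, String b) {
        Map<String, String> fa = flatten(a);
        Map<String, String> fb = flatten(b);
        Set<String> keys = new LinkedHashSet<>(fa.keySet());
        keys.addAll(fb.keySet());
        List<String> out = new ArrayList<>();
        for (String k : keys) {
            if (!Objects.equals(fa.get(k), fb.get(k))) {
                out.add(k + ": A=" + fa.get(k) + ", B=" + fb.get(k));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : diff(STACK_A, STACK_B)) {
            System.out.println(line);
        }
    }
}
```

      For the two configurations shown here, the only line reported is the TCP bind_addr (boca on A, rizal on B), so a configuration mismatch between the machines does not appear to explain the failed rejoin.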
      ----------------

      Attachments:
        1. mping.xml (1 kB, Bela Ban)
        2. pull-tcp.xml (2 kB, Bela Ban)
        3. pulltheplug.xml (2 kB, Bela Ban)

      Assignee: Bela Ban (rhn-engineering-bban)
      Reporter: Manoranjan Sahu (msahu_jira, Inactive)