  1. JGroups
  2. JGRP-2461

Clustering can fail when re-adding an existing node using TCP_NIO2

Details

    • Bug
    • Resolution: Won't Do
    • Major
    • None
    • 4.1.8
    • None

      This is for an IPv4 network (although I think the behavior is identical on IPv6; my only open question is how IpAddress ordering works on IPv6).

      1. You need at least two nodes (the cluster I found this on uses 3). Make sure that the lowest IP address node is the coordinator.
      2. Use the standard TCP_NIO2 protocol stack for the cluster. Our environment has the following characteristics that might be important:
        1. Port range is 0 for TCP_NIO2 and TCPPING
        2. TCP_NIO2 uses the external IP address as the bind address
        3. "use_ip_addrs" is true
        4. The highest IP address node is not included in the initial hosts for TCPPING. (This does not appear to be strictly necessary, but because of the address caching behavior it makes the failure conditions much more likely.)
      3. After the cluster is established, have the highest IP address node continually leave and rejoin the cluster. My code creates a new JChannel object for each rejoin (closing the old one in order to leave the cluster), but I am pretty sure the problem will also reproduce if you reuse the JChannel.

      Eventually you will get an attempted rejoin where the highest IP address node sees itself as the only node in the cluster. The coordinator will send requests to it to merge, but they will never be received.
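
      Because the setup depends on which node has the lowest and which has the highest IP address, it can help to sort the node addresses numerically rather than as strings (as text, "10.0.0.10" sorts before "10.0.0.9"). The sketch below is a hypothetical helper, not part of JGroups; the class name Ipv4Order and the sample addresses are made up for illustration, and it assumes plain numeric IPv4 ordering, which may not be exactly how JGroups compares IpAddress values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper (not part of JGroups) for ordering test nodes by IPv4 address. */
final class Ipv4Order {
    /** Packs a dotted-quad literal into an unsigned 32-bit value held in a long. */
    static long toLong(String dottedQuad) {
        String[] parts = dottedQuad.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not a dotted quad: " + dottedQuad);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** Returns a copy of the addresses sorted from numerically lowest to highest. */
    static List<String> sorted(List<String> addresses) {
        List<String> copy = new ArrayList<>(addresses);
        copy.sort(Comparator.comparingLong(Ipv4Order::toLong));
        return copy;
    }

    public static void main(String[] args) {
        // Sample addresses are placeholders, not taken from the report.
        List<String> ordered = sorted(Arrays.asList("10.0.0.10", "10.0.0.9", "10.0.0.2"));
        System.out.println("coordinator candidate (lowest): " + ordered.get(0));
        System.out.println("rejoining node (highest): " + ordered.get(ordered.size() - 1));
    }
}
```

      With the sample list, the lowest address would be started first so that it becomes the coordinator, and the highest address would be the node that repeatedly leaves and rejoins.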

    Description

      When a node leaves a cluster and later attempts to rejoin, a race condition can prevent it from joining. Here is the sequence of events that seems to allow this:

      1. The rejoining node must have a "higher" IP address than the current cluster coordinator.
      2. On the rejoin attempt, the coordinator sends a message to the rejoining node, addressed to its prior address, before the rejoining node sends anything to the coordinator. I have seen this happen for two reasons:
        1. UNICAST3 is resending messages (which often happens with the final LEAVE_RSP from the prior cluster membership because it apparently does not get acked before the connection closes)
        2. TCPPING is sending a ping request to the cached prior address.
      3. The connection gets established. It will then be used by the rejoining node whenever communicating with the cluster coordinator.
      4. However, the cluster coordinator has this as the connection for the prior address. So the following happens whenever it wants to send a message to the rejoining node:
        1. It will attempt to create a new connection.
        2. The rejoining node will reject it as a redundant connection. Its current connection takes precedence because the new connection comes from the same logical address as the "bad" connection.

      Since the messages needed to find and join the cluster, or to merge the two clusters, are all unicast messages, the rejoining node will never receive them and will not be able to join until something causes the initial connection to be closed.
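
      The sequence above can be illustrated with a toy model of the two connection tables. This is only a sketch of the reasoning in this report, not JGroups code: the class RejoinRaceModel, the address labels "coord", "C-old" and "C-new", and the rule that the existing connection always wins are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the race described in this report; it does not use JGroups. */
final class RejoinRaceModel {
    /** One endpoint's connection table: peer logical address -> connection id. */
    static final class Endpoint {
        final Map<String, Integer> connections = new HashMap<>();
    }

    /**
     * @param coordinatorSendsFirst true if the coordinator contacts the node's prior
     *        address before the rejoining node contacts the coordinator
     * @return true if a later unicast from the coordinator to the node's new
     *         address would be delivered in this model
     */
    static boolean coordinatorCanReachRejoiner(boolean coordinatorSendsFirst) {
        Endpoint coordinator = new Endpoint();
        Endpoint rejoiner = new Endpoint();
        int connId = 1;

        if (coordinatorSendsFirst) {
            // A retransmission or ping to the cached prior address opens the
            // connection, so the coordinator files it under the old address.
            coordinator.connections.put("C-old", connId);
        } else {
            // The rejoining node connects first and identifies itself with its
            // new address, so the coordinator files it under the new address.
            coordinator.connections.put("C-new", connId);
        }
        // Either way, the rejoining node now holds one connection to the coordinator.
        rejoiner.connections.put("coord", connId);

        // The coordinator now wants to send a unicast (join or merge traffic) to C-new.
        if (coordinator.connections.containsKey("C-new")) {
            return true; // the existing connection is reused
        }
        // No entry for C-new: the coordinator tries to open a second connection.
        // Assumption: the rejoining node treats it as redundant and keeps the old one.
        boolean rejectedAsRedundant = rejoiner.connections.containsKey("coord");
        return !rejectedAsRedundant;
    }

    public static void main(String[] args) {
        System.out.println("coordinator first -> reachable: "
                + coordinatorCanReachRejoiner(true));
        System.out.println("rejoiner first -> reachable: "
                + coordinatorCanReachRejoiner(false));
    }
}
```

      In this model the coordinator can only reach the rejoining node when the rejoining node opened the connection first. When the coordinator's message to the prior address wins the race, every later attempt is rejected until the original connection goes away.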

          People

            rhn-engineering-bban Bela Ban
            bjetal2003 Robert Mitchell (Inactive)