  1. JGroups
  2. JGRP-2334

TCPPing: resolving hosts with InetAddress.getAllByName() to get all addresses

Details

    • Feature Request
    • Resolution: Done
    • Major
    • 4.1.0
    • 4.0.19
    • None

    Description

      We would like to discuss a change to the TCPPing module.

      We plan to run WildFly instances in a container orchestration system. In the system we want to use, node discovery over multicast does not work.
      The alternative is to use TCPPing with initial_hosts set, but that raises the following problems:

      • the initial_hosts property is a static list of hosts
      • the IP addresses can change when a container is restarted
      • the host names are generated dynamically

      At this point it seems that node discovery cannot be done with TCPPing, at least not easily.
      The main problem: how do we find all running nodes of a server group?

      We then investigated our orchestration system and found a way to solve the problem. Our orchestration system (and we think others offer this too) has an internal DNS service.
      Through this service, all containers behind a DNS name can be resolved with an nslookup request.
      Example:
      We have a scalable WildFly service named "wildfly-server". When a container of this service is started, it gets a host name like "wildfly-server-0" and a dynamic IP address.
      After starting one or more containers, we can run nslookup with the service name:
      >nslookup wildfly-server
      Name: wildfly-server
      Address 1: 10.42.2.139 wildfly-server-1.wildfly-server
      Address 2: 10.42.1.198 wildfly-server-0.wildfly-server
      Address 3: 10.42.0.161 wildfly-server-2.wildfly-server

      The service name has multiple A records registered. When an instance is started or stopped, the DNS records are updated. We then tried to use this service name for the initial_hosts property:
      initial_hosts=wildfly-server[7600]

      Sometimes it worked and sometimes it didn't. The reason was that only the first InetAddress entry was used in the org.jgroups.util.Util class (method parseCommaDelimitedHosts). After we changed it slightly (see https://github.com/Sternwald-Systems/JGroups/commit/db0b899f9c67348a0cb073783aad34c2ab3bfb40 ) it worked as expected. We now call InetAddress.getAllByName(host) and loop over the result array instead of using only the first array element.
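
      The idea can be sketched as follows. This is a minimal illustration, not the code from the linked commit or from org.jgroups.util.Util: the class HostListSketch and the method parseHosts are hypothetical names, and skipping unresolvable hosts is an assumption made for this example. It expands each "host[port]" entry into one socket address per address returned by InetAddress.getAllByName().

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class HostListSketch {

    // Hypothetical helper (not the JGroups implementation): expands a
    // comma-delimited list such as "wildfly-server[7600]" into one socket
    // address per IP address that the host name resolves to.
    public static List<InetSocketAddress> parseHosts(String spec) {
        List<InetSocketAddress> result = new ArrayList<>();
        for (String entry : spec.split(",")) {
            entry = entry.trim();
            if (entry.isEmpty())
                continue;
            int open = entry.lastIndexOf('[');
            int close = entry.lastIndexOf(']');
            if (open < 0 || close < open)
                throw new IllegalArgumentException("expected host[port], got: " + entry);
            String host = entry.substring(0, open).trim();
            int port = Integer.parseInt(entry.substring(open + 1, close).trim());
            try {
                // getAllByName() returns every address registered for the name,
                // whereas getByName() would return only one of them.
                for (InetAddress addr : InetAddress.getAllByName(host)) {
                    InetSocketAddress sa = new InetSocketAddress(addr, port);
                    if (!result.contains(sa)) // drop duplicates
                        result.add(sa);
                }
            } catch (UnknownHostException e) {
                // Assumption for this sketch: a host that cannot be resolved
                // right now is skipped instead of failing the whole list.
                System.err.println("cannot resolve " + host + ", skipping");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String spec = args.length > 0 ? args[0] : "127.0.0.1[7600]";
        for (InetSocketAddress sa : parseHosts(spec))
            System.out.println(sa);
    }
}
```

      Run with "wildfly-server[7600]" as the argument inside the orchestration network, this would print one line per registered A record, each with port 7600.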

      There is one limitation when domain mode is used with more than one server group: all servers of a server group must be configured with the same port offset, because the single port given in initial_hosts is applied to every resolved address.

      Conclusion
      Various orchestration systems are available on the market. In the worst case, a custom JGroups discovery service has to be written for each of them.
      For Kubernetes, for instance, such a service already exists (jgroups-kubernetes).
      But if an orchestration system already has an internal DNS service that resolves a DNS name to all running containers, TCPPing (with our changes) could be used out of the box.

      Additionally, there is a second method in the org.jgroups.util.Util class, parseCommaDelimitedHosts2, which does nearly the same thing for the TCPGossip protocol.
      We think it would make sense to change this method too; otherwise the two protocols would behave differently. If you don't mind, we would apply the changes to this method as well before creating a pull request.

      It is also important to document this well so that other people with the same problem can find this information.

          People

            rhn-engineering-bban Bela Ban
            fthiel_jira Falco Thiel (Inactive)