  1. JGroups
  2. JGRP-680

receive_on_all_interfaces requires every NIC to be configured

    • Type: Bug
    • Resolution: Done
    • Priority: Major
    • 2.7
    • 2.6.1
    • None
    • Windows Server 2003

    • Workaround Exists

      Outside the program that uses JGroups, make sure that every NIC is either fully working (IP address configured and network cable plugged in) or disabled.

      In UDP with receive_on_all_interfaces="true", if any interface is not configured (say, the network cable is unplugged), JGroups refuses to start and throws an exception (IP addresses changed to protect the innocent):

      Caused by: org.jgroups.ChannelException: failed to start protocol stack
      at org.jgroups.JChannel.startStack(JChannel.java:1445)
      at org.jgroups.JChannel.connect(JChannel.java:356)
      at org.jgroups.blocks.NotificationBus.start(NotificationBus.java:126)
      at the application making use of JGroups
      Caused by: java.lang.Exception: problem creating sockets (bind_addr=xxxxxxxx/11.22.33.44, mcast_addr=224.1.2.3:4444)
      at org.jgroups.protocols.UDP.start(UDP.java:363)
      at org.jgroups.stack.Configurator.startProtocolStack(Configurator.java:75)
      at org.jgroups.stack.ProtocolStack.startStack(ProtocolStack.java:301)
      at org.jgroups.JChannel.startStack(JChannel.java:1442)
      ... 9 more
      Caused by: java.net.SocketException: bad argument for IP_MULTICAST_IF2: No IP addresses bound to interface
      at java.net.PlainDatagramSocketImpl.join(Ljava.net.InetAddress;Ljava.net.NetworkInterface;)V(Native Method)
      at java.net.PlainDatagramSocketImpl.joinGroup(PlainDatagramSocketImpl.java:196)
      at java.net.MulticastSocket.joinGroup(MulticastSocket.java:357)
      at org.jgroups.protocols.UDP.bindToInterfaces(UDP.java:525)
      at org.jgroups.protocols.UDP.createSockets(UDP.java:470)
      at org.jgroups.protocols.UDP.start(UDP.java:359)
      ... 12 more

      Simply taking every NIC whose status is "network cable unplugged" and disabling that interface allows JGroups to start.
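
To find candidate NICs for this workaround, something like the following could be run on the affected machine. It is a minimal sketch, not part of JGroups: the class and method names are hypothetical, and it assumes that an unplugged NIC shows up in the JVM as an interface with no IP address bound, which is what the exception message suggests but may vary by OS and JVM.

```java
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper, not part of JGroups.
final class UnconfiguredNics {

    /** Returns the names of the given interfaces that have no IP address bound. */
    public static List<String> namesWithoutAddresses(List<NetworkInterface> nics) {
        List<String> result = new ArrayList<>();
        for (NetworkInterface nic : nics) {
            // An interface with an empty address enumeration is assumed to be
            // the kind that makes the multicast join fail.
            if (!nic.getInetAddresses().hasMoreElements()) {
                result.add(nic.getName());
            }
        }
        return result;
    }

    /** Returns every interface the JVM reports, or an empty list if none can be read. */
    public static List<NetworkInterface> allInterfaces() {
        try {
            Enumeration<NetworkInterface> e = NetworkInterface.getNetworkInterfaces();
            if (e == null) {
                return Collections.<NetworkInterface>emptyList();
            }
            return Collections.list(e);
        } catch (SocketException ex) {
            return Collections.<NetworkInterface>emptyList();
        }
    }

    public static void main(String[] args) {
        for (String name : namesWithoutAddresses(allInterfaces())) {
            System.out.println("No IP address bound: " + name);
        }
    }
}
```

Any interface it prints would be one to configure or disable before starting the application.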

      This may be a feature request and not a bug report, but this seems unnecessarily strict. With the setting of receive_on_all_interfaces="true", if at least one interface comes up, then that should be enough for JGroups to function.

      In the application where I encountered this, I have both receive_on_all_interfaces="true" and send_on_all_interfaces="false". The reason is that if send_on_all_interfaces is true and JGroups cannot send a message on any interface at all, there is no notification of the failure. By sending on one interface but receiving on all, I appear to get the best of all worlds: a cluster of machines with multiple NICs should be able to communicate with one another no matter what the binding order of the NICs and no matter what order Java presents the interfaces.

      However, receive_on_all_interfaces="true" requires EVERY NIC that is not disabled to be perfectly functioning, or you cannot do anything at all. Unplug one NIC cable and the application cannot function, even though there is another network on which all the clustered machines are available.

      Preferred behavior: If AT LEAST ONE interface successfully opens a socket when receive_on_all_interfaces="true", then succeed. If ALL interfaces fail as shown above, then throw the exception shown above.
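
The preferred behavior could look roughly like the sketch below. This is not the JGroups code and not necessarily how the fix was implemented; TolerantJoin, Joiner, joinWhereverPossible and joinAll are hypothetical names used only for illustration. The only real API assumed is java.net.MulticastSocket.joinGroup(SocketAddress, NetworkInterface).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.NetworkInterface;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of JGroups.
final class TolerantJoin {

    /** Hypothetical callback: joins the group on one interface, or throws. */
    public interface Joiner<T> {
        void join(T nic) throws IOException;
    }

    /**
     * Tries every interface and returns the ones that joined.
     * Throws only if not a single interface could join.
     */
    public static <T> List<T> joinWhereverPossible(List<T> nics, Joiner<T> joiner)
            throws IOException {
        List<T> joined = new ArrayList<>();
        IOException lastFailure = null;
        for (T nic : nics) {
            try {
                joiner.join(nic);
                joined.add(nic);
            } catch (IOException e) {
                // e.g. "No IP addresses bound to interface": skip this NIC
                lastFailure = e;
            }
        }
        if (joined.isEmpty()) {
            throw new IOException(
                "could not join the multicast group on any interface", lastFailure);
        }
        return joined;
    }

    /** Same logic applied to a real multicast socket. */
    public static List<NetworkInterface> joinAll(MulticastSocket sock,
                                                 InetSocketAddress group,
                                                 List<NetworkInterface> nics)
            throws IOException {
        return joinWhereverPossible(nics, nic -> sock.joinGroup(group, nic));
    }
}
```

A real implementation would probably also log each interface that was skipped, so that a partially degraded machine does not go unnoticed.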

              Assignee: vblagoje Vladimir Blagojevic (Inactive)
              Reporter: ekuns_jira Edward Kuns (Inactive)