If the directory contains an empty or corrupt file (for example, because one of the nodes crashed while writing it), you get the following exception:
java.lang.NullPointerException
at org.jgroups.protocols.FILE_PING.handleView(FILE_PING.java:146)
at org.jgroups.protocols.FILE_PING.down(FILE_PING.java:116)
at org.jgroups.protocols.MERGE2.down(MERGE2.java:155)
at org.jgroups.protocols.FD_SOCK.down(FD_SOCK.java:332)
at org.jgroups.protocols.FD.down(FD.java:276)
at org.jgroups.protocols.VERIFY_SUSPECT.down(VERIFY_SUSPECT.java:69)
at org.jgroups.protocols.BARRIER.down(BARRIER.java:91)
at org.jgroups.protocols.pbcast.NAKACK.down(NAKACK.java:639)
at org.jgroups.protocols.UNICAST.down(UNICAST.java:444)
at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:297)
at org.jgroups.protocols.pbcast.GMS.installView(GMS.java:596)
at org.jgroups.protocols.pbcast.GMS.installView(GMS.java:516)
at org.jgroups.protocols.pbcast.ClientGmsImpl.becomeSingletonMember(ClientGmsImpl.java:344)
at org.jgroups.protocols.pbcast.ClientGmsImpl.joinInternal(ClientGmsImpl.java:93)
at org.jgroups.protocols.pbcast.ClientGmsImpl.join(ClientGmsImpl.java:38)
at org.jgroups.protocols.pbcast.GMS.down(GMS.java:922)
at org.jgroups.protocols.FC.down(FC.java:431)
at org.jgroups.protocols.FRAG2.down(FRAG2.java:154)
at org.jgroups.stack.ProtocolStack.down(ProtocolStack.java:894)
at org.jgroups.JChannel.downcall(JChannel.java:1649)
at org.jgroups.JChannel.connect(JChannel.java:420)
... 136 more
This occurs on EVERY node, and afterwards all communication stops. I could not even find any JGroups threads after that.
Also, new nodes cannot connect after that: JChannel.connect() fails for the same reason.
The problem was reproduced today in our production system.
Workaround:
I would propose the following two fixes:
1) When reading the files, do not add null, empty, or bad entries.
2) [For better reliability] Wrap the whole of FILE_PING.handleView() in try/catch (maybe do the same for every Discovery protocol?). Even if Discovery fails, the other parts should NOT fail.
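
To illustrate the two fixes, here is a minimal standalone sketch. It is not the actual FILE_PING code: the class SafeDiscoveryRead, the Entry type, and the file format (a single writeUTF string per file) are assumptions made up for the example. It only shows the idea of skipping unreadable files and keeping any discovery failure from propagating to the caller.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of JGroups; names and file format are assumptions.
class SafeDiscoveryRead {

    /** Stand-in for whatever a node writes into its discovery file. */
    static final class Entry {
        final String address;
        Entry(String address) { this.address = address; }
    }

    /** Reads one file; returns null (instead of throwing) if it is empty or corrupt. */
    static Entry readEntry(Path file) {
        try {
            if (Files.size(file) == 0) {
                return null; // empty file, e.g. a node crashed before writing anything
            }
            try (DataInputStream in = new DataInputStream(Files.newInputStream(file))) {
                String address = in.readUTF(); // throws IOException on truncated or malformed data
                return address.isEmpty() ? null : new Entry(address);
            }
        } catch (IOException | RuntimeException e) {
            return null; // unreadable file: treat it as "no entry"
        }
    }

    /** Fix 1: only non-null entries are added. Fix 2: no exception escapes to the caller. */
    static List<Entry> readAll(Path dir) {
        List<Entry> entries = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                Entry entry = readEntry(file);
                if (entry != null) {
                    entries.add(entry);
                }
            }
        } catch (IOException | RuntimeException e) {
            // Discovery failed; return what was read so far rather than
            // letting the failure take down the caller.
        }
        return entries;
    }
}
```

With something along these lines, a directory holding one good file, one empty file, and one truncated file yields a single entry, and a missing directory yields an empty list instead of an exception.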