Details
Bug | Resolution: Done | Major | 3.2.6 | None
Description
After a UNICAST2.STABLE message is received, sent_msgs.purge() is called for the sender's entry, which leaves the Table with low == hd == hr. SenderEntry.getFirstMessage() then returns null, and that causes the NPE below.
There is a second race condition that can trigger the same failure: when a message is sent down the stack for a new destination, a SenderEntry is inserted into send_table. This entry contains no messages yet, so when the RetransmitTask retrieves the first message in triggerXmit, it gets null and throws an NPE.
08:19:05,716 ERROR [org.jgroups.util.TimeScheduler2] (Timer-2,edg-perf09-2672) failed running task UNICAST2: RetransmitTask (interval=500 ms)
java.lang.NullPointerException
    at org.jgroups.protocols.UNICAST2.triggerXmit(UNICAST2.java:1326)
    at org.jgroups.protocols.UNICAST2$RetransmitTask.run(UNICAST2.java:1294)
    at org.jgroups.util.TimeScheduler2$RecurringTask.run(TimeScheduler2.java:591)
    at org.jgroups.util.TimeScheduler2$MyTask.run(TimeScheduler2.java:523)
    at org.jgroups.util.TimeScheduler2$Entry.execute(TimeScheduler2.java:428)
    at org.jgroups.util.TimeScheduler2$1.run(TimeScheduler2.java:286)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)
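To illustrate both cases that lead to this trace, here is a minimal sketch of a retransmit loop that tolerates an empty entry. It does not use the real JGroups classes: Entry, sendTable and this triggerXmit() are hypothetical stand-ins, and the TreeMap is only an assumption made to keep the example self-contained. The point is the null check before the first message is used.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

class RetransmitGuardSketch {
    /** Hypothetical stand-in for a sender entry; not the real JGroups SenderEntry. */
    static final class Entry {
        private final TreeMap<Long, String> sent = new TreeMap<>();

        synchronized void add(long seqno, String msg) { sent.put(seqno, msg); }

        /** Drops all messages up to and including seqno, as a purge after STABLE would. */
        synchronized void purge(long seqno) { sent.headMap(seqno, true).clear(); }

        /** Returns the oldest unacknowledged message, or null if the entry is empty. */
        synchronized String getFirstMessage() {
            Map.Entry<Long, String> first = sent.firstEntry();
            return first == null ? null : first.getValue();
        }
    }

    /** Hypothetical send table keyed by destination name. */
    static final Map<String, Entry> sendTable = new ConcurrentHashMap<>();

    /** Returns how many entries had a message to resend; empty entries are skipped. */
    static int triggerXmit() {
        int resent = 0;
        for (Entry entry : sendTable.values()) {
            String msg = entry.getFirstMessage();
            if (msg == null) {
                continue; // entry was just purged, or is new and has no messages yet
            }
            // A real task would pass msg down the stack here.
            resent++;
        }
        return resent;
    }

    public static void main(String[] args) {
        Entry e = new Entry();
        sendTable.put("B", e);              // new destination, nothing sent yet
        System.out.println(triggerXmit());  // empty entry is skipped, no NPE
        e.add(1L, "m1");
        System.out.println(triggerXmit());
        e.purge(1L);                        // simulates the purge after STABLE
        System.out.println(triggerXmit());
    }
}
```

With a guard like this, a purged or freshly inserted entry is simply skipped until the next run of the task.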
In theory, there is a further race condition inside SenderEntry.getFirstMessage() itself: the low value may change between the getLow and get calls, though I am not sure when this could happen in practice.
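The getLow/get window is a check-then-act pattern. The sketch below shows it on a hypothetical MiniTable, which is not org.jgroups.util.Table; it assumes the first message sits at low + 1 and that a single lock guards the table. racyFirst() makes two separate calls, while getFirst() reads low and the element under one lock, so a purge cannot slip in between.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

class FirstMessageSketch {
    /** Hypothetical minimal table; not the real org.jgroups.util.Table. */
    static final class MiniTable {
        private final ReentrantLock lock = new ReentrantLock();
        private final Map<Long, String> msgs = new HashMap<>();
        private long low; // highest purged seqno (assumption for this sketch)

        void add(long seqno, String msg) {
            lock.lock();
            try { msgs.put(seqno, msg); } finally { lock.unlock(); }
        }

        /** Removes all messages with seqnos in (low, seqno] and advances low. */
        void purge(long seqno) {
            lock.lock();
            try {
                for (long s = low + 1; s <= seqno; s++) msgs.remove(s);
                low = Math.max(low, seqno);
            } finally { lock.unlock(); }
        }

        long getLow() {
            lock.lock();
            try { return low; } finally { lock.unlock(); }
        }

        String get(long seqno) {
            lock.lock();
            try { return msgs.get(seqno); } finally { lock.unlock(); }
        }

        /** Reads low and the element at low + 1 atomically; may still return null. */
        String getFirst() {
            lock.lock();
            try { return msgs.get(low + 1); } finally { lock.unlock(); }
        }
    }

    /** Check-then-act: low can advance between the two calls. */
    static String racyFirst(MiniTable t) {
        long low = t.getLow();
        // A concurrent purge() at this point makes the local copy of low stale.
        return t.get(low + 1);
    }

    public static void main(String[] args) {
        MiniTable t = new MiniTable();
        t.add(1L, "m1");
        t.add(2L, "m2");
        long staleLow = t.getLow();               // first half of the racy read
        t.purge(1L);                              // purge lands in the window
        System.out.println(t.get(staleLow + 1));  // stale index, message already gone
        System.out.println(t.getFirst());         // atomic read sees the new first message
    }
}
```

Even the atomic variant returns null for an empty table, so the caller still needs a null check.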