JGroups / JGRP-613

Access to NAKACK#rebroadcast_digest field needs a lock


    • Type: Bug
    • Resolution: Done
    • Priority: Minor
    • 2.6
    • 2.6
    • None

      It seems that the rebroadcast_digest field of NAKACK is accessed by multiple threads that step on each other, leading to the NullPointerException below.
      This is very hard to reproduce; I have seen it only a few times so far.

      -------------------------------------------------------
      GMS: address is 127.0.0.1:2528
      -------------------------------------------------------
      11141 [INFO][main] ConcurrentStateTransferTest: - Thread for channel 127.0.0.1:2528[A] started
      15250 [INFO][A] ConcurrentStateTransferTest: - channel.getState at A127.0.0.1:2518 returned false
      15297 [INFO][Multiplexer,udp,127.0.0.1:2520] ConcurrentStateTransferTest: - – [#A (127.0.0.1:2520)]: received 127.0.0.1:2518
      15297 [INFO][Multiplexer,udp,127.0.0.1:2524] ConcurrentStateTransferTest: - – [#A (127.0.0.1:2524)]: received 127.0.0.1:2518
      15297 [INFO][Multiplexer,udp,127.0.0.1:2518] ConcurrentStateTransferTest: - – [#A (127.0.0.1:2518)]: received 127.0.0.1:2518
      15297 [INFO][Multiplexer,udp,127.0.0.1:2528] ConcurrentStateTransferTest: - – [#A (127.0.0.1:2528)]: received 127.0.0.1:2518
      15328 [ERROR][OOB,udp,127.0.0.1:2520] UDP: - failed handling incoming message
      java.lang.NullPointerException
      at org.jgroups.protocols.pbcast.NAKACK.rebroadcastMessages(NAKACK.java:1036)
      at org.jgroups.protocols.pbcast.NAKACK.down(NAKACK.java:620)
      at org.jgroups.protocols.UNICAST.down(UNICAST.java:434)
      at org.jgroups.protocols.pbcast.STABLE.down(STABLE.java:317)
      at org.jgroups.protocols.pbcast.GMS.down(GMS.java:813)
      at org.jgroups.protocols.FC.down(FC.java:370)
      at org.jgroups.protocols.FRAG2.down(FRAG2.java:175)
      at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.down(STREAMING_STATE_TRANSFER.java:318)
      at org.jgroups.protocols.pbcast.FLUSH.handleFlushReconcile(FLUSH.java:455)
      at org.jgroups.protocols.pbcast.FLUSH.up(FLUSH.java:337)
      at org.jgroups.protocols.pbcast.STREAMING_STATE_TRANSFER.up(STREAMING_STATE_TRANSFER.java:258)
      at org.jgroups.protocols.FRAG2.up(FRAG2.java:205)
      at org.jgroups.protocols.FC.up(FC.java:408)
      at org.jgroups.protocols.pbcast.GMS.up(GMS.java:742)
      at org.jgroups.protocols.pbcast.STABLE.up(STABLE.java:234)
      at org.jgroups.protocols.UNICAST.up(UNICAST.java:264)
      at org.jgroups.protocols.pbcast.NAKACK.handleMessage(NAKACK.java:798)
      at org.jgroups.protocols.pbcast.NAKACK.up(NAKACK.java:662)
      at org.jgroups.protocols.VERIFY_SUSPECT.up(VERIFY_SUSPECT.java:167)
      at org.jgroups.protocols.FD.up(FD.java:322)
      at org.jgroups.protocols.FD_SOCK.up(FD_SOCK.java:308)
      at org.jgroups.protocols.MERGE2.up(MERGE2.java:145)
      at org.jgroups.protocols.Discovery.up(Discovery.java:250)
      at org.jgroups.protocols.TP$IncomingPacket.handleMyMessage(TP.java:1512)
      at org.jgroups.protocols.TP$IncomingPacket.run(TP.java:1461)
      at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
      at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
      at java.lang.Thread.run(Thread.java:595)
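
      The kind of locking the title asks for could look roughly like the sketch below. It is not the NAKACK code: the class RebroadcastState, its methods, and the Map-based digest are hypothetical stand-ins, and the assumption is that one thread can set the shared field to null while another thread is still reading it. Holding one lock around every read and write of the field rules out that interleaving.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the state kept while rebroadcasting.
// None of these names are taken from the JGroups source.
class RebroadcastState {
    private final Lock lock = new ReentrantLock();

    // Guarded by lock. null means "no rebroadcast in progress".
    private Map<String, Long> rebroadcastDigest;

    // Called by the thread that starts a rebroadcast.
    void begin(Map<String, Long> digest) {
        lock.lock();
        try {
            rebroadcastDigest = new HashMap<String, Long>(digest);
        } finally {
            lock.unlock();
        }
    }

    // Called by a thread that finishes or cancels the rebroadcast.
    void clear() {
        lock.lock();
        try {
            rebroadcastDigest = null;
        } finally {
            lock.unlock();
        }
    }

    // Returns true if there is nothing left to wait for: either no
    // rebroadcast is active, or every sender in the target digest has
    // reached its target seqno in 'current'.
    boolean isSatisfiedBy(Map<String, Long> current) {
        lock.lock();
        try {
            // The null check and the iteration happen under the same lock,
            // so clear() cannot null the field in between.
            if (rebroadcastDigest == null) {
                return true;
            }
            for (Map.Entry<String, Long> e : rebroadcastDigest.entrySet()) {
                Long seen = current.get(e.getKey());
                if (seen == null || seen < e.getValue()) {
                    return false;
                }
            }
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

      Copying the field into a local variable before the null check would also avoid the NullPointerException, but it would not make a check followed by an update atomic, which is why the sketch uses a lock.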

            rhn-engineering-bban Bela Ban
            vblagoje Vladimir Blagojevic (Inactive)