  1. JGroups
  2. JGRP-2143

TP: use only one thread per member to pass up regular messages


    • Enhancement
    • Resolution: Done
    • Major
    • 4.0
    • None

      This applies only to regular messages; OOB and internal messages are processed by passing them to the thread pool directly when they're received.

      The processing of a message received from B is as follows:

      • A regular message (or message batch) is assigned a thread from the thread pool and passed up to the reliability protocol, e.g. NAKACK2 or UNICAST3.
      • There it is added to the table for B.
      • The thread sees if another thread is already delivering messages from B to the application. If not, it grabs as many consecutive (ordered) messages from the table as it can and delivers them to the application. Otherwise, it returns and can be assigned other tasks.
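
      The steps above can be sketched roughly as follows. This is a simplified illustration, not the NAKACK2 or UNICAST3 code: the class SenderTable, its fields, and the use of String payloads are all hypothetical. It shows that every pool thread takes the table lock to add its message, but only the thread that wins the compare-and-set delivers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified stand-in for the per-sender table of a reliability protocol.
class SenderTable {
    private final TreeMap<Long, String> table = new TreeMap<>();      // seqno -> payload
    private long next = 1;                                             // next seqno to deliver
    private final AtomicBoolean delivering = new AtomicBoolean(false); // is a thread delivering?
    final List<String> delivered = new ArrayList<>();                  // stands in for the application

    /** Called by every pool thread passing up a message; returns how many messages it delivered. */
    int receive(long seqno, String payload) {
        synchronized (table) {                 // all threads for this sender contend here
            if (seqno >= next)                 // ignore messages that were already delivered
                table.put(seqno, payload);
        }
        if (!delivering.compareAndSet(false, true))
            return 0;                          // another thread delivers; return to the pool
        int count = 0;
        while (true) {
            List<String> batch = new ArrayList<>();
            synchronized (table) {
                String m;
                // grab as many consecutive (ordered) messages as possible
                while ((m = table.remove(next)) != null) {
                    batch.add(m);
                    next++;
                }
                if (batch.isEmpty()) {         // nothing left: give up the role under the lock
                    delivering.set(false);
                    return count;
                }
            }
            delivered.addAll(batch);           // deliver outside the lock, in seqno order
            count += batch.size();
        }
    }
}
```

      With 10 threads calling receive() for the same sender at once, all 10 pass through the synchronized block, while at most one of them is inside the delivery loop at any time.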

      The problem here is that more than one thread may be passing up messages from a given sender B; only at the NAKACK2 or UNICAST3 level will a single thread be selected to deliver the messages to the application.

      This causes higher thread pool usage than required, with all of its drawbacks, e.g. more context switching, higher contention on adding messages to the table for B, and possibly exhaustion of the thread pool.

      An example of where service is denied or delayed:

      • We have a cluster of {A,B,C,D}
      • A receives 10 messages from B, 4 from C and 1 from D
      • The thread pool's max size is 20
      • The 10 messages from B are processed; all 10 threads add their messages to the table, but only 1 delivers them to the application and the other 9 return to the pool
      • 4 messages from C are added to C's table, 1 thread delivers them and 3 return
      • The 1 message from D is added to D's table and the same thread is used to deliver the message up the stack to the application

      So while we receive 15 messages, effectively only 3 threads are needed to deliver them to the application: as these are regular messages, they need to be delivered in sender order.

      The 9 threads which process messages from B only add them to B's table and then return immediately. This causes increased context switching, plus more contention on B's table (which is synchronized), and possibly exhaustion of the thread pool. For example, if the pool's max size was only 10, then processing the first 10 messages from B would exhaust the pool, and the other messages from C and D would be processed in newly spawned threads.

      SOLUTION

      • (Only applicable to regular messages)
      • When a message (or batch) from sender P is received, we check if another thread is already passing up messages from P. If not, we pass the message up by grabbing a thread from the thread pool. This will add the message to P's table and deliver as many messages (from the table) as possible to the application.
      • If there's currently a thread delivering P's messages, we simply add the message (or batch) to a queue for P and return.
      • When the delivery thread returns, it checks the queue for P and delivers all queued messages, or returns if the queue is empty.
      • (The queue is actually a MessageBatch, and new messages are simply appended to it. On delivery, the batch is cleared)
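
      A minimal sketch of this scheme, under stated assumptions: PerSenderDispatcher, Entry and the passUp callback are hypothetical names, a plain List<String> stands in for the MessageBatch, and senders are identified by a String. Instead of clearing the batch on delivery, the sketch swaps in a fresh list. Error handling and pool rejection are omitted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.BiConsumer;

// Hypothetical sketch; the names are illustrative and not the JGroups API.
class PerSenderDispatcher {
    private static final class Entry {
        List<String> queue = new ArrayList<>(); // stands in for the MessageBatch of one sender
        boolean running;                        // true while a thread passes up this sender's messages
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();
    private final Executor pool;
    private final BiConsumer<String, List<String>> passUp; // reliability protocol + application

    PerSenderDispatcher(Executor pool, BiConsumer<String, List<String>> passUp) {
        this.pool = pool;
        this.passUp = passUp;
    }

    /** Called by the transport for every regular message received from a sender. */
    void receive(String sender, String msg) {
        Entry e = entries.computeIfAbsent(sender, k -> new Entry());
        synchronized (e) {
            if (e.running) {          // a thread is already passing up messages from this sender
                e.queue.add(msg);     // cheap append, then return
                return;
            }
            e.running = true;         // we are the first: grab a thread from the pool
        }
        List<String> first = new ArrayList<>(List.of(msg));
        pool.execute(() -> deliver(sender, e, first));
    }

    private void deliver(String sender, Entry e, List<String> batch) {
        while (true) {
            passUp.accept(sender, batch);
            synchronized (e) {
                if (e.queue.isEmpty()) {      // nothing queued meanwhile: release the role
                    e.running = false;
                    return;
                }
                batch = e.queue;              // take everything queued so far
                e.queue = new ArrayList<>();  // fresh queue instead of clearing the old one
            }
        }
    }

    public static void main(String[] args) {
        List<Runnable> tasks = new ArrayList<>();  // deferred "pool": run the tasks by hand
        PerSenderDispatcher d = new PerSenderDispatcher(tasks::add,
            (sender, batch) -> System.out.println(sender + ": " + batch.size() + " message(s)"));
        for (int i = 1; i <= 10; i++)
            d.receive("B", "b" + i);
        System.out.println("pool tasks: " + tasks.size()); // prints "pool tasks: 1"
        tasks.forEach(Runnable::run); // prints "B: 1 message(s)" then "B: 9 message(s)"
    }
}
```

      The flag and the queue are checked and updated under the same lock, so a message is either appended while the delivery thread is still due to look at the queue, or it finds running == false and grabs a new thread itself; no message is left behind in the queue.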

      The effects of this for regular messages are

      • Fewer threads: the thread pool only has a max of <cluster-members> threads for regular messages where <cluster-members> is the number of members in the cluster from whom we are concurrently receiving messages. E.g. for a cluster {A,B,C,D}, if we're receiving messages at the same time from all members, then the max size is 4.
        • Of course, OOB and internal messages, plus timer tasks will add to this number.
      • Less contention on the table for a given member: instead of 10 threads all adding their messages to B's table (contention on the table lock) and then CASing a boolean, only 1 thread ever adds and removes messages to/from the table. This means uncontended (= fast) lock acquisition for regular messages (of course, if we use OOB messages, then we do have contention).
      • Appending to a batch is much faster than adding to a table
      • The downside is that we're actually storing messages twice: once in the batch for P and once in P's table. But these are arrays of pointers, so not much memory is required.
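
      As a worked illustration of the thread counts, the following sketch compares the two schemes for the {A,B,C,D} example (10 messages from B, 4 from C, 1 from D). ThreadDemand and its methods are hypothetical helpers, and the numbers assume the worst case in which all messages of a sender arrive while an earlier one is still being passed up.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for the arithmetic only; not part of JGroups.
class ThreadDemand {
    /** Old scheme: every received message (or batch) takes a thread from the pool. */
    static int perMessage(Map<String, Integer> received) {
        int threads = 0;
        for (int n : received.values())
            threads += n;
        return threads;
    }

    /** New scheme: at most one thread per sender we are currently receiving from. */
    static int perSender(Map<String, Integer> received) {
        int threads = 0;
        for (int n : received.values())
            if (n > 0)
                threads++;
        return threads;
    }

    public static void main(String[] args) {
        Map<String, Integer> received = new LinkedHashMap<>();
        received.put("B", 10);
        received.put("C", 4);
        received.put("D", 1);
        System.out.println("one thread per message: " + perMessage(received)); // prints 15
        System.out.println("one thread per sender:  " + perSender(received));  // prints 3
    }
}
```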

      Example: for the 10 messages from B above, we grab 1 thread from the pool to deliver the first message and append the other 9 to the batch in B's queue. When the thread is done, it will grab the message batch of 9, add it to the table and deliver it.

      This is similar to the bulkhead pattern [1].

      [1] http://stackoverflow.com/questions/30391809/what-is-bulkhead-pattern-used-by-hystrix

              rhn-engineering-bban Bela Ban