Feature Request
Resolution: Done
Major
The current bundlers queue all messages; once the total size of the queued messages for all destinations would exceed max_bundle_size, the message batches for each destination are sent.
This hurts latency-sensitive applications. For example, with a queue such as A B B C B B D B B, the message for A has to wait until either the queue is full (max_bundle_size exceeded) or no more messages are received (at which point the batches are sent anyway).
The goal is to write a new bundler that keeps a count for each destination and sends batches to different destinations sooner. The bundler also introduces a counter, num_flips (find a better name!), which determines when a message batch is to be sent.
This counter is decremented when a message to be sent has a destination that differs from the previous destination. When the counter reaches 0, we send the batch to the previous destination(s).
We have a main queue, into which the senders write, and a runner thread (the same as run() in TransferQueueBundler), which continually removes messages from the main queue and inserts them into the queue for their destination.
So there is 1 main queue and 1 queue per destination.
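The queue layout described above could be sketched roughly as follows. This is only an illustration: QueueSketch, Msg, send() and drainOnce() are hypothetical names and not JGroups classes or methods, and a real runner would block on the main queue (e.g. with take()) in a loop rather than drain it once.

```java
import java.util.ArrayDeque;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch; none of these names come from JGroups.
final class QueueSketch {
    // Minimal stand-in for a message: a destination plus a payload.
    static final class Msg {
        final String dest;
        final String payload;
        Msg(String dest, String payload) { this.dest = dest; this.payload = payload; }
    }

    // The single main queue that all senders write into.
    final BlockingQueue<Msg> mainQueue = new LinkedBlockingQueue<>();
    // One queue per destination; only the runner touches these.
    final Map<String, Queue<Msg>> perDest = new LinkedHashMap<>();

    // Called by sender threads.
    void send(Msg m) { mainQueue.add(m); }

    // One pass of the runner: move everything currently in the main queue
    // into the queue of its destination. Returns the number of messages moved.
    int drainOnce() {
        int moved = 0;
        Msg m;
        while ((m = mainQueue.poll()) != null) {
            perDest.computeIfAbsent(m.dest, d -> new ArrayDeque<>()).add(m);
            moved++;
        }
        return moved;
    }
}
```

Because only the runner touches the per-destination queues in this sketch, they need no synchronization; contention is limited to the main queue.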
Example:
- num_flips is 2
- A message for A is sent, added to the main queue and removed by the runner. It is queued in A's queue.
- Another message for A is sent. It is also queued (A's queue: A A).
- A message to B is sent. A's num_flips is now 1. A's queue is A A, B's queue is B.
- Another message to A is sent. This resets A's num_flips to 2; B's num_flips is now 1.
- 2 messages to C are sent. This causes num_flips for A and B to drop to 0, so the batches to A (with 3 msgs) and B (1 msg) are sent.
- No more messages are received, so the batch to C is also sent.
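The example can be replayed with a small simulation. The sketch below rests on one assumption that the description leaves open: every message to a destination resets that destination's counter to num_flips and decrements the counter of every other destination with pending messages. Under this reading both A and B reach 0 after the two messages to C, as in the example, although B's batch goes out one message before A's. FlipSketch, onMessage() and onIdle() are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the num_flips bookkeeping; not the actual bundler.
final class FlipSketch {
    private final int numFlips;
    // Number of queued messages per destination.
    private final Map<String, Integer> pending = new LinkedHashMap<>();
    // Remaining flips per destination with queued messages.
    private final Map<String, Integer> flips = new LinkedHashMap<>();
    // Log of sent batches in the form "destination:messageCount".
    final List<String> sent = new ArrayList<>();

    FlipSketch(int numFlips) { this.numFlips = numFlips; }

    // Called by the runner for each message taken off the main queue.
    void onMessage(String dest) {
        pending.merge(dest, 1, Integer::sum);
        flips.put(dest, numFlips);              // (re)set this destination's counter
        List<String> due = new ArrayList<>();
        for (Map.Entry<String, Integer> e : flips.entrySet()) {
            if (e.getKey().equals(dest)) continue;
            e.setValue(e.getValue() - 1);       // assumption: every other destination loses a flip
            if (e.getValue() <= 0) due.add(e.getKey());
        }
        for (String d : due) flush(d);          // counters at 0: send those batches
    }

    // Called when no more messages are received: send everything that is left.
    void onIdle() {
        for (String d : new ArrayList<>(pending.keySet())) flush(d);
    }

    private void flush(String dest) {
        sent.add(dest + ":" + pending.remove(dest));
        flips.remove(dest);
    }
}
```

Feeding it the sequence A, A, B, A, C, C with num_flips = 2 and then calling onIdle() yields the batches B:1, A:3 and C:2, matching the batch sizes in the example.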
The value of num_flips should be computed as the rolling (weighted) average of the number of adjacent messages to the same destination. It is maintained for each destination separately (probably in the queue for that destination).
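One possible way to maintain such an average is an exponentially weighted moving average over run lengths, where a run is a sequence of adjacent messages to the same destination. The sketch below is an assumption, not a specified design: the class name RunLengthAverage, the smoothing factor alpha, the rounding, and the lower bound of 1 are all illustrative choices. One instance would be kept per destination.

```java
// Hypothetical sketch: exponentially weighted average of run lengths
// for a single destination. All names and constants are assumptions.
final class RunLengthAverage {
    private final double alpha; // weight given to the newest run (0 < alpha <= 1)
    private double avg;         // current weighted average of run lengths
    private boolean seeded;     // false until the first run has been recorded

    RunLengthAverage(double alpha) { this.alpha = alpha; }

    // Record a completed run of adjacent messages to this destination.
    void addRun(int runLength) {
        if (!seeded) {
            avg = runLength;    // the first sample seeds the average
            seeded = true;
        } else {
            avg = alpha * runLength + (1 - alpha) * avg;
        }
    }

    double average() { return avg; }

    // num_flips derived from the average: rounded, and never below 1.
    int numFlips() { return Math.max(1, (int) Math.round(avg)); }
}
```

With alpha = 0.5, runs of length 2, 4 and 1 give averages of 2.0, 3.0 and 2.0 in turn. A larger alpha reacts faster to changes in the traffic pattern, while a smaller one smooths out short bursts.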
Misc
- Should the sending of batches be delegated to a thread pool?
- Should the senders add their messages directly to the destination queues instead of the main queue? That would result in less contention on the main queue, but it would also require 1 thread per destination queue, which creates too many threads...