When TCP.use_send_queues is enabled, sending a message puts it in a queue from which a sender thread dequeues it and sends it.
This was a bad idea from the start, because:
- We create 1 additional (sender) thread per peer. That's 999 threads per member if we have a cluster of 1000. Besides, more threads lead to increased context switching.
- The queue only papers over blocking on a TCP write, as it will fill up if the TCP write blocks. So the problem of a blocking write is only moved to a different part of the system, not eliminated.
- We have to make a copy of every buffer to be sent, as buffers are reused. This causes unneeded memory allocation.
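
The second and third points can be illustrated with a small standalone sketch. This is not the JGroups code: the class SendQueueDemo, the method acceptedWhileWriteBlocked() and the latch that stands in for a blocked TCP write are all made up for illustration, and a bounded queue is assumed. With the sender thread stuck in the "write", the queue accepts exactly its capacity in further messages and then refuses the next one, so the caller ends up blocking (or dropping) anyway. Note also the copy made for every message, needed because the caller reuses its buffer.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class, not part of JGroups.
public class SendQueueDemo {

    /**
     * Counts how many messages are accepted while the simulated write is
     * stalled: one in flight in the sender thread plus a full queue.
     */
    public static int acceptedWhileWriteBlocked(int capacity) {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(capacity);
        CountDownLatch writeStarted = new CountDownLatch(1);
        CountDownLatch neverReleased = new CountDownLatch(1);

        // One sender thread per peer: dequeue a message, then "write" it.
        Thread sender = new Thread(() -> {
            try {
                while (true) {
                    queue.take();
                    writeStarted.countDown();
                    neverReleased.await(); // stand-in for a blocked TCP write
                }
            } catch (InterruptedException e) {
                // interrupted by the caller: stop the sender
            }
        });
        sender.setDaemon(true);
        sender.start();

        byte[] reused = new byte[16]; // caller's buffer, reused across sends
        int accepted = 0;
        try {
            // Every enqueue needs its own copy because 'reused' is recycled.
            queue.put(Arrays.copyOf(reused, reused.length));
            accepted++;
            writeStarted.await(); // sender now holds message 1 and is stuck

            // Non-blocking offers succeed until the queue is full.
            while (queue.offer(Arrays.copyOf(reused, reused.length))) {
                accepted++;
            }
            sender.interrupt();
            sender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Capacity 4: 1 message in the sender thread + 4 queued = 5.
        System.out.println(acceptedWhileWriteBlocked(4)); // prints 5
    }
}
```

Making the queue bigger only delays the point at which the caller is affected by the stalled write; an unbounded queue would trade the blocking for unbounded memory growth instead.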