  JGroups / JGRP-2780

MCAST: multicast protocol based on positive acks


    • Type: Feature Request
    • Resolution: Unresolved
    • Priority: Major
    • 5.4

      Multicasts are flow-controlled; retransmissions, however, are not, because MFC sits above NAKACK2. This causes problems when there are many message drops: 'retransmission storms' can overwhelm the switch / receiver queues and generate more traffic than the original messages, leading to even more drops.

      Placing MFC below NAKACK2 also leads to problems:

    • When both original and retransmitted messages block on 0 credits in MFC, the thread pool will soon be exhausted by retransmission requests.
    • If we tag retransmitted messages as DONT_BLOCK, they will be dropped by MFC on 0 credits. This favoring of original messages over retransmissions leads to ever-widening xmit windows on the receivers, eventually causing memory exhaustion.

      The xmit window (implemented by Table) can widen because it is not of fixed size; it expands and shrinks dynamically.

      We therefore need a fixed-size xmit window, which blocks senders when adding messages if there's not enough space. Enter MCAST:

      MCAST

      MCAST has fixed-size sender and receiver windows (RingBufferSeqno). Conceptually, every member has 1 sender window, plus 1 receiver window per cluster member.

      A window has space for a maximum number of messages (its capacity) and maintains a low and a high index:

      • On the sender, low = highest acked and high = highest sent
      • On the receiver, low = highest delivered and high = highest received

      The sender increments a seqno and adds the message to the sender window. If there's not enough space, the send blocks until there is.
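
      The following is a minimal sketch of such a fixed-capacity window, assuming a simple wait/notify design. The class and method names (FixedWindow, add(), purge()) are illustrative only, not the actual RingBufferSeqno API:

      // Illustrative sketch, not the real RingBufferSeqno
      public class FixedWindow<T> {
          private final T[] buf;  // ring buffer with 'capacity' slots
          private long low;       // sender: highest acked;  receiver: highest delivered
          private long high;      // sender: highest sent;   receiver: highest received

          @SuppressWarnings("unchecked")
          public FixedWindow(int capacity, long offset) {
              buf=(T[])new Object[capacity];
              low=high=offset;
          }

          /** Adds msg at seqno; blocks while the window is full (seqno - low > capacity) */
          public synchronized void add(long seqno, T msg) throws InterruptedException {
              while(seqno - low > buf.length)
                  wait();                              // woken up by purge() when low advances
              buf[(int)(seqno % buf.length)]=msg;
              if(seqno > high)
                  high=seqno;
          }

          /** Advances low (e.g. to the min ack across all receivers) and unblocks waiting adders */
          public synchronized void purge(long new_low) {
              if(new_low <= low)
                  return;
              for(long s=low+1; s <= new_low; s++)
                  buf[(int)(s % buf.length)]=null;     // free the slots of acked/delivered messages
              low=new_low;
              notifyAll();
          }
      }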

      When a receiver receives a message, it adds it to its receiver window (dropping it if the seqno is out of range), then delivers as many messages as possible without a gap, advancing the low index. Finally, it sends an ack back to the sender.
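
      A sketch of the receive path, with the same caveat: receive(), deliver() and sendAck() are placeholder names, seqnos are assumed to start after offset 0, and acking every 100th message is just an example policy (acks are also sent every xmit_interval):

      // Receiver-side sketch; deliver() and sendAck() are placeholders, not JGroups API
      public class ReceiverWindow {
          private final Object[] buf;
          private long low;        // highest delivered
          private long high;       // highest received
          private long received;   // used to send an ack every Nth message

          public ReceiverWindow(int capacity) {buf=new Object[capacity];}

          public synchronized void receive(long seqno, Object msg) {
              if(seqno <= low || seqno - low > buf.length)
                  return;                              // out of range: already delivered or beyond capacity
              buf[(int)(seqno % buf.length)]=msg;
              if(seqno > high)
                  high=seqno;

              // deliver as many consecutive messages as possible, advancing low
              while(low < high && buf[(int)((low+1) % buf.length)] != null) {
                  long next=low+1;
                  deliver(buf[(int)(next % buf.length)]);
                  buf[(int)(next % buf.length)]=null;
                  low=next;
              }

              if(++received % 100 == 0)                // example: ack every 100th message
                  sendAck(low);                        // ack the highest delivered seqno
          }

          protected void deliver(Object msg)             {/* pass the message up the stack */}
          protected void sendAck(long highest_delivered) {/* send ACK(highest_delivered) to the sender */}
      }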

      When the sender has acks from all receivers (cluster members), it computes the minimum acked seqno and advances the low index. This unblocks blocked senders, which may now be able to add their messages to the send window and send them.
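
      How the sender might aggregate acks is sketched below; the AckTable name and the seeding of all members with 0 are assumptions, not the actual implementation:

      // Sender-side ack bookkeeping sketch
      import java.util.Collection;
      import java.util.Map;
      import java.util.concurrent.ConcurrentHashMap;

      public class AckTable {
          private final Map<String,Long> acks=new ConcurrentHashMap<>(); // member -> highest acked seqno

          public AckTable(Collection<String> members) {
              members.forEach(m -> acks.put(m, 0L));   // seed every member so min() reflects all receivers
          }

          /** Records an ack and returns the minimum acked seqno across all members */
          public long ack(String member, long seqno) {
              acks.merge(member, seqno, Math::max);    // acks may arrive out of order
              return acks.values().stream().mapToLong(Long::longValue).min().orElse(0);
          }
      }

      The sender would then call something like purge(min) on its send window (as in the window sketch above): low advances to min, and any threads blocked in add() are woken up.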

      The receiver sends acks either after a number of messages have been received, or periodically (xmit_interval). It also periodically sends retransmission requests if it detects missing messages.

      Because the number of messages in transit cannot exceed the number of senders times the window capacity, we get natural flow control over both original and retransmitted messages. For example:

    • Assume a window capacity of 2000 messages.
    • If we have a cluster {A,B,C,D} and only A is sending, the max number of messages in transit (and stored at every member) is 2000. If all members are sending, it is 8000 (see the small calculation below).
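
      As a tiny worked example of that bound (the class name is purely for illustration):

      public class InTransitBound {
          public static void main(String[] args) {
              int capacity=2000;                       // window capacity
              int senders=4;                           // cluster {A,B,C,D}, all members sending
              System.out.println(senders * capacity);  // 8000; with only A sending: 1 * 2000 = 2000
          }
      }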

      We therefore don't need any flow control protocol (MFC) anymore.

      Caveats

      Because this design is based on positive acks rather than negative acks (as in NAKACK2), it will not scale to hundreds of cluster members. However, the number of acks sent can be reduced, e.g. by sending them only every xmit_interval, by sending them on every Nth message, or by (possibly) piggybacking them on outgoing messages.

      Misc

      • RingBufferSeqno could also be used by UNICAST3 instead of Table. However, because most unicast-based applications use TCP (which flow controls both original and retransmitted messages) rather than UDP, the problem is not pressing. We can look at this later, possibly in a separate JIRA issue.
      • The design is described in ./doc/design/MCAST.txt

