Change payload in Message from byte arrays to a Payload interface which can have multiple implementations.
Currently, having to pass a byte array to a message leads to unnecessary copying:
- When an application has a ref to an NIO (direct) ByteBuffer, the bytes in the byte buffer have to be copied into a byte array and then set in the message
- When the application sends byte arrays around but also wants to add some metadata, e.g. a type for 1000-byte requests/responses, it needs to create a new byte array of (say) 1001 bytes and copy the request type (1 byte) plus the data (1000 bytes) into it. Example: MPerf and UPerf
- When an object has to be sent (e.g. in Infinispan), the object has to be marshalled into a byte array (first allocation) and then added to the message. With the suggested ObjectPayload (below), marshalling of the object would occur late, and it would be marshalled directly into the output stream of the bundler, eliminating the byte array allocation made by the application.
Instead of copying, the application creates an instance of Payload and sets the payload in Message. The Payload is then passed all the way down into the transport where it is marshalled and sent. There can be a number of payload implementations, e.g.
- ByteArrayPayload: wraps a byte array with an offset and length
- NioDirectPayload: wraps an NIO direct ByteBuffer
- NioHeapPayload: wraps an NIO heap-based ByteBuffer
- CompositePayload: wraps multiple payloads, e.g. type (1 byte) and data (1000 bytes) as described above
- IntPayload: a single integer
- ObjectPayload: has an Object and a ClassLoader (for reading), plus a Marshaller which knows how to marshal the object. This allows objects to be passed in payloads and to be marshalled only at the end (in the transport).
- PartialPayload: a ref to a Payload, with an offset and length
- InputStreamPayload: has a ref to an input stream and copies data from input- to output stream when marshalling
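The MPerf/UPerf case above (1-byte type plus 1000 bytes of data) could be expressed without the 1001-byte copy roughly as follows. This is only a sketch: the Payload interface is cut down to two methods, and the signatures of size() and writeTo() are assumptions, not a settled API.

```java
import java.io.DataOutput;
import java.io.IOException;

// Minimal stand-in for the proposed interface; signatures are assumptions.
interface Payload {
    int size();                                        // bytes written by writeTo(), or -1 if unknown
    void writeTo(DataOutput out) throws IOException;
}

// Wraps a region of an existing byte array; nothing is copied.
final class ByteArrayPayload implements Payload {
    private final byte[] buf;
    private final int offset, length;

    ByteArrayPayload(byte[] buf, int offset, int length) {
        this.buf = buf; this.offset = offset; this.length = length;
    }
    @Override public int size() { return length; }
    @Override public void writeTo(DataOutput out) throws IOException {
        out.write(buf, offset, length);
    }
}

// Chains several payloads; they are written back to back at marshalling time.
final class CompositePayload implements Payload {
    private final Payload[] parts;

    CompositePayload(Payload... parts) { this.parts = parts.clone(); }

    @Override public int size() {
        int total = 0;
        for (Payload p : parts) {
            int s = p.size();
            if (s < 0) return -1;                      // one unknown part makes the total unknown
            total += s;
        }
        return total;
    }
    @Override public void writeTo(DataOutput out) throws IOException {
        for (Payload p : parts) p.writeTo(out);
    }
}
```

The application would then send new CompositePayload(new ByteArrayPayload(type, 0, 1), new ByteArrayPayload(data, 0, 1000)), and the only copy happens when the transport writes to its output stream.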
The Payload interface has methods:
- getInput(): this provides a DataInput stream for reading from the underlying payload
and possibly also
- acquire() and
- release() (for ref-counting)
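A minimal sketch of what the interface might look like, together with the simplest implementation. Besides getInput(), acquire() and release(), it includes size() and writeTo(), which are referred to further down. All signatures (return types, use of DataInput/DataOutput, no-op defaults for ref-counting) are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical shape of the interface; all signatures are assumptions.
interface Payload {
    DataInput getInput();                              // for reading from the underlying data
    int size();                                        // serialized size in bytes, or -1 if unknown
    void writeTo(DataOutput out) throws IOException;   // marshals the payload
    default Payload acquire() { return this; }         // ref-counting hooks, no-ops by default
    default Payload release() { return this; }
}

// Wraps a byte array with an offset and length; the array is not copied.
final class ByteArrayPayload implements Payload {
    private final byte[] buf;
    private final int offset, length;

    ByteArrayPayload(byte[] buf, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > buf.length)
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        this.buf = buf;
        this.offset = offset;
        this.length = length;
    }

    @Override public DataInput getInput() {
        return new DataInputStream(new ByteArrayInputStream(buf, offset, length));
    }
    @Override public int size() { return length; }
    @Override public void writeTo(DataOutput out) throws IOException {
        out.write(buf, offset, length);
    }
}
```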
Each payload impl has an ID and it should be possible to register new impls. A PayloadFactory maintains a mapping between IDs and impl classes.
When marshalling a Payload, the ID is written first, followed by the payload's writeTo() method. When reading payloads, the PayloadFactory is used to create instances from IDs.
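The ID-prefixed wire format and the registry could be sketched as below. Several details are assumptions made for illustration: the ID is a short, each payload exposes id() and a readFrom() counterpart to writeTo(), and the factory stores constructor references (Supplier) rather than Class objects.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Cut-down interface; id() and readFrom() are assumed additions.
interface Payload {
    short id();
    void writeTo(DataOutput out) throws IOException;
    void readFrom(DataInput in) throws IOException;
}

final class IntPayload implements Payload {
    static final short ID = 1;                 // hypothetical ID value
    int value;

    IntPayload() {}
    IntPayload(int value) { this.value = value; }

    @Override public short id() { return ID; }
    @Override public void writeTo(DataOutput out) throws IOException { out.writeInt(value); }
    @Override public void readFrom(DataInput in) throws IOException { value = in.readInt(); }
}

final class PayloadFactory {
    private final Map<Short, Supplier<? extends Payload>> creators = new ConcurrentHashMap<>();

    // Registers a new implementation; duplicate IDs are rejected.
    void register(short id, Supplier<? extends Payload> creator) {
        if (creators.putIfAbsent(id, creator) != null)
            throw new IllegalArgumentException("ID " + id + " is already registered");
    }

    Payload create(short id) {
        Supplier<? extends Payload> creator = creators.get(id);
        if (creator == null)
            throw new IllegalArgumentException("no payload registered for ID " + id);
        return creator.get();
    }

    // Writes the ID first, then the payload's own data.
    static void write(Payload p, DataOutput out) throws IOException {
        out.writeShort(p.id());
        p.writeTo(out);
    }

    // Reads the ID, creates an empty instance for it and lets the instance read its data.
    Payload read(DataInput in) throws IOException {
        Payload p = create(in.readShort());
        p.readFrom(in);
        return p;
    }
}
```

An application-defined payload would be registered once, e.g. factory.register(MyPayload.ID, MyPayload::new), on every member that may receive it.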
When fragmenting a payload, the fragments are instances of PartialPayload, each of which maintains an offset and length over the underlying payload. When marshalling a PartialPayload, only the part between offset and offset+length is written to the output stream.
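Fragmentation over views could look roughly like this. The sketch assumes that every payload can write a sub-range of itself via writeTo(out, offset, length); that ranged method and the Fragmenter helper are hypothetical, and payloads with an unknown size are not handled here.

```java
import java.io.DataOutput;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Cut-down interface; the ranged writeTo() is an assumed addition.
interface Payload {
    int size();
    void writeTo(DataOutput out, int offset, int length) throws IOException;
    default void writeTo(DataOutput out) throws IOException { writeTo(out, 0, size()); }
}

final class ByteArrayPayload implements Payload {
    private final byte[] buf;
    ByteArrayPayload(byte[] buf) { this.buf = buf; }
    @Override public int size() { return buf.length; }
    @Override public void writeTo(DataOutput out, int offset, int length) throws IOException {
        out.write(buf, offset, length);
    }
}

// A view over [offset, offset+length) of another payload; no bytes are copied.
final class PartialPayload implements Payload {
    private final Payload underlying;
    private final int offset, length;

    PartialPayload(Payload underlying, int offset, int length) {
        if (offset < 0 || length < 0 || offset + length > underlying.size())
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        this.underlying = underlying;
        this.offset = offset;
        this.length = length;
    }
    @Override public int size() { return length; }
    @Override public void writeTo(DataOutput out, int off, int len) throws IOException {
        if (off < 0 || len < 0 || off + len > length)
            throw new IndexOutOfBoundsException("off=" + off + ", len=" + len);
        underlying.writeTo(out, offset + off, len);    // translate into the underlying payload's range
    }
}

final class Fragmenter {
    // Splits a payload of known size into views of at most fragSize bytes each.
    static List<Payload> fragment(Payload p, int fragSize) {
        if (fragSize <= 0) throw new IllegalArgumentException("fragSize must be > 0");
        List<Payload> frags = new ArrayList<>();
        for (int off = 0; off < p.size(); off += fragSize)
            frags.add(new PartialPayload(p, off, Math.min(fragSize, p.size() - off)));
        return frags;
    }
}
```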
For fragmentation, method size() is crucial to determine whether a payload needs to be fragmented. If a payload (e.g. an ObjectPayload) cannot determine its size, it may return -1. In that case the ObjectPayload is marshalled right away and the resulting bytes are wrapped in a ByteArrayPayload. So if size() cannot be determined, the behavior is exactly the same as today.
If we implement ref-counting, then payloads can be reused as soon as the ref-count is 0. For example, when sending a message, the app increments the payload's ref-count by calling acquire(). Assuming the message is a unicast message, UNICAST3 would then increment the count to 2. This is needed because UNICAST3 might have to retransmit the message if it was lost on the network, and in the meantime the payload must not be reused (changed). The app calls release() when the JChannel.send() call returns, but the payload cannot be reused until UNICAST3 calls release() as well. This will happen when an ACK for the given message has been received.
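The acquire()/release() sequence described above can be sketched with an atomic counter. The class name RefCountedPayload and the onReusable() callback are hypothetical; only acquire() and release() come from the proposal.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical base class for ref-counted payloads.
abstract class RefCountedPayload {
    private final AtomicInteger refs = new AtomicInteger();

    RefCountedPayload acquire() {
        refs.incrementAndGet();
        return this;
    }

    RefCountedPayload release() {
        // Decrement atomically, but never below 0.
        int remaining = refs.updateAndGet(c -> {
            if (c == 0) throw new IllegalStateException("release() without matching acquire()");
            return c - 1;
        });
        if (remaining == 0) onReusable();      // last holder is gone: safe to reuse or pool
        return this;
    }

    int refCount() { return refs.get(); }

    // Called exactly when the count drops to 0.
    protected abstract void onReusable();
}

// Trivial subclass that only records that it became reusable.
final class TrackedPayload extends RefCountedPayload {
    volatile boolean reusable;
    @Override protected void onReusable() { reusable = true; }
}
```

In the unicast scenario, the app's acquire() and UNICAST3's acquire() bring the count to 2; release() after send() returns leaves it at 1, and only the release() triggered by the ACK makes the payload reusable.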
When a request is received, the payload is created from the bytes received on the network, based on the ID. This should be done by asking a PayloadFactory component for a new payload. A naive implementation might create a new payload every time, but a more sophisticated one might use a pool of payloads.
The PayloadFactory instance could be replaced by one's own implementation. This allows an application to control the lifecycle of payloads: the creation of payloads by the application and of payloads received over the network can then be controlled by the same payload management impl.
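To illustrate the difference between a naive and a pooling factory, here is a small sketch that shows only the lifecycle aspect (ID lookup is left out). The create(int)/recycle() methods, the BufferPayload class and the fixed buffer capacity are assumptions made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical array-backed payload; 'length' is the number of valid bytes in 'buf'.
final class BufferPayload {
    final byte[] buf;
    int length;
    BufferPayload(int capacity) { buf = new byte[capacity]; }
}

// Lifecycle view of the factory; both method names are assumptions.
interface PayloadFactory {
    BufferPayload create(int size);        // payload able to hold 'size' bytes
    void recycle(BufferPayload p);         // called when the payload is no longer used
}

// Naive variant: allocates every time and lets the GC clean up.
final class NaivePayloadFactory implements PayloadFactory {
    @Override public BufferPayload create(int size) {
        BufferPayload p = new BufferPayload(size);
        p.length = size;
        return p;
    }
    @Override public void recycle(BufferPayload p) { /* nothing to do */ }
}

// Pooling variant: reuses fixed-capacity payloads; larger requests get one-off buffers.
final class PoolingPayloadFactory implements PayloadFactory {
    private final Deque<BufferPayload> pool = new ArrayDeque<>();
    private final int capacity;

    PoolingPayloadFactory(int capacity) { this.capacity = capacity; }

    @Override public synchronized BufferPayload create(int size) {
        if (size < 0) throw new IllegalArgumentException("size must be >= 0");
        BufferPayload p = size <= capacity ? pool.pollFirst() : null;
        if (p == null) p = new BufferPayload(Math.max(size, capacity));
        p.length = size;
        return p;
    }

    @Override public synchronized void recycle(BufferPayload p) {
        if (p.buf.length == capacity) pool.addFirst(p);   // oversized one-offs are dropped
    }
}
```

With ref-counting in place, recycle() would typically be invoked when a payload's ref-count reaches 0.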
When sending a CompositePayload consisting of a 500-byte ByteArrayPayload and a 1000-byte NioDirectPayload, would we want to get the same CompositePayload consisting of 2 payloads on the receiver side, or would we want to combine the 2 payloads into one and make the 2 payloads refer to the same combined byte array (or NIO buffer)? Should this be made configurable?
If ObjectPayload cannot determine the size of the serialized data, it should return -1. This means that Message.setPayload(ObjectPayload) would serialize the ObjectPayload into a ByteArrayPayload right away.
This means we do have the byte array creation (same as now), but for object payloads which do implement size() correctly, we could still do late serialization.
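The early-versus-late serialization decision could be sketched as follows. The Message class here is a tiny stand-in, not the real JGroups class, and the Marshaller contract (sizeOf() returning -1 when the size is unknown) is an assumption; the ClassLoader used for reading is omitted.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

interface Payload {
    int size();                                        // -1 if the serialized size is unknown
    void writeTo(DataOutput out) throws IOException;
}

// Hypothetical marshaller contract.
interface Marshaller<T> {
    int sizeOf(T obj);                                 // may return -1
    void write(T obj, DataOutput out) throws IOException;
}

// Holds the object itself; marshalling happens only when writeTo() is called.
final class ObjectPayload<T> implements Payload {
    private final T obj;
    private final Marshaller<T> marshaller;

    ObjectPayload(T obj, Marshaller<T> marshaller) { this.obj = obj; this.marshaller = marshaller; }
    @Override public int size() { return marshaller.sizeOf(obj); }
    @Override public void writeTo(DataOutput out) throws IOException { marshaller.write(obj, out); }
}

final class ByteArrayPayload implements Payload {
    private final byte[] buf;
    ByteArrayPayload(byte[] buf) { this.buf = buf; }
    @Override public int size() { return buf.length; }
    @Override public void writeTo(DataOutput out) throws IOException { out.write(buf); }
}

// Stand-in for Message, showing only the setPayload() decision.
final class Message {
    private Payload payload;

    Message setPayload(Payload p) throws IOException {
        if (p.size() < 0) {
            // Size unknown: serialize now (same allocation as today) so that the
            // rest of the stack, e.g. fragmentation, sees a definite size.
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            p.writeTo(new DataOutputStream(bos));
            p = new ByteArrayPayload(bos.toByteArray());
        }
        this.payload = p;                              // size known: keep it for late serialization
        return this;
    }

    Payload getPayload() { return payload; }
}
```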
FRAG3 could decorate ObjectPayload with a fragmentation payload, which generates fragments on serialization and sends them down the stack.
- Since this issue includes API changes, the version will be 5.0