Message Service
The Message Service provides the fundamental transport of messages, plus infrastructure and classes designed to make common messaging paradigms simple.
There are three underlying transports in the SAFplus7 messaging layer:
- IOC messaging C API
- This is a thin layer on top of IOC messaging itself and can have multiple transports; TIPC and UDP are currently supported. Applications can create their own message server listening on any IOC port.
- SAFplus messaging.
- All SAFplus components and user applications need an IOC message server to carry all SAFplus library communications. The IOC port involved is well-known for SAFplus services and is either well-known or dynamically assigned for user applications. Since this single port handles multiple protocols (each SAFplus library speaks its own protocol), messages are contained within a larger protocol that identifies the contained protocol. User applications can register their own sub-protocol and receive notifications when messages arrive; in this manner, applications can take advantage of much of the SAFplus message infrastructure. (In SAFplus 6.1, this capability is called the "EO" -- in SAFplus7, the 6.1 "EO" will not be used.)
- SA-Forum Message Queues
- See the Service Availability Forum documentation
Synchronous and Asynchronous
Message receipt can occur either synchronously or asynchronously using the "Wakeable" feature of the SAFplus thread semaphore system.
Use Cases
Server Side
- Simple message server
- Calls Wakeable.wake() whenever a message is received. The Wakeable can be a callback, a message queue, or any derived class. wake() returns ACCEPTED if the message has been fully processed; this implicitly hands the message buffer back to the message server. Otherwise, wake() returns DEFERRED and retains ownership of the message buffer, and if the message was delivered reliably it is NOT yet marked as delivered. At some later point, the application calls msgServer.accept(msg*), which returns ownership of the message buffer to the message server and tells the server to mark the message as delivered. In the non-threaded case, call an API such as process(enum { ONE, ALL, FOREVER }) to drive the above.
- SAFplus message server
- Register sub-protocols by passing a well-known ID (one byte, so 0 to 255). Every received message carries a small header: version and sub-protocol ID. This ID is examined in every received message, and the appropriate Wakeable is called (from an array of them). The same ACCEPTED/DEFERRED behavior applies as in the simple message server. In the non-threaded case, call an API such as process(enum { ONE, ALL, FOREVER }) to drive the above.
/** Handle a particular type of message
    @param type A number from 0 to 255 indicating the message type
    @param handler Your handler function
    @param cookie This pointer will be passed to your handler function
*/
void RegisterHandler(ClWordT type, MsgHandler handler, ClPtrT cookie);

/** Remove the handler for a particular type of message */
void RemoveHandler(ClWordT type);
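The sub-protocol dispatch described above can be modeled with plain STL types. This is a sketch, not the SAFplus implementation: the `MsgHeader` layout, `Handler` signature, and class name here are illustrative assumptions, with only the one-byte ID range and the handler-array lookup taken from the text.

```cpp
#include <array>
#include <cstdint>
#include <functional>
#include <string>

// Illustrative message header: a version byte plus a one-byte sub-protocol id.
struct MsgHeader { uint8_t version; uint8_t subProtocol; };

using Handler = std::function<void(const std::string& payload)>;

// Model of the server's dispatch table: one handler slot per possible id.
class SubProtocolTable {
    std::array<Handler, 256> handlers;
public:
    void registerHandler(uint8_t id, Handler h) { handlers[id] = std::move(h); }
    void removeHandler(uint8_t id) { handlers[id] = nullptr; }
    // Examine the header's id and call the matching handler, if any.
    // Returns true if a handler existed for the message's sub-protocol.
    bool dispatch(const MsgHeader& hdr, const std::string& payload) {
        if (!handlers[hdr.subProtocol]) return false;
        handlers[hdr.subProtocol](payload);
        return true;
    }
};
```

An array indexed by the ID keeps the per-message lookup O(1), which matters since every received message passes through this dispatch.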
- Threaded, Pooled, Queued SAFplus or Simple message server
- This is the "final" class describing the most common and powerful message server combination. It adds a thread pool (set max threads to 0 if you don't want one) and/or a queue (set the max queue size to 1 if you don't want a queue) to the above, and the server performs the callbacks in multiple threads.
One Threaded, Pooled, Queued SAFplus Message Server is automatically created for every SAFplus component. This is how SAFplus libraries talk to each other, similar to the "EO" in SAFplus 6.1. Note that an application can also register its own sub-protocols via the RegisterHandler API and thereby leverage much of the SAFplus infrastructure.
Client Side
- Message client with multithreaded sync/async send/reply
- The client is capable of issuing multiple "sends" and then waiting for the multiple replies. For a particular reply, it determines which send it pairs with and "wake"s that entity.
// Multi-threaded synchronous

void doit(int node)
{
  msg* buffer = msgClient.getBuffer();
  buffer = <fill it up>;
  // No Wakeable passed, so the call is synchronous
  msg* packet = msgClient.sendReply(node_index_to_address(node), buffer, buffer_ownership_transferred);
  <handle the reply>
}

for (int i=0; i<maxNodes; i++)  // Send the same query to every node
{
  threadCreate(doit, i);
}
15
16
// Simultaneous synchronous
queue replyQueue;  // replyQueue is a Wakeable whose wake() puts the reply (the passed cookie) onto the queue.
msg* buffer = msgClient.getBuffer();
buffer = <fill it up>;
for (int i=0; i<maxNodes; i++)  // Send the same query to every node
{
  msgClient.sendReply(node_index_to_address(i), buffer, no_ownership_transfer, replyQueue);
}
msgClient.release(buffer);

while (!replyQueue.empty())
{
  msg* reply = replyQueue.pop();
}
31
// Asynchronous
MsgHandler handleReply;  // handleReply is a Wakeable that handles the message inline.
msg* buffer = msgClient.getBuffer();
buffer = <fill it up>;
for (int i=0; i<maxNodes; i++)  // Send the same query to every node
{
  msgClient.sendReply(node_index_to_address(i), buffer, no_ownership_transfer, handleReply);
}
msgClient.release(buffer);
For performance, it is important to use as few buffer copy operations as possible. This is why the API specifies whether the buffer's "ownership" is passed to the messaging layer or retained: if ownership is passed, the buffer can be used directly in the IOC layer. (Open design question: is there a cleaner way to communicate this ownership information?)
Classes and Objects
MsgServerI
This is an abstract class defining the interface of a message server. It contains a function to get the address of the server, plus the following operations:
/** Send a message
    @param msgtype The destination message handler
    @param destination Address of the destination node/process
    @param buffer Your data
    @param length Your data length
    Raises the "Error" Exception if something goes wrong, or if the destination queue does not exist.
*/
void SendMsg(ClIocAddressT destination, void* buffer, ClWordT length, ClWordT msgtype=0);

/** Start the server */
void Start();

/** Stop message processing right away.
    Messages waiting in the queue are not dropped: this function stops accepting new messages right away, but does not return until all enqueued messages have been processed and all processing threads are stopped.
*/
void Quiesce();

void RecvMsg
Message Transport Layer
The message transports are plugins defined by a message transport interface class and the standard SAFplus plugin architecture.
Tests
- testTransport:
- This test implements a variety of functional tests to ensure that the message transport layer behaves correctly. Please run it with the --help flag to get the latest program options. In particular, you'll need to use the --xport flag to specify your message transport plugin shared library. The test communicates with itself so nothing else needs to be run. Success output looks like:
... Wed Mar 18 23:05:30.244 2015 [testTransport.cxx:328] (.0.16434 : .TST.___:00000 : INFO) Test case completed [send/recv messages of every allowed length]. Subcases: [1335973] passed, [0] failed, [0] malfunction.
- testMsgPerf:
This test measures the performance of a message transport plugin. It tests a variety of sizes and message bulking for 3 cases: loopback, inter-process, and inter-node communication. Use --help to discover the parameters, including how to make the test load your transport plugin. Next, run the "msgReflector --xport <your .so>" program on both the test machine and another machine (that is, run 2 copies of this program, one local and one remote). Then, on the test machine, run:
testMsgPerf -rnode <nodeId of the remote msgReflector>
If testMsgPerf hangs after printing an info message about the next group of tests, it can't communicate with msgReflector.
Data is output in the following table:
same process: len [     1] Latency [0.011307 ms]
same process: len [    16] Latency [0.009429 ms]
same process: len [   100] Latency [0.009582 ms]
same process: len [  1000] Latency [0.009511 ms]
same process: len [ 10000] Latency [0.010997 ms]
same process: len [     1] stride [    1] Bandwidth [203682.58 msg/s, 1.63 MB/s]
same process: len [    16] stride [    1] Bandwidth [216197.52 msg/s, 27.67 MB/s]
same process: len [   100] stride [    1] Bandwidth [161807.06 msg/s, 129.45 MB/s]
same process: len [  1000] stride [    1] Bandwidth [187099.49 msg/s, 1496.80 MB/s]
same process: len [ 10000] stride [    1] Bandwidth [177999.29 msg/s, 14239.94 MB/s]
same process: len [ 50000] stride [    1] Bandwidth [73956.29 msg/s, 29582.52 MB/s]
same process: len [     1] stride [   10] Bandwidth [247036.49 msg/s, 1.98 MB/s]
same process: len [    16] stride [   10] Bandwidth [246532.52 msg/s, 31.56 MB/s]
same process: len [   100] stride [   10] Bandwidth [226515.96 msg/s, 181.21 MB/s]
same process: len [  1000] stride [   10] Bandwidth [218375.18 msg/s, 1747.00 MB/s]
same process: len [ 10000] stride [   10] Bandwidth [194067.36 msg/s, 15525.39 MB/s]
same process: len [ 50000] stride [   10] Bandwidth [72648.28 msg/s, 29059.31 MB/s]
same process: len [     1] stride [   50] Bandwidth [257805.18 msg/s, 2.06 MB/s]
same process: len [    16] stride [   50] Bandwidth [221818.02 msg/s, 28.39 MB/s]
same process: len [   100] stride [   50] Bandwidth [249569.24 msg/s, 199.66 MB/s]
same process: len [  1000] stride [   50] Bandwidth [234750.05 msg/s, 1878.00 MB/s]
same process: len [ 10000] stride [   50] Bandwidth [173288.30 msg/s, 13863.06 MB/s]
same process: len [ 50000] stride [   50] Bandwidth [80331.67 msg/s, 4530.92 MB/s]
same process: len [    32] stride [ 1000] Bandwidth [254485.17 msg/s, 65.15 MB/s]
Reliable Messaging
Reliable Socket
The reliable socket is a thin layer on top of the transport layer. It should meet the following criteria:
- provide reliable delivery up to a maximum number of retransmissions
- provide in-order delivery
- be message based
- have high performance
- support peer-to-peer connections
- support multiple connections
- support fragmentation of larger messages
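Fragmentation of a larger message can be sketched as follows. This is an illustrative model only: the `Fragment` fields and the split/reassemble helpers are assumptions for demonstration, not the actual SAFplus wire format.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Illustrative fragment: a sequence id plus one chunk of the payload.
struct Fragment {
    uint32_t fragmentId;   // position of this chunk in the original message
    bool last;             // true for the final fragment
    std::string data;
};

// Split a message into fragments of at most maxFragSize bytes each.
std::vector<Fragment> fragment(const std::string& msg, size_t maxFragSize) {
    std::vector<Fragment> out;
    uint32_t id = 0;
    for (size_t off = 0; off < msg.size(); off += maxFragSize) {
        size_t n = std::min(maxFragSize, msg.size() - off);
        out.push_back({id++, off + n >= msg.size(), msg.substr(off, n)});
    }
    if (out.empty()) out.push_back({0, true, ""});  // empty message: one fragment
    return out;
}

// Reassemble fragments (assumed already in sequence) back into the message.
std::string reassemble(const std::vector<Fragment>& frags) {
    std::string msg;
    for (const auto& f : frags) msg += f.data;
    return msg;
}
```

In the real protocol the in-sequence ordering assumed by `reassemble` is what the receiver's queues (described below) guarantee.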
The programming interface provides the MsgReliableSocket, MsgReliableSocketServer, MsgReliableSocketClient, and ReliableFragment classes, which extend the conventional C++ socket programming interface.
Detailed Design
Client side
The sender splits a message into multiple reliable fragments and sends them. Reliable fragments are stored on the unacknowledged-sent queue until an ACK is received from the receiver. Retransmission occurs as a result of receiving a NACK fragment or the expiration of the retransmission timer.
- When an ACK fragment is received, all fragments with a fragment ID less than the ACK number are removed from the unacknowledged-sent queue.
- When a NACK fragment is received, the fragments specified in the message are removed from the unacknowledged-sent queue. The fragments to be retransmitted are determined by examining the ACK number and the last out-of-sequence ACK number in the NACK fragment: all fragments between, but not including, these two sequence numbers that are still on the unacknowledged-sent queue are retransmitted.
- When a retransmission time-out occurs, all fragments on the unacknowledged-sent queue are retransmitted.
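The sender-side bookkeeping above can be modeled with an ordered map keyed by fragment ID. This is a sketch under stated assumptions: the class name `UnackedQueue` and its method names are illustrative, and real timer handling and wire formats are omitted; only the ACK/NACK/timeout rules come from the text.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Sketch of the unacknowledged-sent queue, keyed by fragment id.
class UnackedQueue {
    std::map<uint32_t, std::string> unacked;  // fragmentId -> fragment data
public:
    void sent(uint32_t id, std::string frag) { unacked[id] = std::move(frag); }

    // ACK: every fragment with an id below ackNum is confirmed delivered.
    void onAck(uint32_t ackNum) {
        unacked.erase(unacked.begin(), unacked.lower_bound(ackNum));
    }

    // NACK: the explicitly listed ids were received out of order, so they
    // are removed; fragments strictly between ackNum and lastOutOfSeq that
    // remain unacknowledged are the ones to retransmit.
    std::vector<uint32_t> onNack(uint32_t ackNum, uint32_t lastOutOfSeq,
                                 const std::vector<uint32_t>& nackedIds) {
        onAck(ackNum);
        for (uint32_t id : nackedIds) unacked.erase(id);
        std::vector<uint32_t> retransmit;
        for (auto it = unacked.upper_bound(ackNum);
             it != unacked.end() && it->first < lastOutOfSeq; ++it)
            retransmit.push_back(it->first);
        return retransmit;
    }

    // Retransmission timeout: everything still unacknowledged goes again.
    std::vector<uint32_t> onTimeout() const {
        std::vector<uint32_t> all;
        for (const auto& kv : unacked) all.push_back(kv.first);
        return all;
    }
};
```

The ordered map makes the cumulative-ACK erase and the "between two sequence numbers" scan both straightforward range operations.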
Server side
There are two reliable fragment queues:
- the out-of-sequence queue
- the in-sequence queue
When a fragment is received, it is put into one of these queues based on its fragment ID and the last fragment ID in the in-sequence queue. Duplicate fragments are deleted. Whenever a fragment is put into the in-sequence queue, the receiver checks the out-of-sequence queue and moves any now-contiguous fragments to the in-sequence queue.
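The receiver's two-queue scheme can be sketched like this. The class and method names are illustrative assumptions; the logic (route by fragment ID, drop duplicates, drain the out-of-sequence queue whenever the in-sequence queue advances) follows the description above.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Sketch of the receiver's two queues: fragments arriving in order go to
// the in-sequence queue; others wait in the out-of-sequence queue until
// the gap before them is filled. Duplicates are dropped.
class ReassemblyQueues {
    uint32_t nextId = 0;                        // next expected fragment id
    std::vector<std::string> inSequence;        // contiguous, in-order fragments
    std::map<uint32_t, std::string> outOfSequence;
public:
    void receive(uint32_t id, std::string data) {
        if (id < nextId || outOfSequence.count(id)) return;  // duplicate: delete
        if (id != nextId) {                                  // gap: park it
            outOfSequence[id] = std::move(data);
            return;
        }
        inSequence.push_back(std::move(data));
        ++nextId;
        // The in-sequence queue advanced, so drain any fragments from the
        // out-of-sequence queue that are now contiguous.
        for (auto it = outOfSequence.find(nextId); it != outOfSequence.end();
             it = outOfSequence.find(nextId)) {
            inSequence.push_back(std::move(it->second));
            outOfSequence.erase(it);
            ++nextId;
        }
    }
    size_t delivered() const { return inSequence.size(); }
    size_t pending() const { return outOfSequence.size(); }
};
```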
Multiple connections
The server can receive messages from multiple clients in parallel. Each receiver contains multiple reliable sockets to handle multiple connections. When a client initiates a connection, it sends a SYN fragment containing its configuration parameters to the server. The server creates a reliable socket to receive data from that client; this socket is put into the client socket list. Each socket in the client socket list contains a thread that receives fragments, combines them, and notifies the server to read the message.
Peer to Peer
The server can send and receive messages in parallel.
Piggybacked acknowledgments
Whenever a receiver sends a data, null, or reset fragment to the transmitter, it includes the sequence number of the last in-sequence data fragment it received.
API
- Create a message server
MsgServer(uint_t port, uint_t maxPendingMsgs, uint_t maxHandlerThreads, Options flags=DEFAULT_OPTIONS, SocketType type = SOCK_DEFAULT);
To create a reliable MsgServer, set SocketType to SOCK_RELIABLE.
Test
- testMsgServerReliable:
This test implements a variety of functional tests to ensure that reliable messaging behaves correctly. Success output looks like:
... Fri Mar 11 14:27:14.628 2016 [testMsgServerReliable.cxx:173] (.0.585 : .TST.---:00000 : INFO) (main): Test case completed [MSG-SVR-UNT.TC001: simple send/recv test]. Subcases: [8] passed, [0] failed, [0] malfunction.
Fri Mar 11 14:27:14.628 2016 [testMsgServerReliable.cxx:173] (.0.585 : .TST.___:00000 : INFO) Test case completed [MSG-SVR-UNT.TC001: simple send/recv test]. Subcases: [8] passed, [0] failed, [0] malfunction.
Fri Mar 11 14:27:14.628 2016 [clTest.cxx:74] (.0.585 : .TST.---:00000 : INFO) (clTestGroupFinalizeImpl): Test completed. Cases: [1] passed, [0] failed, [0] malfunction.
Message Shaping
The leaky bucket algorithm is a method of temporarily storing a variable number of messages and organizing them into a set-rate output of messages. The leaky bucket is used to implement message shaping. Message shaping can be used to control metered-bandwidth Internet connections to prevent exceeding the allotted bandwidth.
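A leaky bucket shaper can be sketched as a simple discrete-time model. This is illustrative only: the class name, the per-tick drain, and the capacity/rate parameters are assumptions for demonstration, not the SAFplus shaping implementation.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Simplified leaky bucket: arriving messages are buffered up to a fixed
// capacity (overflow is dropped), and each shaping interval ("tick")
// releases at most drainRate messages, producing a set-rate output
// regardless of how bursty the arrivals are.
class LeakyBucket {
    std::queue<std::string> bucket;
    size_t capacity, drainRate;
public:
    LeakyBucket(size_t capacity, size_t drainRate)
        : capacity(capacity), drainRate(drainRate) {}

    // Returns false if the bucket is full and the message is dropped.
    bool offer(std::string msg) {
        if (bucket.size() >= capacity) return false;
        bucket.push(std::move(msg));
        return true;
    }

    // Called once per shaping interval; emits up to drainRate messages.
    std::vector<std::string> tick() {
        std::vector<std::string> out;
        while (!bucket.empty() && out.size() < drainRate) {
            out.push_back(std::move(bucket.front()));
            bucket.pop();
        }
        return out;
    }

    size_t queued() const { return bucket.size(); }
};
```

A burst of seven messages into a bucket of capacity five at a drain rate of two per tick would drop two messages and emit the rest over three ticks, which is exactly the smoothing behavior described above.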