Introduction
Feature List
Groups
Applications can easily become members of one or more user-defined groups. Groups allow applications to discover peers for scalability or high-availability functions. Each group automatically elects two specially designated members: "active" and "standby" (the application determines what these designations actually mean -- if anything). Members can indicate their capability to become "active" or "standby" and can specify a "credential" value -- the highest credential wins, allowing application programmers to guide the election process (if desired). Election results can be permanent, or they can be superseded by the admittance of a higher-credential member into the cluster (implementing optional fail-back semantics).
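To make the election rule concrete, here is a minimal, self-contained C++ sketch of how a credential-based election could pick the two roles. The Member fields and the elect() helper are illustrative inventions for this sketch, not the SAFplus API:

    #include <cstdint>
    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>

    // Toy model of the election rule: the capable member with the highest
    // credential wins each role. All names here are hypothetical.
    struct Member {
      std::string name;
      uint64_t credential;   // higher wins
      bool canBeActive;
      bool canBeStandby;
    };

    // Pick the active member first, then the standby from the rest.
    std::pair<const Member*, const Member*> elect(const std::vector<Member>& m) {
      const Member* active = nullptr;
      const Member* standby = nullptr;
      for (const auto& mem : m)
        if (mem.canBeActive && (!active || mem.credential > active->credential))
          active = &mem;
      for (const auto& mem : m)
        if (&mem != active && mem.canBeStandby &&
            (!standby || mem.credential > standby->credential))
          standby = &mem;
      return {active, standby};
    }

    int main() {
      std::vector<Member> members = {
          {"node1", 100, true, true},
          {"node2", 50, true, true},
          {"node3", 0, false, true},  // can only be standby
      };
      auto [active, standby] = elect(members);
      std::cout << "active=" << active->name
                << " standby=" << standby->name << "\n";
      // With fail-back semantics, admitting a member with credential 200
      // would re-run this election and supersede the current active.
    }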
Applications can examine the member list of any group -- even groups they do not belong to. Groups can therefore be used to discover service providers. For example, a "load balancer" application can find all "web server" applications running in the cluster. Applications can also send messages to a group using different sending modes: a message can be directed to the active entity, the standby entity, all entities (broadcast), or to one entity chosen in turn (round-robin load balancing).
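The sending modes can be modeled as follows. This is a toy illustration of the dispatch semantics only; GroupView, SendMode, and the member names are hypothetical, not SAFplus identifiers:

    #include <cstddef>
    #include <iostream>
    #include <string>
    #include <vector>

    // A message can go to the active member, the standby, every member
    // (broadcast), or members in turn (round-robin).
    enum class SendMode { ToActive, ToStandby, Broadcast, RoundRobin };

    struct GroupView {
      std::vector<std::string> members;
      size_t activeIdx = 0, standbyIdx = 1, rrNext = 0;

      std::vector<std::string> recipients(SendMode mode) {
        switch (mode) {
          case SendMode::ToActive:   return {members[activeIdx]};
          case SendMode::ToStandby:  return {members[standbyIdx]};
          case SendMode::Broadcast:  return members;
          case SendMode::RoundRobin: {
            std::string r = members[rrNext];
            rrNext = (rrNext + 1) % members.size();  // rotate for load balancing
            return {r};
          }
        }
        return {};
      }
    };

    int main() {
      GroupView web{{"ws1", "ws2", "ws3"}};
      for (const auto& r : web.recipients(SendMode::RoundRobin))
        std::cout << "send to " << r << "\n";  // ws1 now, ws2 on the next call
    }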
Groups make it easy to implement complex scale-out or high availability strategies, with no special logic inside the application.
Checkpoint
The checkpoint entity forms the backbone of state coordination between nodes in the cluster. Abstractly, a checkpoint is an in-RAM "hash table", "dictionary", or "map" (there are many names for this concept) -- that is, a user writes and reads arbitrary data "values" indexed by arbitrary data "keys". However, a checkpoint differs from the traditional hash table structure because it exists in every process that is interested in it. And unlike cloud-style "distributed hash tables", a checkpoint is fully replicated to all interested nodes; it is not a "distributed" dictionary where each node holds only part of the data. Checkpoints are therefore fully redundant and have very fast look-up.
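The following toy model shows the "fully replicated, not partitioned" semantics in miniature. Real checkpoints replicate over the network; the Replica and Checkpoint types here are purely illustrative:

    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    // Every interested process holds a full replica of the table, and each
    // write is applied to all replicas.
    struct Replica {
      std::map<std::string, std::string> table;  // key -> value
    };

    struct Checkpoint {
      std::vector<Replica*> replicas;  // one per interested process/node

      void write(const std::string& key, const std::string& value) {
        for (auto* r : replicas) r->table[key] = value;  // replicate everywhere
      }
    };

    int main() {
      Replica nodeA, nodeB;  // two processes interested in this checkpoint
      Checkpoint ckpt{{&nodeA, &nodeB}};

      ckpt.write("session:42", "state=ESTABLISHED");

      // A lookup is a purely local read -- fast, and it survives the loss of
      // any single replica because every node has the full table.
      std::cout << nodeB.table.at("session:42") << "\n";
    }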
Checkpoints are primarily used to replicate program state to redundant nodes. When a standby process becomes active, it can read the latest program state from the checkpoint (warm standby), or a standby process can opt to continually receive state-change updates (hot standby). In the latter case, the standby can resume service more quickly because it does not need to load the checkpointed data into its internal state when becoming active.
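A small sketch can contrast the two styles. The CheckpointModel type and its callback mechanism below are assumptions made for illustration only:

    #include <functional>
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    // A hot standby registers a callback and applies each change as it
    // happens; a warm standby reads the whole checkpoint only when it is
    // told to become active.
    using Table = std::map<std::string, std::string>;

    struct CheckpointModel {
      Table data;
      std::vector<std::function<void(const std::string&, const std::string&)>> subs;

      void write(const std::string& k, const std::string& v) {
        data[k] = v;
        for (auto& cb : subs) cb(k, v);  // push updates to hot standbys
      }
    };

    int main() {
      CheckpointModel ckpt;
      Table hotState, warmState;

      // Hot standby: internal state tracks every update, so failover is fast.
      ckpt.subs.push_back([&](const std::string& k, const std::string& v) {
        hotState[k] = v;
      });

      ckpt.write("calls", "17");

      // Warm standby: copies the checkpoint only upon becoming active.
      warmState = ckpt.data;

      std::cout << hotState["calls"] << " " << warmState["calls"] << "\n";
    }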
Messaging
SAFplus provides an efficient, high-performance messaging mechanism. The underlying transport is implemented via a plugin architecture, allowing users to define new transport protocols. SAFplus provides out-of-the-box support for the TIPC, UDP, TCP, and SCTP protocols. Optional layers can be instantiated on top of these transports to provide reliability, traffic shaping, segmentation and reassembly, and bandwidth-versus-latency performance optimization.
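As a sketch of the plugin idea (the MsgTransport interface and UdpTransport class are illustrative, not the actual SAFplus plugin API), application code can be written against an abstract transport and remain unchanged when the protocol is swapped:

    #include <iostream>
    #include <string>

    // Messaging code talks to an abstract transport; concrete protocols
    // (TIPC, UDP, TCP, SCTP, or a user-defined one) plug in behind it.
    class MsgTransport {
     public:
      virtual ~MsgTransport() = default;
      virtual void send(const std::string& dest, const std::string& payload) = 0;
    };

    class UdpTransport : public MsgTransport {
     public:
      void send(const std::string& dest, const std::string& payload) override {
        // A real plugin would open a socket and transmit; this just shows the
        // dispatch. Reliability or segmentation layers could wrap this class.
        std::cout << "UDP -> " << dest << ": " << payload << "\n";
      }
    };

    void notifyPeers(MsgTransport& t) {
      t.send("node2:5000", "hello");  // application code is transport-agnostic
    }

    int main() {
      UdpTransport udp;
      notifyPeers(udp);  // swap in another MsgTransport subclass to change protocol
    }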
Remote Procedure Calls
SAFplus messaging is integrated with the Google Protobuf serialization system to provide a high-performance, endian-aware, object-oriented remote procedure call (RPC) facility. RPCs can be defined in either the Protobuf language or the YANG data modeling language (RFC 6020).
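For example, an RPC defined in the Protobuf language might look like the following. The message and service names are hypothetical; protoc would generate the message classes and stubs from such a definition:

    syntax = "proto3";

    // Hypothetical service definition for illustration only.
    message StartRequest { string serviceName = 1; }
    message StartReply   { bool ok = 1; }

    service NodeControl {
      rpc StartService(StartRequest) returns (StartReply);
    }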
Logging
Logs generated by applications are written to shared memory in an efficient, non-blocking manner. This minimizes logging's impact on the performance of your application, and it preserves the last logs before an application crash even if the application died before the logs were flushed to disk. The SAFplus logging server (often embedded within the SAFplus AMF) reads logs from shared memory, filters them by "stream" and by syslog (RFC 5424) severity level, and outputs them to any of:
- the file system (with max-size and log file rolling),
- the system syslog facility,
- any application (on any node) that has registered to receive that log stream.
A log "stream" is a cluster-wide, application defined portal for log messages. Logs originating anywhere in the cluster sent to a particular stream will be received by every subcriber of that stream. Log streams are configured via the SAFplus management interface, via a configuration XML file, or manually via management APIs in any application.
Logging makes it easy for applications to send logs to a variety of destinations, with no special logic inside the application.