Running the Availability Management Framework (AMF) Example
The AMF provides redundancy, failure monitoring and restart for applications. To implement this, the AMF requires a description of the cluster, which is essentially a description of all the programs that should be run, whether they "understand" SAF communications, and their redundancy relationships. This is called the "cluster model" or just "model".
An example model and a simple redundant "Hello World" application are included with SAFplus, located in the source code under "examples/eval/basic" (and on download.openclovis.com). This model specifies applications that are "SAF-aware" -- that is, the applications can communicate with the AMF to receive work assignments. Another example model is provided at "examples/eval/exampleNonSafApp" that handles "non-SAF-aware" applications. Non-SAF-aware applications do not communicate with the AMF -- they may be legacy apps, standard Linux services (e.g., Apache), or applications whose source code you cannot access.
The "Basic" Example: Compiling and Packaging
This example presumes that you have already installed SAFplus, or built it from source.
First, download and detar the example from http://download.openclovis.com/files/examples/SAFplusBasicEval.tgz, or cd to <safplus>/examples/eval/basic if you are using the SAFplus source from GitHub. Next, build:
make V=1
The program "basicApp" will be created. If you installed SAFplus (via a package manager, for example), your application binary will be created in this directory. If you are running SAFplus from an uninstalled source tree, your binaries will be located in the SAFplus target directory (<safplus>/target/<architecture>/bin). It is possible to run the application directly out of this build tree, but this tutorial shows the full packaging technique. So next, build your executables and SAFplus into a single package:
make image
This executes commands that use the safplus_packager program to produce a tarball. Congratulations! You have successfully compiled and packaged a SAFplus application. In the next section, you will install and run it.
The "Basic" Example: Running
Copy this tarball to your (binary-compatible) target computer and detar it. In this eval, we will use the build machine as the target, so simply detar it from its current location into a directory under your home directory.
cd ~
tar xvfz basic/evalbasic.tgz
Next set up paths and environment variables, so that the SAFplus commands and applications can find the needed shared libraries:
cd evalBasic/bin
source setup
You should open this script and familiarize yourself with what it is doing since you will need to modify it for your own applications. Essentially it:
- Sets up paths to SAFplus libraries (if needed)
- Configures Linux to allocate more buffers for networking (optional, but strongly recommended for applications that use the network heavily)
- Sets up SAFplus environment variables, most importantly:
  - ASP_NODENAME: selects this node's name, which corresponds to a node definition in the XML model file.
  - SAFPLUS_BACKPLANE_INTERFACE: selects the intra-cluster networking interface.
  - SAFPLUS_MGT_DB_PLUGIN: selects the underlying database.
- Sets up the cloud node identification table (only needed if using "cloud" based transports).
In short, these environment variables control your backplane network interface, your chosen backplane messaging plugin, and your chosen database plugin.
Note that this setup script also attempts to increase kernel network buffer space. However, these commands succeed only when run as root. Increasing the kernel networking buffers is optional for this simple sample, but is highly recommended for any network-centric application.
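After sourcing the script, it can be worth confirming that the variables listed above were actually exported. The following is a minimal sketch and is not part of SAFplus: the function name check_safplus_env is hypothetical, and it only checks that each variable is non-empty in the environment, not that its value is valid.

```shell
# Hypothetical helper (not shipped with SAFplus): verify that the variables
# the setup script is expected to export are present and non-empty.
check_safplus_env() {
    missing=0
    for name in ASP_NODENAME SAFPLUS_BACKPLANE_INTERFACE SAFPLUS_MGT_DB_PLUGIN; do
        # printenv only sees exported variables, which is what child
        # processes such as the AMF will inherit.
        if [ -z "$(printenv "$name")" ]; then
            echo "not set: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Typical use, after "source setup":
#   check_safplus_env && echo "environment looks complete"
```

The function returns a non-zero status if any variable is missing, so it can guard later steps in a wrapper script.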
Next, we need to load a cluster model. The cluster model defines what applications are running on what nodes in the cluster and their high availability relationship. Two example models are provided in this sample:
- SAFplusAmf1Node1SG1Comp.xml: Run redundant copies of the app on a single node
- SAFplusAmf2Node1SG1Comp.xml: Run on two nodes
Please choose which option you want (this eval will use the single node version), cd to the bin directory, and "install" the model into the SAFplus database:
cd evalBasic/bin
./safplus_db -x SAFplusAmf1Node1SG1Comp.xml safplusAmf
If you get any "missing file" errors in this step, you did not set up the environment properly.
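To make "missing file" problems easier to spot, the load step can be wrapped in a couple of pre-checks. This is only a sketch: load_model is a hypothetical name, and the wrapper assumes it is run from the bin directory and adds nothing beyond the safplus_db invocation shown above.

```shell
# Hypothetical wrapper around the safplus_db invocation from this tutorial.
# It fails early, with a clearer message, when the model file or the
# safplus_db binary cannot be found from the current directory.
load_model() {
    model="$1"
    if [ ! -r "$model" ]; then
        echo "model file not readable: $model" >&2
        return 1
    fi
    if [ ! -x ./safplus_db ]; then
        echo "./safplus_db not found; are you in the bin directory?" >&2
        return 1
    fi
    ./safplus_db -x "$model" safplusAmf
}

# Example: load_model SAFplusAmf1Node1SG1Comp.xml
```

These checks cover only the files themselves; library path problems caused by a missing "source setup" would still surface from safplus_db.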
This command reads the model XML and writes it to the safplusAmf database. Since the database back-end is a plugin, you might see different results in this step. For example, if SQLite is your chosen database, a file called safplusAmf.db will be created in the current directory. However, by default this example uses the SAFplus checkpointing service as its database. We can see the table by running:
./safplus_name
You should see an entry named "safplus.mgt.safplusAmf.db". The AMF will read this database "file" to access the cluster model.
Note: To clean up all SAFplus shared memory segments (including this checkpoint table), run ./safplus_cleanup.
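If you want to script the check for the checkpoint entry, a simple text search may be enough. The exact output format of safplus_name is not shown in this tutorial, so the sketch below assumes only that the entry name appears somewhere in its output; has_model_entry is a hypothetical name.

```shell
# Hypothetical helper: succeed if the text on standard input mentions the
# checkpoint entry that holds the cluster model.
has_model_entry() {
    grep -qF 'safplus.mgt.safplusAmf.db'
}

# Typical use from the bin directory:
#   ./safplus_name | has_model_entry && echo "model is loaded"
```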
Finally, run the SAFplus AMF to start up the model:
./safplus_amf
Or you may want to start it up as a daemon:
nohup ./safplus_amf &
After a few seconds you will see 2 instances of the program "basicApp" running:
ps -efwww | grep basic
user 7092 7071 0 21:53 pts/2 00:00:00 ./basicApp c1
user 7093 7071 0 21:53 pts/2 00:00:00 ./basicApp c0
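A plain "grep basic" can also match the grep process itself, which makes it awkward to count instances in a script. The sketch below shows one common workaround; count_basic_apps is a hypothetical helper that reads "ps -ef" style output from standard input.

```shell
# Hypothetical helper: count basicApp processes in "ps -ef" style output
# read from standard input.
count_basic_apps() {
    # The bracket in the pattern keeps a grep process from matching its own
    # command line when the function is fed live ps output.
    grep -c '[b]asicApp'
}

# Demonstration using the two sample lines from this tutorial:
count_basic_apps <<'EOF'
user 7092 7071 0 21:53 pts/2 00:00:00 ./basicApp c1
user 7093 7071 0 21:53 pts/2 00:00:00 ./basicApp c0
EOF
# prints 2

# Live use:
#   ps -efwww | count_basic_apps
```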
Depending on your logging level, you may also see these applications print their status to the console:
Tue Aug 9 21:53:23.355 2016 [main.cxx:371] (node0.7093.7093 : c0.APP.MAIN:00019 : INFO) Basic HA app: Standby.
Tue Aug 9 21:53:23.362 2016 [main.cxx:370] (node0.7092.7092 : c1.APP.MAIN:00019 : INFO) Basic HA app: Active. Hello World!
Go ahead and kill one of these processes to see it restart automatically, and watch the logs to see the active/standby roles fail over.
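To make the failover experiment repeatable, you could pick the active instance from the log output and look up its PID. The helpers below are a sketch, not part of SAFplus: their names are hypothetical, they assume the log and ps layouts shown in this tutorial, and amf.log is an assumed file holding the captured console output.

```shell
# Hypothetical helpers for the failover experiment.

# Print the component name (for example "c1") from log lines on standard
# input that report the Active role.
active_component() {
    sed -n 's/.* : \([^.]*\)\.APP\.MAIN.*Basic HA app: Active.*/\1/p'
}

# Print the PID of the basicApp process whose last argument is the given
# component name, reading "ps -ef" style output from standard input.
pid_of_component() {
    awk -v comp="$1" '$(NF-1) ~ /basicApp$/ && $NF == comp { print $2 }'
}

# Typical use (amf.log is an assumed capture of the console output):
#   comp=$(active_component < amf.log | tail -n 1)
#   kill "$(ps -efwww | pid_of_component "$comp")"
```

Killing the active instance rather than the standby is the more interesting case, since the standby should then be promoted while the killed process is restarted.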
This concludes the basic evaluation. Congratulations, you've built and deployed your first highly available application!
Troubleshooting
- SAFplus AMF never starts any applications. It just says:
Fri Apr 8 14:07:07.388 2016 [nPlusmAmfPolicy.cxx:510] (.0.7671 : AMF.N+M.AUDIT:00996 : INFO) Service Instance [si] should be fully assigned but is [unassigned]. Current active assignments [0], targeting [1]
Fri Apr 8 14:07:07.388 2016 [nPlusmAmfPolicy.cxx:527] (.0.7671 : AMF.N+M.AUDIT:00997 : INFO) Service Instance [si] cannot be assigned 0th active. No available service units.
Fri Apr 8 14:07:07.388 2016 [customAmfPolicy.cxx:36] (.0.7671 : AMF.POL.CUSTOM:00998 : INFO) Active audit
Fri Apr 8 14:07:07.388 2016 [customAmfPolicy.cxx:47] (.0.7671 : AMF.CUSTOM.AUDIT:00999 : INFO) Auditing service group sg0
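Did you forget to set a node name? This output means the node names in the model do not match the names of the running nodes. Check that ASP_NODENAME is set and corresponds to a node definition in the model.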
- SAFplus AMF never starts any applications, but a lot of error.#### files are created:
SAFplus AMF cannot spawn your application. It is possible that the binary does not exist, is not executable, or is not of this machine's architecture. "cat" the error.#### file to see the specific problem.
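The two cases above can be narrowed down with a few quick checks. The helpers below are a sketch with hypothetical names. The node-name check is a plain substring search rather than an XML parse, so a match is necessary but not sufficient evidence that the node is defined in the model.

```shell
# Hypothetical diagnostics for the two troubleshooting cases; not part of
# SAFplus.

# Case 1: does the node name appear anywhere in the model file?
node_in_model() {
    name="$1"
    model="$2"
    [ -n "$name" ] && grep -qF -- "$name" "$model"
}

# Case 2: classify the most basic reasons a binary cannot be spawned.
check_binary() {
    if [ ! -e "$1" ]; then
        echo "missing"
    elif [ ! -x "$1" ]; then
        echo "not executable"
    else
        echo "ok"
    fi
}

# Typical use from the bin directory:
#   node_in_model "$ASP_NODENAME" SAFplusAmf1Node1SG1Comp.xml \
#       || echo "ASP_NODENAME='$ASP_NODENAME' not found in model" >&2
#   check_binary ./basicApp
#   for f in error.*; do echo "== $f"; cat "$f"; done
```

An architecture mismatch is not detected by check_binary; comparing the output of "file ./basicApp" with "uname -m" is one way to check that by hand.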