Running the Availability Management Framework (AMF) Example

The AMF provides redundancy, failure monitoring and restart for applications. To implement this, the AMF requires a description of the cluster, which is essentially a description of all the programs that should be run, whether they "understand" SAF communications, and their redundancy relationships. This is called the "cluster model" or just "model".

An example model and a simple redundant "Hello World" application are included with SAFplus, located in the source tree under "examples/eval/helloWorld". This model specifies applications that are "SAF-aware" -- that is, the applications can communicate with the AMF to receive work assignments. Another example model, at "examples/eval/exampleNonSafApp", handles "non-SAF-aware" applications. Non-SAF-aware applications do not communicate with the AMF -- they may be legacy apps, standard Linux services (e.g. Apache), or applications whose source code you cannot access.

The "helloWorld" Example: Compiling and Packaging

This example presumes that you have already installed SAFplus, or built it from source.

First, cd to <safplus_dir>/examples/eval/helloWorld/src (this assumes you are using the SAFplus source from GitHub). Next, build:

make V=1

The program "helloComp1" will be created. Your binaries will be located in the helloWorld example target directory (<safplus_dir>/examples/eval/helloWorld/target/bin). It is possible to run the application directly out of this build tree, but this tutorial shows the full packaging technique. So next, build your executables and SAFplus into a single package:

make images

This executes commands that use the safplus_packager program to produce a tarball. The image tarballs are generated in <safplus_dir>/examples/eval/helloWorld/images. Congratulations! You have successfully compiled and packaged a SAFplus application. In the next section, you will install and run it.
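Before copying the image anywhere, it can help to confirm that a tarball was actually produced and to glance at its layout. The following is a minimal sketch, not part of SAFplus: the function name inspect_images is hypothetical, and it assumes only that image files end in .tgz, as HelloNode1.tgz does.

```shell
# Hypothetical helper (not shipped with SAFplus): list every .tgz image in a
# directory and show the first few entries of each archive.
inspect_images() {
    dir="$1"
    found=0
    for tarball in "$dir"/*.tgz; do
        # When nothing matches, the glob stays literal; skip it.
        [ -f "$tarball" ] || continue
        found=1
        echo "image: $tarball"
        # sed prints only the first 5 entries but still reads the full listing.
        tar tzf "$tarball" | sed -n '1,5p'
    done
    if [ "$found" -eq 0 ]; then
        echo "no image tarballs found in $dir" >&2
        return 1
    fi
}
```

For example, run: inspect_images <safplus_dir>/examples/eval/helloWorld/images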

The "helloWorld" Example: Running

Copy this tarball to your (binary-compatible) target computer and untar it. In this evaluation, we will use the build machine as the target, so simply untar it from its current location into a directory under your home directory:

cd ~
tar xvfz <safplus_dir>/examples/eval/helloWorld/images/HelloNode1.tgz

Next set up paths and environment variables, so that the SAFplus commands and applications can find the needed shared libraries:

cd HelloNode1/bin
source setup

You should open this script and familiarize yourself with what it does, since you will need to modify it for your own applications. Besides the library and command paths, it sets the essential SAFplus environment variables that control your backplane network interface, your chosen backplane messaging plugin, and your chosen database plugin.

Note that this setup script also attempts to increase kernel network buffer space. However, these commands only succeed when run as root. Increasing the kernel networking buffers is optional for this simple example, but is highly recommended for any network-centric application.
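The exact commands live in the setup script itself. As a rough illustration of what raising kernel network buffers typically looks like on Linux, the sketch below uses the standard net.core.rmem_max and net.core.wmem_max sysctl keys. The function names and any buffer size you pass in are assumptions chosen for illustration, not values taken from SAFplus.

```shell
# Hypothetical helper: print the sysctl commands that would raise the Linux
# socket receive/send buffer ceilings to the given number of bytes.
print_buffer_cmds() {
    bytes="$1"
    for key in net.core.rmem_max net.core.wmem_max; do
        echo "sysctl -w $key=$bytes"
    done
}

# Apply the commands only when running as root; otherwise skip quietly,
# since changing kernel settings requires root privileges.
raise_buffers() {
    if [ "$(id -u)" -ne 0 ]; then
        echo "not root: skipping kernel buffer changes" >&2
        return 0
    fi
    print_buffer_cmds "$1" | sh
}
```

Calling print_buffer_cmds with a size shows what would be run without changing anything; raise_buffers with the same argument applies it.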

Next, start the AMF:

cd ~/HelloNode1
./etc/init.d/safplus start --load-cluster-model

If you get any "missing file" errors in this step, you did not set up the environment properly.
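A "missing file" error at this point may mean that a shared library could not be located. One way to check, sketched below, is to pass the output of the standard ldd tool through a small filter. The function name report_missing_libs is hypothetical; the filter relies only on ldd marking unresolved libraries with "not found".

```shell
# Hypothetical filter: read ldd output on stdin and print the names of
# libraries that the dynamic loader could not resolve.
report_missing_libs() {
    # ldd prints unresolved entries as "libname.so => not found".
    awk '/=> not found/ { print $1 }'
}
```

For example, after sourcing the setup script, run ldd on the helloComp1 binary and pipe the result into report_missing_libs. Empty output means every library was resolved.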

With the --load-cluster-model option, this command reads the model XML, writes it to the safplusAmf database, and then starts the AMF. The option is required when the first system controller node starts; nodes that start after the first node do not need it. Since the database back-end is a plugin, the results of this step may differ. For example, if SQLite is your chosen database, a file called safplusAmf.db will be created in the current directory. However, by default this example uses the SAFplus checkpointing service as its database. We can see the table by running:

./safplus_name

You should see an entry named "safplus.mgt.safplusAmf.db". The AMF will read this database "file" to access the cluster model.

Note: To clean up all SAFplus shared memory segments (including this checkpoint table), run ./safplus_cleanup.

After a few seconds you will see 2 instances of the program "helloComp1" running:

ps -efwww | grep hello

root       6180  6130  0 21:53 pts/2    00:00:00 ./helloComp1 c0
root       6181  6130  0 21:53 pts/2    00:00:00 ./helloComp1 c1
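Rather than re-running ps by hand, the check can be scripted. The sketch below is an illustration only: the function names are hypothetical, and it assumes that "ps -e -o comm=" prints one bare command name per line, as it does with the usual Linux ps.

```shell
# Hypothetical helper: count lines on stdin that exactly match a command name.
# Feed it the output of "ps -e -o comm=".
count_instances() {
    # grep -c exits non-zero when the count is 0, so mask the status.
    grep -c -x -F -- "$1" || true
}

# Poll until at least $2 instances of $1 are running,
# or give up after roughly $3 seconds.
wait_for_instances() {
    name="$1"; wanted="$2"; timeout="$3"
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        running=$(ps -e -o comm= | count_instances "$name")
        [ "$running" -ge "$wanted" ] && return 0
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1
}
```

For example, "wait_for_instances helloComp1 2 30" returns success once both instances are up, and fails if they have not appeared within about 30 seconds.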

And depending on your logging level, you may see these applications outputting their status onto the console:

Thu Jul  6 15:37:06.874 2023 [main.cxx:376] (HelloNode1.6180.6180 : helloComp14.APP.MAIN:00041 : INFO) csa101: Active.  Hello World!
Thu Jul  6 15:37:06.874 2023 [main.cxx:377] (HelloNode1.6181.6181 : helloComp141.APP.MAIN:00041 : INFO) csa101: Standby.

Go ahead and kill one of these processes to see it restart automatically, and watch the logs to see the active/standby roles fail over.
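To confirm the restart without reading the logs, you can compare process IDs before and after the kill. This is a sketch under stated assumptions: pids_of and new_pids are hypothetical helper names, and the input is expected in the "pid command" form printed by "ps -e -o pid= -o comm=".

```shell
# Hypothetical helper: read "pid command" lines on stdin and print the PIDs
# whose command name equals $1.
pids_of() {
    awk -v name="$1" '$2 == name { print $1 }'
}

# Print PIDs present in the second list but not in the first,
# i.e. processes that were started after the first snapshot.
new_pids() {
    before="$1"; after="$2"
    for pid in $after; do
        case " $(echo $before) " in
            *" $pid "*) ;;          # already running before
            *) echo "$pid" ;;       # newly spawned
        esac
    done
}

# Example session (commented out so that nothing is killed by accident):
#   before=$(ps -e -o pid= -o comm= | pids_of helloComp1)
#   kill <one of the PIDs in $before>
#   sleep 5
#   after=$(ps -e -o pid= -o comm= | pids_of helloComp1)
#   new_pids "$before" "$after"    # prints the PID of the restarted instance
```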

This concludes the basic evaluation. Congratulations, you've built and deployed your first highly available application!

Troubleshooting

Fri Apr  8 14:07:07.388 2016 [nPlusmAmfPolicy.cxx:510] (.0.7671 : AMF.N+M.AUDIT:00996 : INFO) Service Instance [si] should be fully assigned but is [unassigned]. Current active assignments [0], targeting [1]
Fri Apr  8 14:07:07.388 2016 [nPlusmAmfPolicy.cxx:527] (.0.7671 : AMF.N+M.AUDIT:00997 : INFO) Service Instance [si] cannot be assigned 0th active.  No available service units.
Fri Apr  8 14:07:07.388 2016 [customAmfPolicy.cxx:36] (.0.7671 : AMF.POL.CUSTOM:00998 : INFO) Active audit
Fri Apr  8 14:07:07.388 2016 [customAmfPolicy.cxx:47] (.0.7671 : AMF.CUSTOM.AUDIT:00999 : INFO) Auditing service group sg0

If the AMF logs audit messages like those above, the node names in the model do not match the names of the running nodes. Did you forget to set a node name?

If the SAFplus AMF cannot spawn your application, the binary may not exist, may not be executable, or may not be built for this machine's architecture. "cat" the error.#### file to see the specific problem.
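The three causes listed here can also be checked directly against the binary. The sketch below is illustrative only; check_binary is a hypothetical name, and the architecture comparison is left to the reader because it only prints what "uname -m" and the "file" utility (if installed) report.

```shell
# Hypothetical helper: report the most likely reason a binary cannot be spawned.
# The verdict goes to stdout; architecture details go to stderr.
check_binary() {
    bin="$1"
    if [ ! -e "$bin" ]; then
        echo "missing"
    elif [ ! -x "$bin" ]; then
        echo "not executable"
    else
        echo "ok"
        # Architecture must be compared by eye: print the machine type and,
        # when "file" is available, what the binary was built for.
        echo "machine: $(uname -m)" >&2
        if command -v file >/dev/null 2>&1; then
            file "$bin" >&2
        fi
    fi
}
```

Run check_binary with the path to helloComp1 on the target machine. An "ok" verdict with a mismatched architecture line still points to a wrong-platform build.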

SAFplus: Evaluation Guide (last edited 2023-07-06 10:02:51 by HungTa)