Authors:
J Taylor Childers (ANL HEP)
Tom Uram (ANL ALCF)
Balsam is an interface to a batch system's local scheduler. Each scheduler is abstracted so that Balsam remains scheduler independent.
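As a rough illustration of what "scheduler independent" can mean in practice, the sketch below hides each scheduler behind one common method. It is not Balsam's actual code: the class names, the `submit_command` method, and the choice of Slurm's `sbatch` as the example batch system are all assumptions made for illustration. Only the job field names (`executable`, `executable_args`, `nodes`, `wall_minutes`, `scheduler_args`) are taken from the job description format used on this page.

```python
import shlex


class Scheduler(object):
    """Hypothetical common interface; not Balsam's real class."""

    def submit_command(self, job):
        """Return the argv list that would launch `job` on this scheduler."""
        raise NotImplementedError


class LocalScheduler(Scheduler):
    """Assumed 'no batch system' case: run the executable directly."""

    def submit_command(self, job):
        args = shlex.split(job.get("executable_args") or "")
        return [job["executable"]] + args


class SlurmScheduler(Scheduler):
    """Example only: Slurm is used here because its flags are widely known."""

    def submit_command(self, job):
        cmd = ["sbatch",
               "--nodes=%d" % job["nodes"],
               "--time=%d" % job["wall_minutes"]]  # a bare integer means minutes
        # Pass any scheduler-specific extras through untouched.
        cmd += shlex.split(job.get("scheduler_args") or "")
        cmd.append(job["executable"])
        cmd += shlex.split(job.get("executable_args") or "")
        return cmd


def build_command(scheduler, job):
    # Calling code depends only on the Scheduler interface.
    return scheduler.submit_command(job)
```

With this shape, supporting another scheduler means adding one subclass, while the code that calls `build_command` stays unchanged.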
ARGO is a workflow manager. ARGO can submit jobs to any system on which Balsam is running.
Both ARGO and Balsam are implemented in Python as Django apps. Django was chosen because it provides, by default, services that are needed, such as database handling and web interfaces for monitoring job status. Django 1.6 and Python 2.6.6 (built with GCC 4.4.7 20120313 (Red Hat 4.4.7-3)) have been used.
The communication layer is handled using RabbitMQ, a message queue system, and pika, a Python client library for RabbitMQ. Message queues were an easy alternative to writing a custom TCP/IP interface, but they require installing and running a RabbitMQ server. RabbitMQ 3.3.1 is used with Erlang R16B02.
The data transport is handled using GridFTP. This requires installing and running a GridFTP server. Globus version 5.2.0 is used.
Jobs are submitted to ARGO via a RabbitMQ message queue. The messages are serialized as JSON (for example with Python's json module). An example submission is:
'''
{
"preprocess": null,
"preprocess_args": null,
"postprocess": null,
"postprocess_args": null,
"input_url":"gsiftp://www.gridftpserver.com/path/to/input/files",
"output_url":"gsiftp://www.gridftpserver.com/path/to/output/files",
"username": "bob",
"email_address": "[email protected]",
"jobs":[
{
"executable": "zjetgen90_mpi",
"executable_args": "alpout.input.0",
"input_files": ["alpout.input.0","cteq6l1.tbl"],
"nodes": 1,
"num_evts": -1,
"output_files": ["alpout.grid1","alpout.grid2"],
"postprocess": null,
"postprocess_args": null,
"preprocess": null,
"preprocess_args": null,
"processes_per_node": 1,
"scheduler_args": null,
"wall_minutes": 60,
"target_site": "argo_cluster"
},
{
"executable": "alpgenCombo.sh",
"executable_args": "zjetgen90_mpi alpout.input.1 alpout.input.2 32",
"input_files": ["alpout.input.1","alpout.input.2","cteq6l1.tbl","alpout.grid1","alpout.grid2"],
"nodes": 2,
"num_evts": -1,
"output_files": ["alpout.unw","alpout_unw.par","directoryList_before.txt","directoryList_after.txt","alpgen_postsubmit.err","alpgen_postsubmit.out"],
"postprocess": "alpgen_postsubmit.sh",
"postprocess_args": "alpout",
"preprocess": "alpgen_presubmit.sh",
"preprocess_args": null,
"processes_per_node": 32,
"scheduler_args": "--mode=script",
"wall_minutes": 60,
"target_site": "vesta"
}
]
}
'''
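A message like the example above could be assembled and sent from Python roughly as follows. This is a minimal sketch, not ARGO's own client code: the function names, the `host`, `exchange` and `routing_key` defaults are placeholders, since the exchange and queue names ARGO actually listens on are not given on this page. Only the top-level keys of the message are taken from the example.

```python
import json


def build_submission(username, email_address, input_url, output_url, jobs):
    """Assemble a submission with the top-level keys used in the example."""
    return {
        "preprocess": None,
        "preprocess_args": None,
        "postprocess": None,
        "postprocess_args": None,
        "input_url": input_url,
        "output_url": output_url,
        "username": username,
        "email_address": email_address,
        "jobs": jobs,  # list of per-job dicts, as in the example
    }


def serialize(message):
    """JSON-encode the message; Python None becomes JSON null."""
    return json.dumps(message)


def publish(body, host="localhost", exchange="", routing_key="argo_jobs"):
    """Publish the serialized message to RabbitMQ using pika.

    ASSUMPTION: host, exchange and routing_key are placeholders; use the
    values configured for your ARGO installation.
    """
    import pika  # imported here so the helpers above work without pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    try:
        channel = connection.channel()
        # With the default exchange (""), the routing key is the queue name.
        # The queue is assumed to exist already; it is not declared here.
        channel.basic_publish(exchange=exchange,
                              routing_key=routing_key,
                              body=body)
    finally:
        connection.close()
```

Keeping `build_submission` and `serialize` separate from `publish` lets the message be inspected or written to a file before anything is sent to the broker.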
mkdir /path/to/installation/argobalsam
export INST_PATH=/path/to/installation/argobalsam
cd $INST_PATH
virtualenv argobalsam_env
. argobalsam_env/bin/activate
pip install django==1.6.2
(The django 1.6 pin is necessary for Python 2.6; if you have Python 2.7 you can use a higher django version.)
pip install south
pip install pika
django-admin.py startproject argobalsam_deploy
git clone [email protected]:balsam.git argobalsam_deploy
You may have to force git to write to an already existing folder, or check out into a separate directory and move the files into the argobalsam_deploy directory.
Edit the argobalsam_deploy/argobalsam_deploy/settings.py file to import the site-specific settings that are needed (you may need to create a new settings file for your site):
from mira_settings import *