ARGO (A Rapid Generator Omnibus) & Balsam

Authors:
J Taylor Childers (ANL HEP)
Tom Uram (ANL ALCF)

Description & Versions

Balsam is an interface to a batch system's local scheduler. Each scheduler is abstracted so that Balsam itself remains scheduler independent.
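One way to picture this abstraction is a common interface with one subclass per scheduler. The following is a minimal sketch only, not Balsam's actual code: the class names, the `submit_command` method, and the `get_scheduler` registry are all hypothetical, and the Cobalt and Slurm command lines are reduced to node count and wall time.

```python
class Scheduler(object):
    """Hypothetical common interface; not Balsam's real class."""

    def submit_command(self, script, nodes, wall_minutes):
        """Return the argument list that would submit `script`."""
        raise NotImplementedError


class CobaltScheduler(Scheduler):
    def submit_command(self, script, nodes, wall_minutes):
        # Cobalt: qsub -n <nodes> -t <minutes> <script>
        return ["qsub", "-n", str(nodes), "-t", str(wall_minutes), script]


class SlurmScheduler(Scheduler):
    def submit_command(self, script, nodes, wall_minutes):
        # Slurm: sbatch --nodes=<nodes> --time=<minutes> <script>
        return ["sbatch", "--nodes=%d" % nodes,
                "--time=%d" % wall_minutes, script]


# Assumed registry keyed by a per-site setting (names are illustrative).
SCHEDULERS = {"cobalt": CobaltScheduler, "slurm": SlurmScheduler}


def get_scheduler(name):
    """Instantiate the scheduler wrapper registered under `name`."""
    return SCHEDULERS[name]()
```

Code that calls `get_scheduler(name).submit_command(...)` never needs to know which batch system is underneath, which is the property described above.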

ARGO is a workflow manager. ARGO can submit jobs to any system on which Balsam is running.

Both ARGO and Balsam are implemented in Python as Django apps. Django was chosen because it provides needed services by default, such as database handling and web interfaces for monitoring job status. Django 1.6 and Python 2.6.6 (built with GCC 4.4.7 20120313 (Red Hat 4.4.7-3)) have been used.

The communication layer uses RabbitMQ, a message queue system, together with pika, a Python client library for RabbitMQ. Message queues were an easy alternative to writing a custom TCP/IP interface. This requires installing and running a RabbitMQ server. RabbitMQ 3.3.1 is used with Erlang R16B02.
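As a rough illustration of sending a message through RabbitMQ with pika, the sketch below opens a blocking connection and publishes a JSON body. The function names, the default host, and any exchange or routing key passed in are assumptions made for illustration; the real ARGO exchange and queue names are not given here.

```python
import json


def open_channel(host="localhost"):
    """Connect to a RabbitMQ server and return (connection, channel)."""
    # Imported lazily so publish_job can be used without pika installed.
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    return connection, connection.channel()


def publish_job(channel, message, exchange, routing_key):
    """Serialize `message` to JSON and publish it on `channel`.

    `exchange` and `routing_key` are site-specific; the values used
    by a real ARGO installation are not assumed here.
    """
    body = json.dumps(message)
    channel.basic_publish(exchange=exchange, routing_key=routing_key, body=body)
    return body
```

A caller would obtain `connection, channel = open_channel(host)`, call `publish_job(channel, msg, exchange, routing_key)`, and then `connection.close()`.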

The data transport is handled using GridFTP. This requires installing and running a GridFTP server. Globus version 5.2.0 is used.
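For orientation, a GridFTP transfer is typically driven with the `globus-url-copy` command, which takes a source URL and a destination URL. The sketch below only builds the command lines for staging in a job's input files; the helper names and the working directory are hypothetical, and nothing is executed.

```python
def transfer_command(source_url, dest_url):
    """Argument list for one globus-url-copy transfer."""
    return ["globus-url-copy", source_url, dest_url]


def stage_in_commands(input_url, input_files, work_dir):
    """Build one transfer per input file, from input_url into work_dir.

    `work_dir` is assumed to be an absolute local path.
    """
    base = input_url.rstrip("/")
    local = work_dir.rstrip("/")
    commands = []
    for name in input_files:
        source = "%s/%s" % (base, name)
        dest = "file://%s/%s" % (local, name)
        commands.append(transfer_command(source, dest))
    return commands
```

Each returned list could be handed to `subprocess.check_call` on a machine with the Globus tools installed and a valid grid proxy.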

Figure: Diagram of the ARGO - Balsam interactions.

Job Submission

Jobs are submitted to ARGO via a RabbitMQ message queue. Each message is a JSON document, as produced by Python's json module. An example submission is:

example_msg.txt
'''
{
   "preprocess": null,
   "preprocess_args": null,
   "postprocess": null,
   "postprocess_args": null,
   "input_url":"gsiftp://www.gridftpserver.com/path/to/input/files",
   "output_url":"gsiftp://www.gridftpserver.com/path/to/output/files",
   "username": "bob",
   "email_address": "[email protected]",
   "jobs":[
      {
       "executable": "zjetgen90_mpi",
       "executable_args": "alpout.input.0",
       "input_files": ["alpout.input.0","cteq6l1.tbl"],
       "nodes": 1,
       "num_evts": -1,
       "output_files": ["alpout.grid1","alpout.grid2"],
       "postprocess": null,
       "postprocess_args": null,
       "preprocess": null,
       "preprocess_args": null,
       "processes_per_node": 1,
       "scheduler_args": null,
       "wall_minutes": 60,
       "target_site": "argo_cluster"
       },
 
      {
       "executable": "alpgenCombo.sh",
       "executable_args": "zjetgen90_mpi alpout.input.1 alpout.input.2 32",
       "input_files": ["alpout.input.1","alpout.input.2","cteq6l1.tbl","alpout.grid1","alpout.grid2"],
       "nodes": 2,
       "num_evts": -1,
       "output_files": ["alpout.unw","alpout_unw.par","directoryList_before.txt","directoryList_after.txt","alpgen_postsubmit.err","alpgen_postsubmit.out"],
       "postprocess": "alpgen_postsubmit.sh",
       "postprocess_args": "alpout",
       "preprocess": "alpgen_presubmit.sh",
       "preprocess_args": null,
       "processes_per_node": 32,
       "scheduler_args": "--mode=script",
       "wall_minutes": 60,
       "target_site": "vesta"
      }
   ]
}
'''
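Before sending a message like the one above, it can help to check it locally. The sketch below parses a submission, reports missing fields, and totals the requested processes per target site. Which keys count as required is an assumption made for this example, not a rule taken from ARGO, and the function names are hypothetical.

```python
import json

# Assumed-required keys, chosen for illustration only.
REQUIRED_TOP = ("input_url", "output_url", "username", "email_address", "jobs")
REQUIRED_JOB = ("executable", "nodes", "processes_per_node",
                "wall_minutes", "target_site")


def check_submission(text):
    """Parse a submission and return a list of problems (empty if none)."""
    msg = json.loads(text)
    problems = ["missing top-level key: %s" % key
                for key in REQUIRED_TOP if key not in msg]
    for index, job in enumerate(msg.get("jobs") or []):
        for key in REQUIRED_JOB:
            if job.get(key) is None:
                problems.append("job %d: missing %s" % (index, key))
    return problems


def processes_per_site(text):
    """Sum nodes * processes_per_node for each target_site."""
    totals = {}
    for job in json.loads(text)["jobs"]:
        site = job["target_site"]
        count = job["nodes"] * job["processes_per_node"]
        totals[site] = totals.get(site, 0) + count
    return totals
```

For the example message, `processes_per_site` would count 1 process on `argo_cluster` (1 node x 1) and 64 on `vesta` (2 nodes x 32).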

Installation

  1. On Mira: soft add +python (to get python 2.7)
  2. Install virtualenv
  3. Create install directory: mkdir /path/to/installation/argobalsam
  4. export INST_PATH=/path/to/installation/argobalsam
  5. cd $INST_PATH
  6. Create virtual environment: virtualenv argobalsam_env
    1. On Edison:
      1. module load virtualenv
      2. module load python/2.7
  7. Activate virtual environment: . argobalsam_env/bin/activate
    1. On Edison:
      1. To use pip you need a certificate so run mk_pip_cabundle.sh, then include --cert ~/.pip/cabundle in all your pip commands.
  8. Install needed software:
    1. pip install django
      1. If your Python is older than 2.7, pin the Django version instead: pip install django==1.6.2
    2. pip install south
    3. pip install pika
    4. pip install MySql (only if needed)
      1. On SLC6 this first required: yum install mysql mysql-devel mysql-server
  9. Create django project: django-admin.py startproject argobalsam
  10. cd argobalsam
  11. git clone [email protected]:balsam.git argobalsam_git
  12. mv argobalsam_git/* ./
  13. rm -rf argobalsam_git/
  14. Update the following lines of argobalsam/settings.py:
    1. At the Top:
      from site_settings.mira_settings import *
    2. INSTALLED_APPS = (
          'django.contrib.admin',
          'django.contrib.auth',
          'django.contrib.contenttypes',
          'django.contrib.sessions',
          'django.contrib.messages',
          'django.contrib.staticfiles',
          'south',
          'balsam_core',
          'argo_core',
      )
    3. If you are using MySQL:
      DATABASES = {
          'default': {
              'ENGINE': 'django.db.backends.mysql',
              'NAME': 'your_database_name_goes_here',
              'USER': 'your_login',
              'PASSWORD': 'your_password',
              'HOST': '127.0.0.1',
              'PORT': '',
              'CONN_MAX_AGE': 2000000,
          }
      }
  15. Set up your database:
    1. For older Django using South (pre-1.7):
      1. python manage.py syncdb
      2. Create the first balsam_core migration:
         python manage.py schemamigration balsam_core --initial
      3. Apply the first balsam_core migration:
         python manage.py migrate balsam_core --fake
      4. Create the first argo_core migration:
         python manage.py schemamigration argo_core --initial
      5. Apply the first argo_core migration:
         python manage.py migrate argo_core --fake
    2. For newer Django (1.7,1.8):
      1.  python manage.py syncdb 
    3. For Django (1.9+):
      1.  python manage.py migrate --fake-initial 

Git Tag Notes

Git Browser

5.0

4.1

4.0

3.2

3.1

3.0