====== ArCond tutorial ======
[[http://atlaswww.hep.anl.gov/asc/arcond/|ArCond (ARgonne CONDor)]] is a front-end to Condor developed at ANL.
The main advantage of ArCond is the possibility to run multiple jobs in parallel using data stored locally on each computing node, rather than on a single file storage on an NFS server (the latter is also possible, but not advisable). This leads to a significant performance improvement and no network load during data processing. Using this approach, one can build a low-cost scalable cluster to process terabytes of data.
Currently, the program can be used to run over AODs and D3PDs, and to generate MC and NLO predictions.
ArCond (ARgonne CONDor) is a package for:
* **Users**
* facilitating running multiple jobs in parallel on a T3g PC farm. The package was designed for Tier-3 type clusters consisting of a number of independent computing nodes (PCs) with local file storage on each node and common NFS-mounted user (or shared) directories. No additional services are necessary
* retrieving and merging outputs
* listing all files stored on the entire PC farm
* **Administrators:**
* copying data to multiple nodes of the PC farm in parallel using the dq2-get tool;
* uniformly distributing data over multiple nodes from a central file storage;
* administering multiple nodes via ssh tunneling.
In addition, any custom command sequence defined in a submission script can be executed in parallel. For example, one can run multiple athena option files, perform user-specific tests, merge files, etc. directly on the condor Linux boxes.
ArCond has been used at the ANL ASC/Tier3 since 2009. Currently, about 300k files (AOD/ESD/D3PD) are managed by ArCond, with more than 5000 job submissions.
One advantage of ArCond is that it is simple and robust, and well suited for debugging.
It works very simply: each node has a file with the list of its data files (refreshed by a cron job). All such "database" files are visible to all nodes (i.e. they are located on NFS or AFS). ArCond reads these files, determines how to split the work, and submits the jobs using Condor.
====== Installation ======
===== Package installation =====
Download the latest version from: [[http://atlaswww.hep.anl.gov/asc/arcond/download/]]
Assume you have 3 PC boxes. Check that Condor is running (type "condor_status"). If you see all your CPUs, install ArCond.
Get the package from the ArCond download page:
wget http://atlaswww.hep.anl.gov/asc/arcond/arcond-[VERSION].tar.gz
where [VERSION] is a version number. Untar the package.
tar -zvxf arcond-*.tar.gz
cd arcond-*
At this stage, decide on the location of the database files which keep information about the files located on the servers.
This is explained in Sect. [[asc:asc_arcond#file_database | Database]]. Usually an NFS location such as "/users/condor/ArcondDB" is used,
since this directory should be visible to all interactive nodes.
Go to "arcondtools/initialize.py" and correct 2 lines: the future location of ArcondDB and the XROOTD redirector name (only needed if you use XROOTD; otherwise this is not necessary). For ArCond without XROOTD, we use: ArcondDB=/users/condor/ArcondDB.
Then install the package:
python setup.py install --prefix=[install dir]
For example, if you want to put Arcond under a NFS 'share' directory, do it as:
mkdir /share/Arcond
python setup.py install --prefix=/share/Arcond
Now you should set up ArCond. Source the arcond setup script according to your installation.
source [install dir]/etc/arcond/arcond_setup.sh
e.g., if Arcond is installed under /share/Arcond, set it as:
source /share/Arcond/etc/arcond/arcond_setup.sh
Now check that the environment variable "ARCOND_SYS" is set correctly (it should point to your ArCond installation). Then you can create a submission directory for testing:
mkdir test; cd test; arc_setup
After executing =arc_setup=, you will see the directory structure and files:
arcond.conf, DataCollector, Job, patterns, user
Check out the ArCond help script:
arc_help
It prints the available commands:
--- ArCond help ---
arc_add >> Merge all output ROOT files located in Job/*/*
arc_check >> Check outputs
arc_clean >> Clear all submissions from previous runs
arc_cp >> Copy and rename all output files located in Job/*/*
arc_exe >> Run a shell script. Usage: arc_exe -i script.sh
arc_ls >> lists all files in a dataset. Usage: arc_ls . Examples: arc_ls /data1
arc_mv >> Move and rename all output files located in Job/*/*
arcond >> Main submission script for a T3g PC farm
arc_setup >> Setup script. Initialize ArCond directory structure
arc_split >> Split dataset for multiple nodes (admin tool)
arc_ssh >> Parallel ssh to the set of nodes (admin tool)
arcsync_data >> Divide data sample and copy to the hosts
arc_update >> ArCond update script
Finally, check the directory "patterns". It should contain some default configuration scripts for condor, one per server. For example, if you want to run jobs on a server "atlas50.hep.anl.gov" (or any other), the file "schema.site.atlas50.cmd" should look like this:
# set universe to run job in local condor queue
universe = vanilla
#IO_COMMANDS
WhenToTransferOutput = ON_EXIT_OR_EVICT
executable = SCRIPT
output = job.local.out
error = job.local.err
log = job.local.log
environment = RealID=USERNAME;
requirements = ( machine == "atlas50.hep.anl.gov" )
notification = never
It is important to have the correct name of the computer, in this case "atlas50.hep.anl.gov". If you have a second machine, such as "atlas51.hep.anl.gov", make a second file "schema.site.atlas51.cmd" which looks exactly the same, but with "atlas50.hep.anl.gov" replaced by "atlas51.hep.anl.gov".
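For example, the second pattern file can be produced from the first one with a simple text substitution (a sketch; it assumes the pattern files live in the "patterns" directory created by =arc_setup=):
# clone the condor schema for a second node by replacing the host name
cd patterns
sed 's/atlas50.hep.anl.gov/atlas51.hep.anl.gov/' schema.site.atlas50.cmd > schema.site.atlas51.cmd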
The next step is to make a database which keeps track of all files on all servers with data.
===== File database =====
The task of this step is to make a script which scans all disks where the data are located and writes a file with the list of such files.
This simple "database" will be read by arcond during the submission.
Create a directory on NFS such as "/users/condor/ArcondDB" (this is the directory given during the installation step).
Make sure you can write to this directory (if the script below runs as root, allow root to write files in this directory).
The script can look like this:
###############################################
#!/bin/bash
# Build DB for ArCond run
# S.Chekanov
HOSTD=`hostname`
DbStore='/users/condor/ArcondDB'
TT=`date "+%Y-%m-%d"`
DbFile=/tmp/${HOSTD}_${TT}
echo "HOSTNAME is: " `hostname`
echo "File discovery starts at: " `date`
rm -f $DbFile
rm -f $DbFile.gz
find /data1/ -size +1k -type f > $DbFile
find /data2/ -size +1k -type f >> $DbFile
find /data3/ -size +1k -type f >> $DbFile
find /data4/ -size +1k -type f >> $DbFile
find /data5/ -size +1k -type f >> $DbFile
# sleep random number (<1 min). Avoid NFS load if there are many servers with data
value=$RANDOM
t=`echo "scale=0; $value / 1000" | bc`
echo "ArCond= Sleep random number $t"
sleep $t
RESULTS_SIZE=`stat -c %s $DbFile`
if [ "$RESULTS_SIZE" -lt 100 ]
then
echo "Result file is too small. Probably no disk" exit 1;
fi
# gzip file
gzip $DbFile
cp $DbFile.gz $DbStore/
# ##### REMOVE FILES WITH THE SAME SIZES.
tmp_1=/tmp/tmp.${RANDOM}$$
# signal trapping and tmp file removal
trap 'rm -f $tmp_1 >/dev/null 2>&1' 0
trap "exit 2" 1 2 3 15
# main
cd $DbStore
for a in ${HOSTD}*; do
f_size=$(set -- $(ls -l -- "$a"); echo $5)
find . -maxdepth 1 -type f ! -name "$a" -size ${f_size}c > $tmp_1
[ -s $tmp_1 ] && { echo SAME SIZE; rm -f $a; cat $tmp_1; }
done
# do it again to get the most recent
cp $DbFile.gz $DbStore/
echo "File discovery ends at: " `date`
exit 0
################### END #####################################
The script assumes that the data are located on the disks /data1, /data2, etc., but you can use any location.
Try to run this script on a machine where the data are. It creates a file similar to:
HOSTNAME_DATE.gz
Open this file in "vim" to see its structure (vim can unzip the file on the fly).
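If you prefer the command line, the same file can be inspected without vim (the file name below is just an example of the HOSTNAME_DATE.gz naming pattern):
# show the first indexed paths and count how many files were found
zcat atlas50.hep.anl.gov_2011-03-09.gz | head
zcat atlas50.hep.anl.gov_2011-03-09.gz | wc -l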
You should run this script on every server where the data are. This can be done using a cron job. Make a cron file "arc_db.sh":
# run it as: crontab foo.cron
# check it as: crontab -l
# remove it as: crontab -r
#
# run once every 3 hours
0 */3 * * * /users/condor/ArcondDB/buildDB.sh > /dev/null 2>&1
It executes this bash script every 3 hours and rebuilds the database. This is usually enough for many sites. According to benchmarks, 300000 files in all directories can be scanned in about 10 seconds.
Remember that during the previous step we set the directory with the database in the initialization file:
/basic/initialize.py
(now it is set to /users/condor/ArcondDB)
====== Working with ArCond ======
Start from a new shell window. If you have already set up an ATLAS release, start from a clean shell so that none of the release environment variables are set.
Run the setup script as explained above. Say you will use a directory called 'test'. Set up ArCond from it as:
mkdir test; cd test; arc_setup
* After executing =arc_setup=, you will see the directory structure and files:
arcond.conf, DataCollector, Job, patterns, user
Check out the ArCond help script; it prints the available commands:
arc_help
Try to run a custom shell script on all Linux boxes in parallel. An example bash script "example.sh" is located inside the Arcond folder. Execute the statement:
arc_exe -i example.sh
You should wait for 1-2 minutes. ArCond sends this script to all Linux boxes in parallel and prints the output with the server responses. In the future, this will be the preferred way to upload data to each Linux box, since the ssh port will not be available on the PC farm.
You can put any shell statement in the input file "example.sh". More generally, one can run any custom script as:
arc_exe -i script.sh
If it fails, run the command =arc_exe= with the additional option "-v".
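As an illustration, the script passed to =arc_exe= can contain any shell commands; a minimal (hypothetical) example which reports the state of each box:
#!/bin/bash
# print basic information from every Linux box in the farm
hostname
df -h /data1
ls /data1/run | head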
PCs which are included in the submission are defined in files located in the =patterns= directory. For each PC box there should be one file: for example, the file which includes "atlas51.hep.anl.gov" in the PC farm is called "schema.site.atlas51.cmd".
====== Checking available data ======
It is a good idea to check the availability of your data in the ArCond static database. For example, you can list all files on all boxes as:
arc_ls DATASET
Example:
arc_ls mc08.106379.PythiaPhotonJet_AsymJetFilter.recon.AOD.e347_s462_r541
You can check all data available on a specific disk ("/data1") as
arc_ls /data1
or data which are in the subdirectory "/data1/MonteCarlo":
arc_ls /data1/MonteCarlo/
If you want to list only directories, use this syntax:
arc_ls -d /data1/
This is especially useful if you want to check which runs are available.
To generate a summary of all files on all nodes, use this:
arc_ls -s /data1/
One can also use pattern matching, similar to the Linux "grep". For example, to show only AOD files, use:
arc_ls -e "AOD"
====== Running athena packages======
ArCond allows one to run any athena program in parallel even when the data are stored locally on many Linux boxes. For example, a data sample can be distributed over the local disks of several PCs as:
On atlas51: /data1/run/fileAOD1.root /data1/run/fileAOD2.root
On atlas52: /data1/run/fileAOD3.root /data1/run/fileAOD4.root
On atlas53: /data1/run/fileAOD6.root /data1/run/fileAOD7.root
You do not need to know which files are located on which box, since the data discovery is done automatically. The only important thing to remember is that the data should be stored under the same path, for example "/data1/run", on all PC boxes. If some files are duplicated on different boxes, ArCond will skip the duplicates. If data sets are small, they can be located on only one or two Linux boxes; ArCond will skip PC nodes which do not have any data.
The available data sets loaded to the PC farm are discussed on the web page: [[http://atlaswww.hep.anl.gov/twiki/bin/view/Workbook/DataPCF|Data sets at ASC]]
You can find out which data are located on which box by running the following command on each node:
ls -l /data1/run
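One way to run this command on all nodes at once is via =arc_exe= (a sketch; the helper script name is arbitrary):
# write the command into a small script and send it to all boxes in parallel
echo "ls -l /data1/run" > check_data.sh
arc_exe -i check_data.sh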
====== Submitting a custom program ======
The example below assumes that we want to run an athena user package, for instance one located in "14.5.1/PhysicsAnalysis/AnalysisCommon/UserAnalysis". Submitting the "UserAnalysis" program to the PC farm is rather easy: just run the command =arcond= from the directory with the ArCond installation and answer 4 questions (the answer is usually "y").
The program reads a configuration file "arcond.conf" where the path to your package is defined. Open it and check what is inside. The file specifies the input data location, the release number, and how many events to process on each core. If the input data are stored not on local disks but on a common file server (a bad idea for parallel processing!), you can set "storage_type=central". It is advisable to use a locally installed ATLAS release (such as 14.2.21 or 14.5.1), since these releases are installed locally on the PC farm. Other releases also work, but they are accessed via NFS and this significantly degrades performance. By default all available cores are used, but one can limit the number of execution cores if the program needs more memory.
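For orientation, the data-related part of "arcond.conf" contains entries like the sketch below. Only "data_input", "package_dir" and "storage_type" are keys named on this page; the values are illustrative, and the file shipped with ArCond also contains settings for the release and the number of events per core:
# where the input data sit on each node (the same path on all boxes)
data_input = /data1/run
# the directory with the package to be tarred and shipped to the nodes (example path)
package_dir = /users/you/testarea/14.5.1/PhysicsAnalysis/AnalysisCommon/UserAnalysis
# uncomment only if the data are on a common file server (bad for parallel processing)
# storage_type = central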
arcond
It will read your input configuration file ("arcond.conf") and will ask you:
1) Rebuild the database with the input files (f/s/n)? Say "s"
When you say "s", the data discovery will be done rather quickly using a static database updated every 24h. This means that if someone pulls out data from the grid and put to the PC farm, you may not see necessary files on the farm until the next day.
If you know that data have been recently copied from the grid, you may choose to say "f". The data data discovery may take more time, since ArCond will send small pilot jobs using the condor to discover the data. This method has some disadvantage - if the PC farm is busy, it takes very long time to find your data.You can use the option "f" if the data are located on a central file storage.
Both discovery methods will build a flat-file databases in the directory =DataCollector/= with a list of input files for each PC box. If you will say "n", ArCond will use the data files created during a previous run:
2) Do you want to rebuild "GammaJetExample.tgz"? Say "n"
If you will say "n", ArCond will send the existing "GammaJetExample.tgz" project file from the directory "Job" (if it does exist, it is very likely you have already submitted this program before). As you may guess, GammaJetExample.tgz is just a tar file of the directory which has been specified in the input configuration file "arcond.conf". Of couse, the name can be different, this depend on the package name you have put to the"arcond.conf" file.
If you will say "y", it will rebuild this tar file using the directory specified in "arcond.conf" file. Then you will be asked:
3) Do you want to prepare the submission scripts (y/n)? Say "y"
This will create submission directories inside "Job/" directory.
4) Submit all collections to the condor? (y/n)? Say "y"
("y" submits the condor jobs). If you say no, arcond will exit without submitting the jobs and you can check scripts which will be sent to the farm in "Jobs" directory. Pay attenstion to "SubmitScrip.sh" - this script will be executed on each node.
Then ArCond will send all jobs to the PC farm.
====== Checking submission status ======
Check the submission status with "condor_q" or "condor_status". If you see that your jobs are in the idle state (status "I"), check why the PC farm is busy using:
condor_q -global
or
condor_status -submitters
These two commands will tell you who is currently running on the PC farm. Try also:
arc_check
which checks the existence of the output files in each directory inside the "Job" directory.
* When the jobs are done, the script "arc_check" will report that the output is ready. The output files (Analysis.root) from each core will be located in the directories "Job/run<ID>"
* Merge all output files into one ROOT file as:
arc_add
* Finally, clean unnecessary submission scripts as:
arc_clean
Note: you should always wait until the condor jobs have finished (check this with "condor_q"; there should be no jobs left). If some jobs are still running, you should not remove the "Job/run<ID>" directories using the command =arc_clean=. You will also not be able to send new ArCond jobs. Always make sure that all condor jobs have finished; if not, remove them by hand using the command =condor_rm ID=.
ArCond keeps track of user submissions in a history file; please check the file $HOME/.arcond_history. This file records which data were accessed, the release version, the package, and the time of each job submission. An administrator can collect such files and check which data should be kept and which should be removed.
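For example, to look at your most recent submissions:
tail $HOME/.arcond_history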
====== Checking the outputs ======
If your jobs are successful, you will see the output ROOT files in the job directories. The task of the script "arc_add" is simply to collect all such files and merge them into "Analysis_all.root" in the current directory. You can find the individual
ROOT files in the directories:
Job/runNUMBER_HOSTNAME
where NUMBER is a number from 0-N(hosts), and HOSTNAME is the name of the host.
They have been copied there because the last line of the submission script "user/ShellScript_BASIC.sh" contains the command "cp *.root DIR". If you do not see the ROOT files, check your log file "Analysis.log.gz".
You can also check what has happened on the host by looking at:
Job/runNUMBER_HOSTNAME/Job.ShellScript.HOSTNAME
See the files "job.local.out" and "job.local.err".
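For example (the run number and host name are illustrative, and the exact layout may differ between sites):
# the program log copied back by the submission script
zcat Job/run0_atlas50.hep.anl.gov/Analysis.log.gz | tail -n 50
# the condor stdout/stderr of that job
find Job/run0_atlas50.hep.anl.gov -name 'job.local.*'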
===== Canceling jobs =====
You can cancel jobs while they are running as:
condor_rm ID
or all jobs as:
condor_rm username
where ID is the job number. As shown above, you can remove all your jobs with the same command by replacing the job ID with your user name.
====== Compile or not compile ======
If you have tested your code on the interactive nodes, you do not need to compile it again on the slave nodes with "cmt config; source setup.sh; make".
You can remove these lines from the submission script and instead put lines which copy your InstallArea and Package directories (this is only an example):
cd $TESTAREA_CONDOR/$ATLAS_RELEASE
cp -r /home/workarea/15.6.1/InstallArea .
cp -r /home/workarea/15.6.1/Package .
Note: compilation can take some time (15-30 min) if the ATLAS release is installed on NFS/AFS rather than on a local disk. This is because the setup scripts hit NFS at the same time once the condor jobs start. To check your modifications of "InstallArea", first run the program in debug mode (see the next section).
====== Debugging your code ======
ArCond is written with the idea that a user should be able to debug and control jobs at every stage of the submission. One can:
* Debug the submission process
* Debug the correctness of the submitted shell script
* Debug problems associated with a particular computer node.
First, run "arcond" and at the very last stage (when it asks whether to submit the jobs) say "n". Then go into any "Job/run*" directory and check its structure. If you need to simulate what happens on a single core, go to the directory "run0_HOSTNAME" and run "ShellScript.sh". This script does everything that is supposed to happen on a dedicated core. If it fails, all your jobs will fail. Correct the script "ShellScript.sh" and make the corresponding changes in "user/ShellScript_BASIC.sh".
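For example (the host name is illustrative):
cd Job/run0_atlas50.hep.anl.gov
./ShellScript.sh   # runs exactly what condor would execute on this core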
You can also debug a particular computing node: ssh to the faulty node and go to the directory "/home/condor/exec/ID/testarea/.../condor/". There you can run your code exactly as on an interactive node.
====== Run C++/ROOT over D3PD ======
Although ArCond was created with athena-type jobs in mind, one can run any user program. In this section we explain how to run a simple C++ program over many files on a PC farm using ArCond. Download and unpack the example:
wget http://atlaswww.hep.anl.gov/asc/arcond/tutotials/example.tgz
tar -zvxf example.tgz
cd example/
You will see 2 directories: "package" with your C++/ROOT code (which can be executed locally) and "arc" (the ArCond submission directory from which you can submit this C++ package to the PC farm).
Go to "package" and check what is inside. This is a simple skeleton C++ program to run over ROOT files in a chain.
You can compile "main.cxx" code typing "make" (ROOT must be setup first!). This will create an executable "main". It will read "data.in" with input ROOT files (each file name should be on a separate line) and "main.ini" where you can specify how many events to process. The source code "main.cxx" opens and closes input ROOT files and generate "Analysis.root" with output histograms.
Now we will submit this package to the PC farm with distributed input files. Go to the directory "arc", correct the computer names in the "patterns" directory, and specify the location of the files (the variable "data_input" in "arcond.conf"). Now you are ready to submit!
Type "arcond" from the same directory where "arcond.conf" is located. Say "y" to all questions. When it asks about "a static database", say "s".
You can review what exactly is executed on the farm: look at "user/ShellScript_BASIC.sh" and scroll down to the end of the file.
############## user-defined part ##################
# setup ROOT or anything you want
source /share/grid/app/asc_app/asc_rel/1.0/setup-script/set_asc.sh
# compile and run your package (inside the "package" directory)
echo "ArCond: Compiling the package and run it:"
make -f Makefile;
./main > Analysis.log 2>&1
echo 'Copy all files to the submission directory:'
ls -alrt
cp *.log $JobDIR/
cp *.root $JobDIR/
###################################################
For a different site, you will need to adjust the ROOT setup line.
After the jobs have finished, check the output with "arc_check". You can merge all the output files (located inside the "Job" directory)
with "arc_add". You can move and rename the files with "arc_mv".
====== Running ArCond jobs for ANL ======
There are several examples in the directory:
/users/chakanau/public/Arcond
showing how to run C++ without input data (example_nodata), how to run Pythia8 on many cores (example_pythia8), and a simple example running over D3PDs (example). There are also examples of how to run athena jobs (which is the main goal of the ArCond package).
Look at the detailed instruction [[asc:workbook_cpfsubmitting_data_to_the_farm]].
====== Running jobs without input data ======
You can run jobs without input data by specifying "data_input=" (i.e. empty input) in the arcond.conf file.
You still need to specify "package_dir", i.e. the directory with the package to run.
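In "arcond.conf" this looks roughly like the sketch below (the package path is just an example):
# no input data: each core simply runs the package
data_input=
package_dir = /users/you/example_nodata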
====== Administration tools ======
ArCond also includes a set of tools for administering a PC farm (see the "Administrators" list at the top of this page). To set them up, run:
arc_setup_admin
It will create a set of files: arc_command, arc_hosts, arc_sync, hosts.py
Look at the Python file "hosts.py" and add the hosts you want to manage. The task of this Python module is to return the list of cluster hosts via the function getHosts().
Using this function, one can do several things:
===== Checking hosts of the PCF =====
To check whether all the hosts are alive, execute "arc_hosts". This script pings all hosts in multiple threads and prints the hosts which are alive.
===== Running an arbitrary shell command on the PCF =====
To execute an arbitrary command on all hosts, open the file "arc_command" and modify the command to be executed. Then run this script.
===== Distributing data between multiple hosts =====
This helps to copy data files located on a central storage to all computer-farm nodes. The number of nodes and their names are given by the external module "hosts.py" described above.
The data redistribution is achieved using several Linux commands which are wrapped inside the Python script "arc_sync" for easy use. Look at the file "arc_sync" and read the instructions. This Python module reads the hosts from "hosts.py", makes a list of all files in the central storage, and creates lists of files depending on the number of computer nodes. All such lists are stored in external files which are created on the fly. Then the data are synced via ssh tunneling (see the options given in arc_sync).
If your data have changed on the central storage, you can rerun "arc_sync" and this will refresh the files on the multiple hosts.
Run arc_sync in debug mode first (open arc_sync and set the variable "Debug=True"). You will see what this script is doing. You will also see several files ("list_HOSTNAME.txt") with the lists of files which will be moved to the slave nodes. Then do the actual copy by changing to "Debug=False".
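A typical workflow could look like this (a sketch; it assumes arc_sync is executed from the admin directory created by arc_setup_admin):
# 1) dry run: with Debug=True inside arc_sync, only the file lists are produced
./arc_sync
ls list_*.txt   # one list_HOSTNAME.txt per target node
# 2) real copy: set Debug=False inside arc_sync and run it again
./arc_sync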
--- //[[chekanov@gmail.com|Sergei Chekanov]] 2011/03/09 17:44//