testRun submitDir   # runs over all files inside inputdata.txt
</code>
====== Lesson 5: Running on multiple cores ======

This example is not needed for ATLAS connect. If you still want to know how to run an ATLAS analysis job on several cores of your desktop, look at [[asc:tutorials:2014october#lesson_5running_a_job_on_multiple_cores]].
  
====== Lesson 6: Using HTCondor and Tier2 ======
  
-This takes some time to compile. Next we will us simple example code that runs over multiple files located in some directory+Lesson 5: Working on Tier3 farm (Condor queue)
  
-<code bash> +In this example we will use HTCondor workload management system to send the job to be executed in a queue at a Tier3 farm.  For this example we will start from the directory lesson 4, so if you did not do the lesson 4 please do that one first and verify that your code runs locally. 
-cp -r /users/chakanau/public/2014_tutorial_october_anl/lesson_5/ANLanalysis_threads/MyAnalysis/ ANLanalysis_threads/MyAnalysis/ +Start from the new shell and set up environment, then create this shell script that will be executed at the beginning of each job at each farm node:
-rc find_packages    # find this package +
-rc compile          # compiles it +
-cd MyAnalysis/util  # go to the analysis code +
-</code>+
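The login-node setup is not spelled out on this page; a minimal sketch, mirroring the setup that startJob.sh below performs on the worker node, could look like this (the lesson_4 path is just a placeholder for wherever your Lesson 4 work area lives):

<code bash>
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh   # general ATLAS software setup
localSetupFAX                                             # FAX tools for remote data access
cd lesson_4                                               # placeholder: your Lesson 4 work area
</code>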
  
Then create this shell script, which will be executed at the beginning of each job on every farm node:

<file bash startJob.sh>
#!/bin/zsh
# set up the ATLAS environment, FAX and the grid proxy on the worker node
export RUCIO_ACCOUNT=YOUR_CERN_USERNAME
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
localSetupFAX
source $AtlasSetup/scripts/asetup.sh 19.0.2.1,noTest
export X509_USER_PROXY=x509up_u21183   # the proxy file shipped with the job (see below)

# unpack the payload and rebuild the RootCore packages
unzip payload.zip
ls
rcSetup -u; rcSetup Base,2.0.12
rc find_packages
rc compile
cd MyAnalysis/util
rm -rf submitDir                       # testRun refuses to run if submitDir already exists

# keep only a 3-line slice of inputdata.txt, starting at line $1 (the Condor process number)
echo $1
sed -n $1,$(($1+2))p inputdata.txt > Inp_$1.txt
cp Inp_$1.txt inputdata.txt
cat inputdata.txt
echo "startdate $(date)"
testRun submitDir
echo "enddate $(date)"
</file>
  
Make sure the RUCIO_ACCOUNT variable is set to your actual CERN user name. Make this script executable (the chmod command is shown below, just before the submission) and create the submit description file that we will give to Condor:
  
<file bash job.sub>
# submit description file: run startJob.sh 10 times, passing the process number and the number of jobs
Jobs=10
getenv         = False
executable     = startJob.sh
output         = MyAnal_$(Jobs).$(Process).out
error          = MyAnal_$(Jobs).$(Process).error
log            = MyAnal_$(Jobs).$(Process).log
arguments = $(Process) $(Jobs)
environment = "IFlist=$(IFlist)"
transfer_input_files = payload.zip,/tmp/x509up_u21183,MyAnalysis/util/inputdata.txt
universe       = vanilla
#Requirements   = HAS_CVMFS =?= True
queue $(Jobs)
</file>
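To make the splitting concrete: job number 3, for example, receives the arguments "3 10" and effectively does the following inside startJob.sh (the second argument is passed but not used by the script):

<code bash>
# what job number 3 does with the input list (illustration only)
sed -n 3,5p inputdata.txt > Inp_3.txt   # keep lines 3-5 of the full file list
cp Inp_3.txt inputdata.txt              # the job then runs only over these files
</code>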
  
To access files via FAX the jobs need a valid grid proxy, and this is why we ship the proxy with each job. The proxy is the file whose name starts with "x509up", so in both job.sub and startJob.sh you should replace "x509up_u21183" with the name of your own grid proxy file. You can find this name in the environment variable $X509_USER_PROXY.
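If you do not have a proxy yet, you can create one and check where it is stored (the exact file name, x509up_u<uid>, depends on your numeric user id):

<code bash>
voms-proxy-init -voms atlas   # create a grid proxy (written to /tmp/x509up_u<uid> by default)
voms-proxy-info               # shows the proxy path, type and remaining lifetime
echo $X509_USER_PROXY         # set by some setups; otherwise use the path printed above
</code>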
  
You need to pack all of the working directory into a payload.zip file:
  
<code bash>
rc clean              # clean the RootCore build
rm -rf RootCoreBin    # remove the local build area; it is rebuilt on the worker node
zip -r payload.zip *  # pack the working directory (including startJob.sh) into payload.zip
</code>
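You can quickly check that the archive contains what you expect before submitting:

<code bash>
unzip -l payload.zip | head   # list the first entries of the archive
ls -lh payload.zip            # check its size
</code>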
  
Before submitting, make sure that startJob.sh is executable:
  
<code>
chmod 755 ./startJob.sh
</code>
  
Now you can submit your task for execution and follow its status in this way:

<code bash>
~> condor_submit job.sub
Submitting job(s)..........
10 job(s) submitted to cluster 49677.

~> condor_q ivukotic
-- Submitter: login.atlas.ci-connect.net : <192.170.227.199:60111> : login.atlas.ci-connect.net
 ID       OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
49677.0   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 0
49677.1   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 1
49677.2   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 2
49677.3   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 3
49677.4   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 4
49677.5   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 5
49677.6   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 6
49677.7   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 7
49677.8   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 8
49677.9   ivukotic       10/9  10:21   0+00:00:10 R  0   0.0  startJob.sh 9

10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
</code>
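A few more HTCondor commands are useful while the jobs run; replace the cluster id 49677 and the user name with your own:

<code bash>
condor_q 49677            # show only the jobs of this cluster
condor_rm 49677           # remove the whole cluster if something went wrong
condor_history ivukotic   # list jobs of this user that already left the queue
</code>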