testRun submitDir   # runs over all files inside inputdata.txt
</code>
====== Lesson 5: Running on multiple cores ======

This example is not needed for ATLAS connect. If you still want to know how to run an ATLAS analysis job on several cores of your desktop, look at [[asc:tutorials:2014october#lesson_5running_a_job_on_multiple_cores]].
  
====== Lesson 6: Using HTCondor and Tier2 ======
  
-This takes some time to compile. Next we will us simple example code that runs over multiple files located in some directory+Lesson 5: Working on Tier3 farm (Condor queue)
  
-<code bash> +In this example we will use HTCondor workload management system to send the job to be executed in a queue at a Tier3 farm.  For this example we will start from the directory lesson 4, so if you did not do the lesson 4 please do that one first and verify that your code runs locally. 
-cp -r /users/chakanau/public/2014_tutorial_october_anl/lesson_5/ANLanalysis_threads/MyAnalysis/ ANLanalysis_threads/MyAnalysis/ +Start from the new shell and set up environment, then create this shell script that will be executed at the beginning of each job at each farm node:
-rc find_packages    # find this package +
-rc compile          # compiles it +
-cd MyAnalysis/util  # go to the analysis code +
-</code>+
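The login-node setup is not spelled out on this page; a minimal sketch, mirroring the setup that startJob.sh below performs on the worker node, could look like this (the lesson_4 path is just a placeholder for wherever your Lesson 4 work area lives):

<code bash>
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh   # general ATLAS software setup
localSetupFAX                                             # FAX tools for remote data access
cd lesson_4                                               # placeholder: your Lesson 4 work area
</code>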
  
Then create this shell script, which will be executed at the beginning of each job on every farm node:

<file bash startJob.sh>
#!/bin/zsh
# set up the ATLAS environment, FAX and the grid proxy on the worker node
export RUCIO_ACCOUNT=YOUR_CERN_USERNAME
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
localSetupFAX
source $AtlasSetup/scripts/asetup.sh 19.0.2.1,noTest
export X509_USER_PROXY=x509up_u21183   # the proxy file shipped with the job (see below)

# unpack the payload and rebuild the RootCore packages
unzip payload.zip
ls
rcSetup -u; rcSetup Base,2.0.12
rc find_packages
rc compile
cd MyAnalysis/util
rm -rf submitDir                       # testRun refuses to run if submitDir already exists

# keep only a 3-line slice of inputdata.txt, starting at line $1 (the Condor process number)
echo $1
sed -n $1,$(($1+2))p inputdata.txt > Inp_$1.txt
cp Inp_$1.txt inputdata.txt
cat inputdata.txt
echo "startdate $(date)"
testRun submitDir
echo "enddate $(date)"
</file>
  
Make sure the RUCIO_ACCOUNT variable is set to your actual CERN user name. Make this script executable (the chmod command is shown below, just before the submission) and create the submit description file that we will give to Condor:
  
<file bash job.sub>
# submit description file: run startJob.sh 10 times, passing the process number and the number of jobs
Jobs=10
getenv         = False
executable     = startJob.sh
output         = MyAnal_$(Jobs).$(Process).out
error          = MyAnal_$(Jobs).$(Process).error
log            = MyAnal_$(Jobs).$(Process).log
arguments = $(Process) $(Jobs)
environment = "IFlist=$(IFlist)"
transfer_input_files = payload.zip,/tmp/x509up_u21183,MyAnalysis/util/inputdata.txt
universe       = vanilla
#Requirements   = HAS_CVMFS =?= True
queue $(Jobs)
</file>
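To make the splitting concrete: job number 3, for example, receives the arguments "3 10" and effectively does the following inside startJob.sh (the second argument is passed but not used by the script):

<code bash>
# what job number 3 does with the input list (illustration only)
sed -n 3,5p inputdata.txt > Inp_3.txt   # keep lines 3-5 of the full file list
cp Inp_3.txt inputdata.txt              # the job then runs only over these files
</code>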
  
To access files via FAX the jobs need a valid grid proxy, and this is why we ship the proxy with each job. The proxy is the file whose name starts with "x509up", so in both job.sub and startJob.sh you should replace "x509up_u21183" with the name of your own grid proxy file. You can find this name in the environment variable $X509_USER_PROXY.
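If you do not have a proxy yet, you can create one and check where it is stored (the exact file name, x509up_u<uid>, depends on your numeric user id):

<code bash>
voms-proxy-init -voms atlas   # create a grid proxy (written to /tmp/x509up_u<uid> by default)
voms-proxy-info               # shows the proxy path, type and remaining lifetime
echo $X509_USER_PROXY         # set by some setups; otherwise use the path printed above
</code>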
  
You need to pack all of the working directory into a payload.zip file:
  
<code bash>
rc clean              # clean the RootCore build
rm -rf RootCoreBin    # remove the local build area; it is rebuilt on the worker node
zip -r payload.zip *  # pack the working directory (including startJob.sh) into payload.zip
</code>
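You can quickly check that the archive contains what you expect before submitting:

<code bash>
unzip -l payload.zip | head   # list the first entries of the archive
ls -lh payload.zip            # check its size
</code>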
  
Before submitting, make sure that startJob.sh is executable:
  
<code>
chmod 755 ./startJob.sh
</code>
  
Now you can submit your task for execution and follow its status in this way:

<code bash>
~> condor_submit job.sub
Submitting job(s)..........
10 job(s) submitted to cluster 49677.

~> condor_q ivukotic
-- Submitter: login.atlas.ci-connect.net : <192.170.227.199:60111> : login.atlas.ci-connect.net
 ID       OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
49677.0   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 0
49677.1   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 1
49677.2   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 2
49677.3   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 3
49677.4   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 4
49677.5   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 5
49677.6   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 6
49677.7   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 7
49677.8   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 8
49677.9   ivukotic       10/9  10:21   0+00:00:10 R  0   0.0  startJob.sh 9

10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
</code>
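A few more HTCondor commands are useful while the jobs run; replace the cluster id 49677 and the user name with your own:

<code bash>
condor_q 49677            # show only the jobs of this cluster
condor_rm 49677           # remove the whole cluster if something went wrong
condor_history ivukotic   # list jobs of this user that already left the queue
</code>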