
Differences

This shows you the differences between two versions of the page.

asc:tutorials:2014october_connect [2014/10/09 19:14]
asc [xAOD tutorial at ANL using ATLAS connect]
asc:tutorials:2014october_connect [2014/10/10 15:28] (current)
asc [Lesson 6: Using HTCondor and Tier2]
Line 201:
 testRun submitDir   # runs over all files inside inputdata.txt
 </code>
-====== Lesson 5: Running a job on multiple cores ======
-
-Now we run the above job on multiple cores of the same computer and merge the histogram outputs at the end.
-
-Prepare a fresh directory:
-<code>
-source setup.sh
-mkdir lesson_5; cd lesson_5
-</code>
+====== Lesson 5: Running on multiple cores ======
+
+This example is not needed for ATLAS connect. If you still want to know how to run an ATLAS analysis job on several cores of your desktop, look at [[asc:tutorials:2014october#lesson_5running_a_job_on_multiple_cores]].
  
-And setup the package:
-<code bash>
-mkdir ANLanalysisHisto_threads
-cd ANLanalysisHisto_threads
-rcSetup -u; rcSetup Base,2.1.11
-rc find_packages  # find needed packages
-rc compile        # compiles them
-</code>
-
-This takes some time to compile. Next we will use a simple example code that runs over multiple files located in some directory:
-
-<code bash>
-cp -r /users/chakanau/public/2014_tutorial_october_anl/lesson_5/ANLanalysis_threads/MyAnalysis/ ANLanalysis_threads/MyAnalysis/
-rc find_packages    # find this package
-rc compile          # compiles it
-cd MyAnalysis/util  # go to the analysis code
-</code>
+====== Lesson 6: Using HTCondor and Tier2 ======
+
+In this example we will use the HTCondor workload management system to send jobs to be executed in a queue at a Tier3 farm. We start from the directory of lesson 4, so if you did not do lesson 4, please do that one first and verify that your code runs locally.
+Start from a new shell and set up the environment, then create the following shell script, which will be executed at the beginning of each job on each farm node:
  
-This example has a modified "util" directory compared to the previous example. For example, we changed the 'testRun.cxx' file and included a small script "A_RUN" that launches several jobs in parallel. Open the file "A_RUN" and study it. What it does is this:
+<file bash startJob.sh>
+#!/bin/bash
+export RUCIO_ACCOUNT=YOUR_CERN_USERNAME      # your grid (CERN) user name
+export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
+source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
+localSetupFAX
+source $AtlasSetup/scripts/asetup.sh 19.0.2.1,noTest
+export X509_USER_PROXY=x509up_u21183
+unzip payload.zip
+ls
+rcSetup -u; rcSetup Base,2.0.12
+rc find_packages
+rc compile
+cd MyAnalysis/util
+rm submitDir
+
+echo $1
+# each job processes three lines of inputdata.txt, starting at line $1 (the Condor process number)
+sed -n $1,$(($1+2))p inputdata.txt > Inp_$1.txt
+cp Inp_$1.txt inputdata.txt
+cat inputdata.txt
+echo "startdate $(date)"
+testRun submitDir
+echo "enddate $(date)"
+</file>
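The last part of the script selects the input files for one particular job: HTCondor passes the process number as the first argument, and the sed line copies three consecutive lines of inputdata.txt into a private list. As an illustration only (not part of the tutorial files), for process number 4 the script effectively does:

<code bash>
sed -n 4,6p inputdata.txt > Inp_4.txt   # lines 4, 5 and 6 of the input list
cp Inp_4.txt inputdata.txt              # testRun then sees only those files
</code>

Note that Condor process numbers start at 0 and GNU sed does not accept a line address of 0, so you may want to shift the starting line by one (for example, use $(($1+1)) as the first address).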
  
-  * Generates the input list "inputdata.txt"
-  * Splits it according to the given number of threads (2 by default) and puts the pieces into the **inputs/** directory
-  * Launches 2 jobs using the input files from **inputs/**. Logfiles will also be in that directory
-  * Output files will go to the **outputs/** directory
-
-The output goes to the "outputs" directory, while input files and logfiles are put into the "inputs" directory.
+Make sure the RUCIO_ACCOUNT variable is properly set. Make this file executable and create the file that describes our job requirements and that we will give to condor:
+
+<file bash job.sub>
+Jobs=10
+getenv         = False
+executable     = startJob.sh
+output         = MyAnal_$(Jobs).$(Process).out
+error          = MyAnal_$(Jobs).$(Process).error
+log            = MyAnal_$(Jobs).$(Process).log
+arguments = $(Process) $(Jobs)
+environment = "IFlist=$(IFlist)"
+transfer_input_files = payload.zip,/tmp/x509up_u21183,MyAnalysis/util/inputdata.txt
+universe       = vanilla
+#Requirements   = HAS_CVMFS =?= True
+queue $(Jobs)
+</file>
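To make startJob.sh executable, and to catch simple shell typos before shipping it anywhere, something like the following is enough (bash -n only parses the script and executes nothing):

<code bash>
chmod +x startJob.sh   # make the wrapper executable
bash -n startJob.sh    # syntax check only; runs nothing
</code>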
  
-To run the job, run "./A_RUN". You can monitor the jobs with the command:
-
-<code bash>
-ps rux  | grep -e "testRun"
-</code>
-
-Once the jobs are done, you can merge the output files into "hist.root":
-<code bash>
-hadd -f hist.root outputs/*/hist-run*.root
-</code>
+To access files using FAX the jobs need a valid grid proxy; that is why we send it with each job. The proxy is the file starting with "x509up", so in both job.sub and startJob.sh you should replace "x509up_u21183" with the name of your own grid proxy file. You can find the filename in the environment variable $X509_USER_PROXY.
+
+You need to pack all of the working directory into a payload.zip file:
+
+<code bash>
+rc clean
+rm -rf RootCoreBin
+zip -r payload.zip .
+</code>
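Before submitting, it is worth checking that a valid proxy exists and that the archive contains what you expect. A minimal check, assuming the standard VOMS client tools are set up in your session (create a proxy first with voms-proxy-init -voms atlas if you do not have one):

<code bash>
voms-proxy-info -all | grep -E "path|timeleft"   # proxy file location and remaining lifetime
echo $X509_USER_PROXY                            # name to use in job.sub and startJob.sh
unzip -l payload.zip | head                      # first few entries packed into the payload
</code>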
  
-====== Lesson 6: Working on the ANL farm ======
+Now you may submit your task for execution and follow its status in this way:
  
-This example is ANL-specific. If you ran analysis code during Run I, you should be familiar with it. We used the [[http://atlaswww.hep.anl.gov/asc/arcond/ | Condor/Arcond system]] to submit jobs to the farm. Data are distributed over the farm nodes to avoid an I/O bottleneck.
-
-<note important>This example works on the ANL Tier3 where Arcond/Condor is installed</note>
-
-First, prepare a fresh directory for this example:
 <code>
-mkdir lesson_6
+chmod 755 ./startJob.sh; ./startJob.sh;
 </code>
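Note that startJob.sh expects the starting line number of inputdata.txt as its first argument (HTCondor passes the process number), so for a quick interactive test you would call it with an explicit number, for example:

<code bash>
chmod 755 ./startJob.sh
./startJob.sh 1      # runs the whole chain locally on lines 1-3 of inputdata.txt
</code>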
- 
-Copy the needed directories with the RootCore example:
  
 <code bash>
-cp -r /users/chakanau/public/2014_tutorial_october_anl/lesson_6/ lesson_6/
+~> condor_submit job.sub
+Submitting job(s)..........
+10 job(s) submitted to cluster 49677.
+
+~> condor_q ivukotic
+-- Submitter: login.atlas.ci-connect.net : <192.170.227.199:60111> : login.atlas.ci-connect.net
+ ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
+49677.0   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 0
+49677.1   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 1
+49677.2   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 2
+49677.3   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 3
+49677.4   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 4
+49677.5   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 5
+49677.6   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 6
+49677.7   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 7
+49677.8   ivukotic       10/9  10:21   0+00:00:11 R  0   0.0  startJob.sh 8
+49677.9   ivukotic       10/9  10:21   0+00:00:10 R  0   0.0  startJob.sh 9
+
+10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
 </code>
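While the jobs run you can follow them with the standard HTCondor commands; the per-job stdout, stderr and log files carry the names defined in job.sub (the cluster id 49677 and the user name ivukotic below are just the values from the example above):

<code bash>
condor_q                  # all of your jobs in the queue
condor_q 49677            # only this cluster
condor_history ivukotic   # jobs that have already left the queue
condor_rm 49677           # remove the whole cluster if something went wrong
cat MyAnal_10.0.out       # stdout of the first job (Jobs=10, Process=0)
</code>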
- 
-Then check that you can compile the analysis example using RootCore: 
- 
-<code bash> 
-cd lesson_6/ 
-source setup.sh 
-</code> 
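The compilation itself uses the same RootCore commands as in the earlier lessons:

<code bash>
rc find_packages    # find the needed packages
rc compile          # compile them
</code>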
- 
-As usual, our analysis is in "MyAnalysis/util/testRun.cxx".
- 
-Now we want to submit jobs to the data distributed over several computers of the farm. Go to the upper directory and set up the farm:
- 
-<code bash> 
-cd ../; source s_asc; 
-</code> 
- 
-We will send jobs using the directory "submit". Go to this directory and check the data: 
- 
-<code bash> 
-cd submit 
-arc_ls -s /data2/valid2/117050/PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534/ 
-arc_ls    /data2/valid2/117050/PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534/   
-</code> 
- 
-The first command shows the summary of distributed data (12 files per server), while the second lists all available data on each node. 
-Now we will send the job to the farm.  
-Change the line:  
-<code> 
-package_dir=/users/chakanau/public/2014_tutorial_october_anl/lesson_6/ANLanalysis 
-</code> 
-inside "arcond.conf" to reflect the correct path to your program. Then run "arcond" and say "y" to all questions. 
-This sends jobs to the farm (2 jobs per server). Check their status with:
- 
-<code bash> 
-condor_q 
-</code> 
- 
-When the jobs are done, the output files will be inside the "Jobs" directory. Merge the ROOT outputs into one file with:
- 
-<code bash> 
-arc_add 
-</code> 
-This will create the final output file "Analysis_all.root".
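If you prefer to merge by hand instead of using arc_add, ROOT's hadd can produce a similar merged file; the exact glob depends on how the per-job outputs are laid out under the "Jobs" directory, so the path below is only an illustration:

<code bash>
hadd -f Analysis_all.root Jobs/*/hist-run*.root
</code>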
- 
- 
-  