====== xAOD tutorial at ANL using ATLAS connect ======

**This tutorial uses [[http://connect.usatlas.org/| ATLAS connect]]. All lessons discussed in the [[asc:tutorials:2014october| ASC/ANL tutorial]] were adapted by Ilija Vukotic for use with ATLAS connect.**

Please also look at the CERN tutorial [[https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/SoftwareTutorialxAODAnalysisInROOT | SoftwareTutorialxAODAnalysisInROOT wiki]].

Set up your environment and the Analysis Base release:

<code bash>
source setup.sh
rcSetup -u; rcSetup Base,2.1.12
</code>
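The setup.sh script used here is site-specific and its contents are not shown on this page. As a rough sketch, assuming the standard ATLASLocalRootBase installation on CVMFS (the same commands appear later in startJob.sh in Lesson 6), such a script could look like:

<code bash>
# hypothetical setup.sh - adapt to your site
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
localSetupFAX     # FAX client tools, used later for fax-get-gLFNs
</code>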
  
  
Now we will make a number of changes to the above program. We will fill histograms with pT of jets and muons.
To do this, we have made the following changes. We have added:
  
<code>
</code>
  
in "cmt/Makefile.RootCore". Then we have modified:
  
<code python>
</code>
  
Now we will create a C++/ROOT analysis program and run over this input xAOD file.

<code bash>
mkdir lesson_4; cd lesson_4
rcSetup -u; rcSetup Base,2.0.12
rc find_packages
rc compile
</code>
  
<code bash>
wget https://ci-connect.atlassian.net/wiki/download/attachments/10780723/MyAnalysis4.zip
unzip MyAnalysis4.zip
rc find_packages    # find this package
rc compile          # compiles it
</code>
  
The actual analysis is in "Root/MyxAODAnalysis.cxx".
  
<note tip>Read more: [[https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/SoftwareTutorialxAODAnalysisInROOT#7_Creating_and_saving_histograms | Creating_and_saving_histograms]]</note>

Run it as:
  
<code bash>
cd MyAnalysis/util  # go to the analysis code
fax-get-gLFNs valid2.117050.PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534_tid01482225_00 > inputdata.txt
testRun submitDir   # runs over all files listed in inputdata.txt
</code>
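fax-get-gLFNs resolves the dataset name into a list of FAX gLFNs and writes it to inputdata.txt; this normally requires the FAX tools and a valid grid proxy in your session. A minimal sketch of that preparation, assuming the standard ATLASLocalRootBase tools:

<code bash>
localSetupFAX                 # FAX client tools
voms-proxy-init -voms atlas   # create a grid proxy for remote data access
</code>

The output of the run goes into the submitDir directory passed to testRun; delete it, or pass a different name, before running again.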
  
  
====== Lesson 5: Running on multiple cores ======

This example is not needed for ATLAS connect. If you still want to know how to run an ATLAS analysis job on several cores of your desktop, look at [[asc:tutorials:2014october#lesson_5running_a_job_on_multiple_cores]].
  
====== Lesson 6: Using HTCondor and Tier2 ======

In this example we will use the HTCondor workload management system to send jobs to a queue on a Tier3 farm. We will start from the Lesson 4 directory, so if you have not done Lesson 4 yet, please do it first and verify that your code runs locally.
Start from a new shell and set up the environment, then create the following shell script, which will be executed at the beginning of each job on each farm node:
  
<file bash startJob.sh>
#!/bin/bash
export RUCIO_ACCOUNT=YOUR_CERN_USERNAME
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
localSetupFAX
source $AtlasSetup/scripts/asetup.sh 19.0.2.1,noTest
export X509_USER_PROXY=x509up_u21183
unzip payload.zip
ls
rcSetup -u; rcSetup Base,2.0.12
rc find_packages
rc compile
cd MyAnalysis/util
rm submitDir

echo $1
# pick three consecutive lines of inputdata.txt for this job ($1 is the process number)
sed -n $1,$(($1+2))p inputdata.txt > Inp_$1.txt
cp Inp_$1.txt inputdata.txt
cat inputdata.txt
echo "startdate $(date)"
testRun submitDir
echo "enddate $(date)"
</file>
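Each job therefore runs over three consecutive lines of inputdata.txt, selected by its process number. For example, the slice taken by job number 4 (an illustrative value) can be reproduced by hand:

<code bash>
sed -n 4,6p inputdata.txt > Inp_4.txt   # the same selection startJob.sh makes when $1=4
</code>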
  
Make sure the RUCIO_ACCOUNT variable is properly set. Make this file executable and create the file that describes our job needs and that we will give to condor:
  
<file bash job.sub>
Jobs=10
getenv         = False
executable     = startJob.sh
output         = MyAnal_$(Jobs).$(Process).out
error          = MyAnal_$(Jobs).$(Process).error
log            = MyAnal_$(Jobs).$(Process).log
arguments = $(Process) $(Jobs)
environment = "IFlist=$(IFlist)"
transfer_input_files = payload.zip,/tmp/x509up_u21183,MyAnalysis/util/inputdata.txt
universe       = vanilla
#Requirements   = HAS_CVMFS =?= True
queue $(Jobs)
</file>
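With Jobs=10 the queue statement creates ten processes numbered 0 to 9, and the arguments line hands each process its own number plus the total. For process 3, HTCondor effectively runs (illustrative):

<code bash>
./startJob.sh 3 10    # $(Process)=3, $(Jobs)=10
</code>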
  
To access files using FAX, the jobs need a valid grid proxy; that is why we send it with each job. The proxy is the file whose name starts with "x509up", so in both job.sub and startJob.sh you should replace "x509up_u21183" with the name of your own grid proxy file. You can find the filename in the environment variable $X509_USER_PROXY.
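If you do not yet have a proxy on the login node, you can create one and locate the file like this (a sketch; voms-proxy-init and voms-proxy-info are the standard grid proxy tools):

<code bash>
voms-proxy-init -voms atlas   # create a grid proxy
voms-proxy-info -all          # prints the proxy details, including the path to the proxy file
echo $X509_USER_PROXY         # path of the proxy file, if the variable is set (e.g. /tmp/x509up_uNNNNN)
</code>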
  
You need to pack all of the working directory into a payload.zip file:
  
<code bash>
rc clean
rm -rf RootCoreBin
zip -r payload.zip *
</code>
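Before submitting, it is worth checking that the archive really contains the package and the input list (a quick sanity check using the standard unzip tool):

<code bash>
unzip -l payload.zip | grep -E "MyAnalysis|inputdata.txt"
</code>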
  
Now you may submit your task for execution and follow its status in this way:
  
<code bash>
chmod 755 ./startJob.sh; ./startJob.sh;   # make the script executable; the second command runs it once locally
</code>
  
<code bash>
~> condor_submit job.sub
Submitting job(s)..........
10 job(s) submitted to cluster 49677.

~> condor_q ivukotic
-- Submitter: login.atlas.ci-connect.net : <192.170.227.199:60111> : login.atlas.ci-connect.net
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
49677.0   ivukotic       10/ 10:21   0+00:00:11 R  0   0.0  startJob.sh 0
49677.1   ivukotic       10/ 10:21   0+00:00:11 R  0   0.0  startJob.sh 1
49677.2   ivukotic       10/ 10:21   0+00:00:11 R  0   0.0  startJob.sh 2
49677.3   ivukotic       10/ 10:21   0+00:00:11 R  0   0.0  startJob.sh 3
49677.4   ivukotic       10/ 10:21   0+00:00:11 R  0   0.0  startJob.sh 4
49677.5   ivukotic       10/ 10:21   0+00:00:11 R  0   0.0  startJob.sh 5
49677.6   ivukotic       10/ 10:21   0+00:00:11 R  0   0.0  startJob.sh 6
49677.7   ivukotic       10/ 10:21   0+00:00:11 R  0   0.0  startJob.sh 7
49677.8   ivukotic       10/ 10:21   0+00:00:11 R  0   0.0  startJob.sh 8
49677.9   ivukotic       10/ 10:21   0+00:00:10 R  0   0.0  startJob.sh 9

10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended
</code>
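When the jobs finish they disappear from condor_q; the per-job stdout, stderr, and HTCondor log files appear in the submit directory under the names defined in job.sub. A few commands you might use to follow up (condor_history is a standard HTCondor tool; the file name below follows the MyAnal_$(Jobs).$(Process).out pattern from job.sub):

<code bash>
condor_q ivukotic      # jobs still idle or running
condor_history 49677   # completed jobs of the cluster from the example above
cat MyAnal_10.0.out    # stdout of process 0
</code>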