====== xAOD tutorial at ANL (October 28-29, 2014) ======
This tutorial is based on CERN
[[https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/SoftwareTutorialxAODAnalysisInROOT | SoftwareTutorialxAODAnalysisInROOT wiki]].
However, the lessons given below are simplified for a faster start. Also, the last 2 lessons are designed for the ANL cluster that uses condor for job submissions.
In addition, we will test US ATLAS connect as explained at the bottom of this page.
The agenda of this tutorial is at [[https://indico.cern.ch/event/330836/]].
====== Getting started ======
We encourage you to obtain a computer account at your local Tier3 in advance.
That is the recommended way to do the tutorials since, eventually, you may want to run your
xAOD analysis at your home institute. If needed, we can also provide you with an account at the Argonne Tier3.
* Please make sure that your local Tier3 has the usual ATLAS setup environment based on "cvmfs".
* Several tutorial topics will require grid access (use "voms-proxy-init -voms atlas" to check this).
ANL people can use "atlas1.hep.anl.gov" or "atlas2.hep.anl.gov".
First, we will set up an ATLAS release.
1) Set up kinit if needed: **kinit username@CERN.CH**, where username is your user name on lxplus. This step is normally not needed.
2) Create a bash script to setup ATLAS software. The method below works at ANL:
#!/bin/bash
export ALRB_localConfigDir="/share/sl6/cvmfs/localConfig"
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
#asetup 19.1.1.4,slc6,here # setup ATLAS release when needed
setupATLAS
and then do "source setup.sh". This works at ANL/ASC. Make a similar script when using lxplus or other Tier3 as explained in [[https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/SoftwareTutorialxAODAnalysisInROOT#1_Setup_the_Analysis_Release | CERN tutorial]].
On LXPLUS:
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
alias setupATLAS='source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh'
setupATLAS
To run tutorials, you can check these requirements (prepared by Asoka De Silva):
* You can check that your computer is ATLAS-ready (SL, SLC, or CentOS version 6) by doing:
setupATLAS
diagnostics
checkOS
* You will need a valid grid certificate registered with LCG and installed in $HOME/.globus. You can check that your grid credentials are good by doing:
setupATLAS
diagnostics
gridCert
* Available disk space:
You will need 500 MB of free space. It can be $HOME or any other location - note the location as you will be asked for it. If you are running on lxplus, you can request your home dir quota be increased to 10GB or request up to an additional 100GB of workspace. To ask for space, go to (use your lxplus username below)
https://resources.web.cern.ch/resources/Manage/AFS/Settings.aspx?login=
You may need to request repeatedly to get the full quota (it is granted in increments). To check how much quota you have on /afs (e.g. on lxplus), type "fs lq", e.g.:
fs lq /afs/cern.ch/user/d/desilva
fs lq /afs/cern.ch/work/d/desilva # for workspace if you have it
====== Lesson 1: Looking at a xAOD file ======
Let us take a look at a typical xAOD file. You can open it with the ROOT TBrowser to study its structure.
mkdir lesson_1
cd ./lesson_1
Now we need an example xAOD file. There are several ways to get one:
__Method 1. If you are at ANL, copy it__:
cp /data/nfs/chakanau/tutorial_xAOD_long/valid2.117050.PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534_tid01482225_00/AOD.01482225._000140.pool.root.1 AOD.01482225._000140.pool.root
__Method 2. Use xrdcp:__
localSetupFAX --rootVersion=current-SL6
voms-proxy-init -voms atlas
xrdcp $STORAGEPREFIX/atlas/rucio/valid2:AOD.01482225._000140.pool.root.1 AOD.01482225._000140.pool.root
__Method 3: If you work at CERN, use this file:__
/afs/cern.ch/atlas/project/PAT/xAODs/r5597/data12_8TeV.00204158.physics_JetTauEtmiss.recon.AOD.r5597/AOD.01495682._003054.pool.root.1
/afs/cern.ch/atlas/project/PAT/xAODs/r5591/mc14_8TeV.117050.PowhegPythia_P2011C_ttbar.recon.AOD.e1727_s1933_s1911_r5591/AOD.01494882._111853.pool.root.1
__Method 4: If you work at your Tier3__
setupATLAS
diagnostics
setMeUpData anl-oct2014 mydata
The file will appear in the "mydata/tutorial/anl-oct2014/datase/" directory.
Now open this file with TBrowser:
root
TBrowser a
Click the file name in the left panel, select "CollectionTree", find a branch with the extension "Aux" (for example, "AntiKt4TruthJetsAux"), and then plot "AntiKt4TruthJetsAux.pt". See more details in the [[https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/SoftwareTutorialxAODEDM#Browsing_the_xAOD_with_the_TBrow | CERN tutorial]].
====== Lesson 2: Using pyROOT to read xAOD ======
Now we will read the above file in Python and print some values for electrons (eta and phi).
First, we set up our environment in the directory "lesson_1" that contains the xAOD file:
source setup.sh
rcSetup -u; rcSetup Base,2.0.12
Now you can check what versions of packages are linked:
rc version
which prints a long list of package versions.
Then let's create this file (save it as "xAODPythonMacro.py"):
#!/usr/bin/env python
# Set up ROOT and RootCore:
import ROOT
ROOT.gROOT.Macro( '$ROOTCOREDIR/scripts/load_packages.C' )
ROOT.xAOD.Init() # Initialize the xAOD infrastructure
fileName = "AOD.01482225._000140.pool.root" # Set up the input files
treeName = "CollectionTree" # default when making transient tree anyway
f = ROOT.TFile.Open( fileName )
t = ROOT.xAOD.MakeTransientTree( f, treeName ) # Make the "transient tree"
# Print some information:
print( "Number of input events: %s" % t.GetEntries() )
for entry in xrange( t.GetEntries() ): # loop over entries
    t.GetEntry( entry )
    print( "Processing run #%i, event #%i" % ( t.EventInfo.runNumber(), t.EventInfo.eventNumber() ) )
    print( "Number of electrons: %i" % len( t.ElectronCollection ) )
    for el in t.ElectronCollection: # loop over the electron collection
        print( "  Electron trackParticle eta = %g, phi = %g" % ( el.trackParticle().eta(), el.trackParticle().phi() ) )
f.Close()
If you copy this script from the wiki, check the indentation (top-level lines should start at column 0). This is also a good way for you to learn the code!
Then run it as:
chmod +x xAODPythonMacro.py
./xAODPythonMacro.py
Using this code, one can fill histograms. But the code runs slowly. Below we will show how to use compiled C++/ROOT code to
run over this file.
How do you find xAOD variable names without using the ROOT TBrowser? Try this (start from a fresh terminal, since "asetup 19.1.1.1" conflicts with "rcSetup Base,2.0.12"):
asetup 19.1.1.1,slc6,gcc47,64,here
checkSG.py AOD.01482225._000140.pool.root
You will see a table with the names of the variables.
Now you can fill a histogram in this Python code. You should create a histogram before the event loop:
from ROOT import *
h1=TH1D("eta","eta",20,-4,4)
Then fill it inside the event loop as h1.Fill( el.trackParticle().eta() ). At the end, write this histogram to a file as:
hfile=TFile("test.root","RECREATE","ANL tutorial")
h1.Write()
hfile.Close()
Here is the complete code that writes a ROOT histogram:
#!/usr/bin/env python
import ROOT
ROOT.gROOT.Macro( '$ROOTCOREDIR/scripts/load_packages.C' )
ROOT.xAOD.Init() # Initialize the xAOD infrastructure
fileName = "AOD.01482225._000140.pool.root" # Set up the input files
treeName = "CollectionTree" # default when making transient tree anyway
f = ROOT.TFile.Open( fileName )
t = ROOT.xAOD.MakeTransientTree( f, treeName ) # Make the "transient tree"
from ROOT import *
h1 = TH1D("eta","eta",20,-4,4)
# Print some information:
print( "Number of input events: %s" % t.GetEntries() )
for entry in xrange( t.GetEntries() ): # loop over entries
    t.GetEntry( entry )
    print( "Processing run #%i, event #%i" % ( t.EventInfo.runNumber(), t.EventInfo.eventNumber() ) )
    print( "Number of electrons: %i" % len( t.ElectronCollection ) )
    for el in t.ElectronCollection: # loop over the electron collection
        h1.Fill( el.trackParticle().eta() )
hfile = TFile("test.root","RECREATE","ANL tutorial")
h1.Write()
f.Close()
hfile.Close()
Now open the file "test.root" and look at the histogram.
====== Lesson 3: C++/ROOT program to read xAOD ======
Now we will create a C++/ROOT analysis program and run over this input xAOD file. Do not forget to run "kinit username@CERN.CH".
Use the same setup file as above.
mkdir lesson_3; cd lesson_3
source setup.sh # setup ATLAS environment
rcSetup -u; rcSetup Base,2.0.12
rc find_packages # find needed packages
rc compile # compiles them
This takes some time to compile. Next we will use a simple example code that runs over multiple files located in a directory:
curl http://atlaswww.hep.anl.gov/asc/xaod_tutor_oct2014/MyAnalysis_lesson3.tgz | tar -xz;
rc find_packages # find this package
rc compile # compiles it
cd MyAnalysis/util # go to the analysis code
We are ready to run the code, which is in "testRun.cxx".
But first we should define a list of input data. This example reads the file "inputdata.txt" with the locations of xAOD files. You can create it as:
python Make_input
In case you need xAOD data first, do this:
mkdir data
cd data
localSetupFAX --rootVersion=current-SL6
voms-proxy-init -voms atlas
xrdcp $STORAGEPREFIX/atlas/rucio/valid2:AOD.01482225._000140.pool.root.1 .
xrdcp $STORAGEPREFIX/atlas/rucio/valid2:AOD.01482225._000141.pool.root.1 .
xrdcp $STORAGEPREFIX/atlas/rucio/valid2:AOD.01482225._000142.pool.root.1 .
This will copy 3 test files.
Then run:
cd ..
python Make_input data
Now we are ready to run:
testRun submitDir # runs over all files inside inputdata.txt
We pass "submitDir" which will be the output directory with ROOT file.
You must delete it every time you run the code (or use different output).
The actual analysis should be put to "Root/MyxAODAnalysis.cxx" (will come back to this later).
The output histogram is in "submitDir/hist-sample.root" (which is almost empty).
If the program fails saying that some shared library has a wrong format, clean ROOTCORE as "rc clean", and then compile everything again with "rc find_packages; rc compile"
====== Lesson 4: Filling histograms ======
Now we will make a number of changes to the above program. For a quick start, do this:
mkdir lesson_4; cd lesson_4
Then setup atlas environment:
source setup.sh # setup ATLAS environment
rcSetup -u; rcSetup Base,2.0.12
rc find_packages # find needed packages
rc compile # compiles
Next we will make a simple example code that runs over multiple files located in some directory.
curl http://atlaswww.hep.anl.gov/asc/xaod_tutor_oct2014/MyAnalysis_lesson4.tgz | tar -xz;
wget http://atlaswww.hep.anl.gov/asc/xaod_tutor_oct2014/data12_8TeV.periodAllYear_DetStatus-v61-pro14-02_DQDefects-00-01-00_PHYS_StandardGRL_All_Good.xml
rc find_packages # find this package
rc compile # compiles it
cd MyAnalysis/util # go to the analysis code
The main analysis is in Root/MyxAODAnalysis.cxx. Please change the path to goodrunlist (on line 108) and recompile with "rc compile".
Now prepare an input file with the data from a directory with xAOD files:
python Make_input
and run the analysis:
testRun submitDir # runs over all files inside inputdata.txt
How does this work? Your analysis code is testRun.cxx. We pass "submitDir" which will be the output directory with ROOT file.
The actual analysis should be put in "Root/MyxAODAnalysis.cxx" as explained above. This example accesses the muon and jet containers, and it also reads the goodrunlist. This is what was done:
1) We linked several ATLAS packages.
PACKAGE_DEP = EventLoop xAODRootAccess xAODEventInfo GoodRunsLists xAODJet xAODTrigger xAODEgamma JetSelectorTools JetResolution xAODMuon
in "cmt/Makefile.RootCore". We linked more packages than needed to illustrate what can be linked. Use "rc version" to check their versions.
2) Then we modified 2 files to add histograms:
MyAnalysis/MyxAODAnalysis.h # used to define pointers to histograms
Root/MyxAODAnalysis.cxx # initialized histograms and our analysis code that loops over jets and muons
The output of this example is in "submitDir/hist-sample.root". Open it with ROOT:
root
TBrowser a
and look at the histograms of jet and muon pT.
Read more: [[https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/SoftwareTutorialxAODAnalysisInROOT#7_Creating_and_saving_histograms | Creating_and_saving_histograms]]
====== Lesson 5: Running a job on multiple cores ======
Now we run the above job on multiple cores of same computer and merge histogram outputs at the end.
The execution of this program does not use anything from ATLAS. It uses basic Linux commands.
Prepare a fresh directory:
source setup.sh
mkdir lesson_5; cd lesson_5
And setup the package:
rcSetup -u; rcSetup Base,2.0.12
rc find_packages # find needed packages
rc compile # compiles them
This takes some time to compile. Next we will use a simple example code that runs over multiple files located in a directory:
curl http://atlaswww.hep.anl.gov/asc/xaod_tutor_oct2014/MyAnalysis_lesson5.tgz | tar -xz;
rc find_packages # find this package
rc compile # compiles it
cd MyAnalysis/util # go to the analysis code
For ANL people, the file "inputdata.txt" is already filled with xAOD data.
If you use another Tier3, you need to copy several xAOD files (as in lesson 3), put them
in some directory "dir", and run
"python Make_input dir", which creates a new "inputdata.txt".
Also pay attention to the location of goodrunlist inside Root/MyxAODAnalysis.cxx. Correct the location
of the goodrunlist as you did for lesson 4.
We have made a few changes in this lesson compared to lesson 4.
For example, we changed the 'testRun.cxx' file and included a
small script "A_RUN" that launches several jobs in parallel. Open the file "A_RUN" and study it. What it does is this:
* Splits the file "inputdata.txt" into a given number of parts (2 by default) and puts them in the **inputs/** directory
* Launches 2 jobs using the input files from **inputs/**. Log files will also be written to that directory
* Output files will go to the **outputs/** directory
Now let us launch 2 jobs using 2 cores. Each job will read a portion of the original "inputdata.txt". Do this:
./A_RUN
You can monitor jobs with this command (launched in a separate terminal):
ps rux | grep -e "testRun"
When the jobs are done, you can merge the output files to "hist.root":
hadd -f hist.root outputs/*/hist-run*.root
If it does not work, debug it as:
testRun 00
The command runs one job using the input list inputs/list00. It shows where the problem is.
(Typically, this is due to a wrong location of the goodrunlist.)
If you run this program a second time, clean the output directory first:
rm -rf outputs/*
(ROOTCORE does not like existing output directories)
**Attention:**
* Do not run more jobs than the number of available cores. Check the number of processing cores with "nproc"
* Typically, 6-8 jobs are acceptable for the disk I/O of standard hard drives. You can run more jobs when using SSDs.
* Run such jobs when you know that other people are not affected by a heavy CPU load from multiple jobs
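If you generate the job count in a script rather than by hand, you can cap it programmatically. A small sketch using only the Python standard library (the cap of 8 reflects the disk I/O advice above and is an assumption, not a hard rule):

```python
import multiprocessing

def safe_njobs(max_jobs=8):
    # Never exceed the number of physical/logical cores reported by the
    # OS, and cap at a disk-I/O-friendly limit (6-8 for ordinary drives).
    return max(1, min(multiprocessing.cpu_count(), max_jobs))
```

This value could then be passed to the splitting step so the number of input lists matches the number of jobs actually launched.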
====== Lesson 6: Working on the ANL farm ======
This example is ANL-specific. If you ran analysis code during Run 1, you should be familiar with it. We used the [[http://atlaswww.hep.anl.gov/asc/arcond/ | Condor/Arcond system]] to submit jobs to the farm. Data are distributed over the farm nodes to avoid an I/O bottleneck.
This example works on the ANL Tier3 where Arcond/Condor is installed. If you do not have an account at ANL, skip this lesson.
First, prepare a fresh directory and populate it with the next example:
mkdir lesson_6;
cp -r /users/chakanau/public/2014_tutorial_october_anl/lesson_6/* lesson_6/
Then check that you can compile the analysis example using RootCore:
cd lesson_6/
source setup.sh
As usual, our analysis is in "MyAnalysis/util/testRun.cxx". Now we want to submit jobs to the data distributed on several computers of the farm. Go to the upper directory and setup the farm:
cd ../; source s_asc;
We will send jobs using the directory "submit". Go to this directory and check the data:
cd submit
arc_ls -s /data2/valid2/117050/PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534/ # shows short summary of data
arc_ls /data2/valid2/117050/PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534/ # list all files
The first command shows the summary of distributed data (12 files per server), while the second lists all available data on each node.
Now we will send the job to the farm.
Change the line:
package_dir=/users/chakanau/public/2014_tutorial_october_anl/lesson_6/ANLanalysis
inside "arcond.conf" to reflect the correct path to your program. Then run "arcond" and say "y" to all questions.
This sends jobs to the farm (2 jobs per server). Check the status as:
condor_q
When the jobs are done, the output files will be inside the "Job" directory. Merge the ROOT outputs into one file as:
arc_add
This will create the final output file "Analysis_all.root"
====== How to get this tutorial: ======
You can perform all the checks above and also get the data for this tutorial as:
setupATLAS
diagnostics
setMeUp anl-oct2014
(this may fail on certain Tier3s)
====== Using ATLAS connect ======
This set of tutorials uses [[http://connect.usatlas.org/| ATLAS connect]]. Please go to the link
[[https://ci-connect.atlassian.net/wiki/display/AC/xAOD+analysis+tutorial| using ATLAS connect]].
All lessons discussed on this wiki have been adapted by Ilija Vukotic for use with ATLAS connect.
See the instructions on how
{{:asc:tutorials:atlas_connect_handout_-_xaod_2014.pdf| to use ATLAS connect for this tutorial}}
======Using Eclipse to develop ATLAS code======
[[asc:tutorials:eclipse| here is the link to this tutorial]]
//[[chekanov@anl.gov|Sergei Chekanov (ANL)]] 2014/10/09 13:50//