====== xAOD tutorial at ANL using ATLAS connect ======
**This tutorial uses [[http://connect.usatlas.org/| ATLAS connect]]. All lessons discussed in the [[asc:tutorials:2014october| ASC/ANL tutorial]] were adapted by Ilija Vukotic for use with ATLAS connect.**
Please also look at the CERN tutorial [[https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/SoftwareTutorialxAODAnalysisInROOT | SoftwareTutorialxAODAnalysisInROOT wiki]].
====== Getting started ======
First, we will set up the ATLAS release.
1) Obtain a Kerberos ticket with **kinit username@CERN.CH**, where username is your CERN user name.
2) Create a bash script to setup ATLAS software:
#!/bin/bash
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
setupATLAS
Save it as "setup.sh" and then do "source setup.sh".
Set up FAX and ROOT so that we can access data remotely:
localSetupFAX --rootVersion=current-SL6
Create the usual grid proxy. You will be prompted to enter your password.
voms-proxy-init -voms atlas
====== Lesson 1: Looking at xAOD ======
Let us take a look at a typical xAOD file. You can open it with the ROOT TBrowser:
For ANL:
mkdir lesson_1; cd ./lesson_1
xrdcp $STORAGEPREFIX/atlas/rucio/valid2:AOD.01482225._000140.pool.root.1 AOD.01482225._000140.pool.root.1
root -l AOD.01482225._000140.pool.root.1
TBrowser a
Then click on the branch with the extension "Aux" (for example, "AntiKt4TruthJetsAux") and then plot "AntiKt4TruthJetsAux.pt"
See further details in the [[https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/SoftwareTutorialxAODEDM#Browsing_the_xAOD_with_the_TBrow | CERN tutorial]].
====== Lesson 2: Using pyROOT to read xAOD ======
Now we will read the above file in Python and print some values for electrons (eta and phi).
First, we set up our environment in the directory "lesson_1" that contains the xAOD file:
source setup.sh
rcSetup -u; rcSetup Base,2.1.12
Now you can check what versions of packages are linked:
rc version
which prints a long list of package versions.
Then let's create a new directory and put this file there (save it as "xAODPythonMacro.py", the name used below):
#!/usr/bin/env python
# Set up ROOT and RootCore:
import ROOT
ROOT.gROOT.Macro( '$ROOTCOREDIR/scripts/load_packages.C' )
ROOT.xAOD.Init() # Initialize the xAOD infrastructure
fileName="AOD.01482225._000140.pool.root.1" # the input file copied in lesson 1
treeName = "CollectionTree" # default when making transient tree anyway
f = ROOT.TFile.Open(fileName)
t = ROOT.xAOD.MakeTransientTree( f, treeName) # Make the "transient tree"
# Print some information:
print( "Number of input events: %s" % t.GetEntries() )
for entry in xrange( t.GetEntries() ):
    t.GetEntry( entry )
    print( "Processing run #%i, event #%i" % ( t.EventInfo.runNumber(), t.EventInfo.eventNumber() ) )
    print( "Number of electrons: %i" % len( t.ElectronCollection ) )
    for el in t.ElectronCollection: # loop over electron collection
        print( "  Electron trackParticle eta = %g, phi = %g" % ( el.trackParticle().eta(), el.trackParticle().phi() ) )
        pass # end for loop over electron collection
    pass # end loop over entries
f.Close()
If you copy this script, check the indentation: the top-level lines should start at column 0, and the loop bodies must stay indented. Reading through this code is a good way to learn it!
Then run it as:
chmod +x xAODPythonMacro.py
./xAODPythonMacro.py
Using this code, one can fill histograms. But the code runs slowly. Below we will show how to use compiled C++/ROOT code to run over this file.
How can you find xAOD variable names without using the ROOT TBrowser? Try this:
asetup 19.1.1.1,slc6,gcc47,64,here
checkSG.py AOD.01482225._000140.pool.root.1
You will see a table with the names of the variables.
====== Lesson 3: Analysis program to read xAOD ======
Now we will create a C++/ROOT analysis program and run it over this input xAOD file. Do not forget to run "kinit username@CERN.CH".
Use the same setup file as above.
mkdir lesson3
cd lesson3
rcSetup -u; rcSetup Base,2.0.12
rc find_packages # find needed packages
rc compile # compiles them
This takes some time to compile. Next we will use a simple example code that runs over multiple files located in some directory:
wget https://ci-connect.atlassian.net/wiki/download/attachments/10780723/MyAnalysis3.zip
unzip MyAnalysis3.zip
rc find_packages # find this package
rc compile # compiles it
Go to the program's starting directory and recreate the input file list (since this dataset is rather large, we suggest removing all but a few files from inputdata.txt):
cd MyAnalysis/util # go to the analysis code
fax-get-gLFNs valid2.117050.PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534_tid01482225_00 > inputdata.txt
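The trimming suggested above can also be scripted. A minimal sketch in plain Python (the helper name keep_first and the default of three files are our own choices for illustration, not part of the tutorial):

```python
def keep_first(path, n=3):
    """Truncate the file at 'path' so only its first n lines remain.

    Used here to cut inputdata.txt down to a few input files for testing.
    """
    with open(path) as f:
        lines = f.readlines()[:n]
    with open(path, "w") as f:
        f.writelines(lines)

# Usage (from MyAnalysis/util, after running fax-get-gLFNs):
#   keep_first("inputdata.txt", n=3)
```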
Your analysis is started using testRun.cxx. We pass "submitDir", which will be the output directory with the ROOT file; you must delete it every time you run the code (or use a different output directory). The code runs over the list of files in inputdata.txt. The actual analysis should be put in "Root/MyxAODAnalysis.cxx" (called from testRun.cxx).
testRun submitDir # runs over all files inside inputdata.txt
====== Lesson 4: Filling histograms ======
Now we will make a number of changes to the above program. We will fill histograms with pT of jets and muons.
To do this, we have made the following changes. We added:
PACKAGE_DEP = EventLoop xAODRootAccess xAODEventInfo GoodRunsLists xAODJet xAODTrigger xAODEgamma JetSelectorTools JetResolution xAODMuon
in "cmt/Makefile.RootCore". Then we have modified:
MyAnalysis/MyxAODAnalysis.h # added new pointers to histograms
Root/MyxAODAnalysis.cxx # initialized histograms and put loops over jets and muons
Now let's set up a new directory and build the modified program:
mkdir lesson_4; cd lesson_4
rcSetup -u; rcSetup Base,2.0.12
rc find_packages
rc compile
wget https://ci-connect.atlassian.net/wiki/download/attachments/10780723/MyAnalysis4.zip
unzip MyAnalysis4.zip
rc find_packages # find this package
rc compile # compiles it
The actual analysis is in "Root/MyxAODAnalysis.cxx".
Read more: [[https://twiki.cern.ch/twiki/bin/viewauth/AtlasComputing/SoftwareTutorialxAODAnalysisInROOT#7_Creating_and_saving_histograms | Creating_and_saving_histograms]]
Run it as:
cd MyAnalysis/util # go to the analysis code
fax-get-gLFNs valid2.117050.PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534_tid01482225_00 > inputdata.txt
testRun submitDir # runs over all files inside inputdata.txt
====== Lesson 5: Running on multiple cores ======
This example is not needed for ATLAS connect. If you still want to know how to run an ATLAS analysis job on several cores of your desktop,
look at [[asc:tutorials:2014october#lesson_5running_a_job_on_multiple_cores]]
====== Lesson 6: Using HTCondor and Tier2 ======
In this example we will use the HTCondor workload management system to send jobs to a queue on a Tier3 farm. We will start from the directory of lesson 4, so if you have not done lesson 4, please do it first and verify that your code runs locally.
Start from a new shell and set up the environment, then create this shell script ("startJob.sh", referenced in the submit file below) that will be executed at the beginning of each job on each farm node:
#!/bin/bash
export RUCIO_ACCOUNT=YOUR_CERN_USERNAME
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
localSetupFAX
source $AtlasSetup/scripts/asetup.sh 19.0.2.1,noTest
export X509_USER_PROXY=x509up_u21183
unzip payload.zip
ls
rcSetup -u; rcSetup Base,2.0.12
rc find_packages
rc compile
cd MyAnalysis/util
rm -rf submitDir
echo $1
sed -n $1,$(($1+2))p inputdata.txt > Inp_$1.txt
cp Inp_$1.txt inputdata.txt
cat inputdata.txt
echo "startdate $(date)"
testRun submitDir
echo "enddate $(date)"
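The sed line in the script above selects a three-line window of inputdata.txt, starting at the line number given by the job's first argument, and that window becomes the job's own input list. The same selection, sketched in plain Python for clarity (the function name is ours; the 1-based addressing mirrors sed's):

```python
def pick_window(lines, start, width=3):
    """Return the lines numbered start .. start+width-1 (1-based, inclusive),
    i.e. what `sed -n start,(start+width-1)p` prints for start >= 1."""
    return lines[start - 1:start - 1 + width]
```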
Make sure the RUCIO_ACCOUNT variable is properly set. Make this file executable, and create the submit description file ("job.sub", referenced below) that describes our job's needs and that we will give to condor:
Jobs=10
getenv = False
executable = startJob.sh
output = MyAnal_$(Jobs).$(Process).out
error = MyAnal_$(Jobs).$(Process).error
log = MyAnal_$(Jobs).$(Process).log
arguments = $(Process) $(Jobs)
environment = "IFlist=$(IFlist)"
transfer_input_files = payload.zip,/tmp/x509up_u21183,MyAnalysis/util/inputdata.txt
universe = vanilla
#Requirements = HAS_CVMFS =?= True
queue $(Jobs)
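In this submit file, "queue $(Jobs)" starts 10 processes, and condor substitutes $(Process) with 0 through 9 in turn, so every job writes to its own output, error, and log file. A plain-Python sketch of how the output-file pattern expands (for illustration only):

```python
def expand_output_names(jobs):
    """Expand the MyAnal_$(Jobs).$(Process).out pattern for each queued
    process, as HTCondor does when substituting its $(...) macros."""
    return ["MyAnal_%d.%d.out" % (jobs, process) for process in range(jobs)]
```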
To access files using FAX, the jobs need a valid grid proxy; that is why we send it with each job. The proxy is the file starting with "x509up", so in both job.sub and startJob.sh you should replace "x509up_u21183" with the name of your own grid proxy file. You can find the filename in the environment variable $X509_USER_PROXY.
You need to pack the whole working directory into a payload.zip file:
startJob.sh
rc clean
rm -rf RootCoreBin
zip -r payload.zip *
Now you may submit your task for execution and follow its status in this way:
chmod 755 ./startJob.sh; ./startJob.sh;
~> condor_submit job.sub
Submitting job(s)..........
10 job(s) submitted to cluster 49677.
~> condor_q ivukotic
-- Submitter: login.atlas.ci-connect.net : <192.170.227.199:60111> : login.atlas.ci-connect.net
ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
49677.0 ivukotic 10/9 10:21 0+00:00:11 R 0 0.0 startJob.sh 0
49677.1 ivukotic 10/9 10:21 0+00:00:11 R 0 0.0 startJob.sh 1
49677.2 ivukotic 10/9 10:21 0+00:00:11 R 0 0.0 startJob.sh 2
49677.3 ivukotic 10/9 10:21 0+00:00:11 R 0 0.0 startJob.sh 3
49677.4 ivukotic 10/9 10:21 0+00:00:11 R 0 0.0 startJob.sh 4
49677.5 ivukotic 10/9 10:21 0+00:00:11 R 0 0.0 startJob.sh 5
49677.6 ivukotic 10/9 10:21 0+00:00:11 R 0 0.0 startJob.sh 6
49677.7 ivukotic 10/9 10:21 0+00:00:11 R 0 0.0 startJob.sh 7
49677.8 ivukotic 10/9 10:21 0+00:00:11 R 0 0.0 startJob.sh 8
49677.9 ivukotic 10/9 10:21 0+00:00:10 R 0 0.0 startJob.sh 9
10 jobs; 0 completed, 0 removed, 0 idle, 10 running, 0 held, 0 suspended