xAOD tutorial at ANL using ATLAS Connect
This tutorial is based on the CERN SoftwareTutorialxAODAnalysisInROOT wiki.
This tutorial uses ATLAS Connect. All lessons from the ASC/ANL tutorial were adapted by Ilija Vukotic for use with ATLAS Connect.
Getting started
First, we will set up the ATLAS release.
1) Get a Kerberos ticket with kinit: kinit username@CERN.CH, where username is your user name at CERN.
2) Create a bash script to set up the ATLAS software:
- setup.sh
#!/bin/bash
export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh
setupATLAS
and then do “source setup.sh”. Set up FAX and ROOT so we can access data remotely:
localSetupFAX --rootVersion=current-SL6
Create the usual grid proxy. You will be prompted to enter your password.
voms-proxy-init -voms atlas
Lesson 1: Looking at xAOD
Let us take a look at a typical xAOD file. You can open it with the ROOT TBrowser as follows:
For ANL:
mkdir lesson_1; cd ./lesson_1
xrdcp $STORAGEPREFIX/atlas/rucio/valid2:AOD.01482225._000140.pool.root.1 AOD.01482225._000140.pool.root.1
root -l AOD.01482225._000140.pool.root.1
TBrowser a
Then click on the branch with the extension “Aux” (for example, “AntiKt4TruthJetsAux”) and then plot “AntiKt4TruthJetsAux.pt”
See further details in the CERN tutorial.
Lesson 2: Using pyROOT to read xAOD
Now we will read the above file in Python and print some values for electrons (eta and phi).
First, we set up our environment in the directory “lesson_1” that contains the xAOD file:
source setup.sh
rcSetup -u; rcSetup Base,2.1.11
Now you can check what versions of packages are linked:
rc version
which shows a long list of package versions.
Then let's create a new directory and put this file in it:
- xAODPythonMacro.py
#!/usr/bin/env python
# Set up ROOT and RootCore:
import ROOT
ROOT.gROOT.Macro( '$ROOTCOREDIR/scripts/load_packages.C' )
ROOT.xAOD.Init()  # Initialize the xAOD infrastructure
fileName = "AOD.01482225._000140.pool.root"  # Set up the input file (use the actual local file name, e.g. with a trailing ".1")
treeName = "CollectionTree"  # default when making transient tree anyway
f = ROOT.TFile.Open( fileName )
t = ROOT.xAOD.MakeTransientTree( f, treeName )  # Make the "transient tree"
# Print some information:
print( "Number of input events: %s" % t.GetEntries() )
for entry in xrange( t.GetEntries() ):
    t.GetEntry( entry )
    print( "Processing run #%i, event #%i" % ( t.EventInfo.runNumber(), t.EventInfo.eventNumber() ) )
    print( "Number of electrons: %i" % len( t.ElectronCollection ) )
    for el in t.ElectronCollection:  # loop over electron collection
        print( "  Electron trackParticle eta = %g, phi = %g" % ( el.trackParticle().eta(), el.trackParticle().phi() ) )
        pass  # end for loop over electron collection
    pass  # end loop over entries
f.Close()
If you copy this script, make sure the indentation is preserved (the top-level lines should start at column 0). This is a good opportunity for you to study the code! Then run it as:
chmod +x xAODPythonMacro.py
./xAODPythonMacro.py
Using this code, one can fill histograms. But the code runs slowly. Below we will show how to use compiled C++/ROOT code to run over this file.
How can you find xAOD variable names without using the ROOT TBrowser? Try this:
asetup 19.1.1.1,slc6,gcc47,64,here
checkSG.py AOD.01482225._000140.pool.root
You will see a table with the names of the variables.
Lesson 3: Analysis program to read xAOD
Now we will create a C++/ROOT analysis program and run it over this input xAOD file. Do not forget to run “kinit username@CERN.CH”. Use the same setup file as above.
mkdir lesson3
cd lesson3
rcSetup -u; rcSetup Base,2.0.12
rc find_packages   # find needed packages
rc compile         # compiles them
This takes some time to compile. Next we will use a simple example code that runs over multiple files located in some directory:
wget https://ci-connect.atlassian.net/wiki/download/attachments/10780723/MyAnalysis3.zip
unzip MyAnalysis3.zip
rc find_packages   # find this package
rc compile         # compiles it
Go to the directory from which the program is run and recreate the input file list (since this dataset is rather large, we suggest removing all but a few files from inputdata.txt):
cd MyAnalysis/util   # go to the analysis code
fax-get-gLFNs valid2.117050.PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534_tid01482225_00 > inputdata.txt
Your analysis is steered by testRun.cxx. We pass “submitDir”, which will be the output directory containing the ROOT file; you must delete it every time you run the code (or use a different output directory). The code runs over the list of files in inputdata.txt. The actual analysis should go into “Root/MyxAODAnalysis.cxx” (which is called from testRun.cxx).
testRun submitDir # runs over all files inside inputdata.txt
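For orientation, testRun.cxx typically follows the standard EventLoop steering pattern. Below is a minimal sketch of such a file; the sample label “ttbar” and the choice of the local DirectDriver are assumptions, and the actual testRun.cxx shipped in MyAnalysis3.zip may differ in details:
// testRun.cxx (sketch): steer the EventLoop job locally
#include <string>
#include "xAODRootAccess/Init.h"
#include "SampleHandler/SampleHandler.h"
#include "SampleHandler/ToolsDiscovery.h"
#include "EventLoop/Job.h"
#include "EventLoop/DirectDriver.h"
#include "MyAnalysis/MyxAODAnalysis.h"

int main( int argc, char* argv[] ) {

   // take the submission (output) directory from the command line
   std::string submitDir = "submitDir";
   if( argc > 1 ) submitDir = argv[ 1 ];

   // set up the xAOD reading infrastructure
   xAOD::Init().ignore();

   // build the sample from the file list inputdata.txt
   SH::SampleHandler sh;
   SH::readFileList( sh, "ttbar", "inputdata.txt" ); // "ttbar" is just a label
   sh.setMetaString( "nc_tree", "CollectionTree" );  // name of the xAOD tree
   sh.print();

   // configure the job and attach the analysis algorithm
   EL::Job job;
   job.sampleHandler( sh );
   job.algsAdd( new MyxAODAnalysis() );

   // run locally; the output ends up under submitDir
   EL::DirectDriver driver;
   driver.submit( job, submitDir );

   return 0;
}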
Lesson 4: Filling histograms
Now we will make a number of changes to the above program. We will fill histograms with the pT of jets and muons. To do this, we made the following changes:
In cmt/Makefile.RootCore we added:
PACKAGE_DEP = EventLoop xAODRootAccess xAODEventInfo GoodRunsLists xAODJet xAODTrigger xAODEgamma JetSelectorTools JetResolution xAODMuon
Then we have modified:
MyAnalysis/MyxAODAnalysis.h   # added new pointers to histograms
Root/MyxAODAnalysis.cxx       # initialized histograms and put loops over jets and muons
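As a rough guide, the histogram-related changes look like the sketch below. This is only a sketch: it assumes the histograms are booked in histInitialize() and that m_event is the xAOD::TEvent handle set up in initialize(); the container names “AntiKt4LCTopoJets” and “Muons” are assumptions, so check the actual keys in your file with checkSG.py.
// MyAnalysis/MyxAODAnalysis.h (sketch): new histogram pointers
// (the "//!" marker keeps ROOT from trying to stream these members)
TH1 *h_jetPt;  //!
TH1 *h_muonPt; //!

// Root/MyxAODAnalysis.cxx (sketch): booking and filling
#include <TH1F.h>
#include "xAODJet/JetContainer.h"
#include "xAODMuon/MuonContainer.h"

EL::StatusCode MyxAODAnalysis::histInitialize() {
   // book the histograms and register them with the worker,
   // so that they end up in the output file
   h_jetPt  = new TH1F( "h_jetPt",  "jet p_{T};p_{T} [GeV];jets",   100, 0, 500 );
   h_muonPt = new TH1F( "h_muonPt", "muon p_{T};p_{T} [GeV];muons", 100, 0, 200 );
   wk()->addOutput( h_jetPt );
   wk()->addOutput( h_muonPt );
   return EL::StatusCode::SUCCESS;
}

EL::StatusCode MyxAODAnalysis::execute() {
   // loop over jets (container name is an assumption, check with checkSG.py)
   const xAOD::JetContainer* jets = 0;
   if( ! m_event->retrieve( jets, "AntiKt4LCTopoJets" ).isSuccess() ) {
      Error( "execute()", "Failed to retrieve jet container" );
      return EL::StatusCode::FAILURE;
   }
   for( const xAOD::Jet* jet : *jets ) {
      h_jetPt->Fill( jet->pt() * 0.001 );   // MeV -> GeV
   }

   // loop over muons (container name is an assumption)
   const xAOD::MuonContainer* muons = 0;
   if( ! m_event->retrieve( muons, "Muons" ).isSuccess() ) {
      Error( "execute()", "Failed to retrieve muon container" );
      return EL::StatusCode::FAILURE;
   }
   for( const xAOD::Muon* mu : *muons ) {
      h_muonPt->Fill( mu->pt() * 0.001 );
   }
   return EL::StatusCode::SUCCESS;
}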
Now we will create a C++/ROOT analysis program and run it over this input xAOD file. Do not forget to run “kinit username@CERN.CH”. Use the same setup file as above.
source setup.sh
mkdir lesson_4; cd lesson_4
mkdir ANLanalysisHisto
cd ANLanalysisHisto
rcSetup -u; rcSetup Base,2.1.11
rc find_packages   # find needed packages
rc compile         # compiles them
This takes some time to compile. Next we will use a simple example code that runs over multiple files located in some directory:
cp -r /users/chakanau/public/2014_tutorial_october_anl/lesson_4/ANLanalysis/MyAnalysis MyAnalysis
rc find_packages    # find this package
rc compile          # compiles it
cd MyAnalysis/util  # go to the analysis code
testRun submitDir   # runs over all files inside inputdata.txt
How does this work? Your analysis is steered by testRun.cxx. We pass “submitDir”, which will be the output directory containing the ROOT file; you must delete it every time you run the code (or use a different output directory). The code runs over the list of files in inputdata.txt. You can create a new list using the script:
python Make_input <directory with xAOD files>
The actual analysis should go into “Root/MyxAODAnalysis.cxx”.
Lesson 5: Running a job on multiple cores
Now we run the above job on multiple cores of the same computer and merge the histogram outputs at the end.
Prepare a fresh directory:
source setup.sh
mkdir lesson_5; cd lesson_5
And setup the package:
mkdir ANLanalysisHisto_threads
cd ANLanalysisHisto_threads
rcSetup -u; rcSetup Base,2.1.11
rc find_packages   # find needed packages
rc compile         # compiles them
This takes some time to compile. Next we will use a simple example code that runs over multiple files located in some directory:
cp -r /users/chakanau/public/2014_tutorial_october_anl/lesson_5/ANLanalysis_threads/MyAnalysis/* ANLanalysis_threads/MyAnalysis/
rc find_packages    # find this package
rc compile          # compiles it
cd MyAnalysis/util  # go to the analysis code
This example has a modified “util” directory compared to the previous example: we changed the “testRun.cxx” file and included a small script “A_RUN” that launches several jobs in parallel. Open the file “A_RUN” and study it. What it does is this:
- Generates input list “inputdata.txt”
- Splits it according to the requested number of threads (2 by default) and puts the pieces into the inputs/ directory
- Launches the jobs (2 by default) using the input files from inputs/; log files will also be written to that directory
- Output files will go to the outputs/ directory
To run the job, run “./A_RUN”. You can monitor jobs with the command:
ps rux | grep -e "testRun"
Once the jobs are done, you can merge the output files into “hist.root”:
hadd -f hist.root outputs/*/hist-run*.root
Lesson 6: Working on the ANL farm
This example is ANL-specific. If you ran analysis code during Run I, you should be familiar with it. We use the Condor/ArCond system to submit jobs to the farm. Data are distributed across the farm nodes to avoid an I/O bottleneck.
First, create a directory for this example:
mkdir lesson_6
Copy the needed directories with the RootCore example:
cp -r /users/chakanau/public/2014_tutorial_october_anl/lesson_6/* lesson_6/
Then check that you can compile the analysis example using RootCore:
cd lesson_6/
source setup.sh
As usual, our analysis is in “MyAnalysis/util/testRun.cxx”
Now we want to submit jobs that run on the data distributed over several computers of the farm. Go to the upper directory and set up the farm environment:
cd ../; source s_asc;
We will send jobs using the directory “submit”. Go to this directory and check the data:
cd submit
arc_ls -s /data2/valid2/117050/PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534/
arc_ls /data2/valid2/117050/PowhegPythia_P2011C_ttbar.digit.AOD.e2657_s1933_s1964_r5534/
The first command shows the summary of distributed data (12 files per server), while the second lists all available data on each node. Now we will send the job to the farm. Change the line:
package_dir=/users/chakanau/public/2014_tutorial_october_anl/lesson_6/ANLanalysis
inside “arcond.conf” to reflect the correct path to your program. Then run “arcond” and say “y” to all questions. This sends jobs to the farm (2 jobs per server). Check the status as:
condor_q
When the jobs are done, the output files will be inside the “Jobs” directory. Merge the ROOT outputs into one file as:
arc_add
This will create the final output file “Analysis_all.root”.