Recent Changes - Search:

<< Back to ANL ACS

Home Page

Working Examples

PmWiki

edit SideBar

AthenaCodingTutorial

Basic Coding in Athena Tutorial

This tutorial is designed for users who are new to writing analysis code in the Athena environment. In this tutorial, the user will create a simple analysis code from scratch that produces and fills entries in an ntuple. This simple code will then be modified to perform additional tasks (eg. make histograms). As we go along, I will attempt to explain how the code works and how to go about finding the information online that is necessary to access athena libraries. It is hoped that after working through this tutorial, users will be better equipped to understand and modify more complicated analysis codes (like the AnalysisSkeleton example, or the PromptPhoton or HighETJets packages). Any suggestions for improvement are welcomed. (email creator at csuhr --- at --- niu.edu)

It is assumed that you have set up your account so you can access the ANL machines and run athena, root, etc.

In the following, commands to be typed at the command prompt are in red , while text to be copied to a file is in green. Example lines are given in purple .

Use the -X option when you ssh to one of the atlas machines so that root will be able to display graphs from the ntuple you will create. For example, to ssh to atlas16 you should type:

ssh -X atlas16.hep.anl.gov

Part 1: Creating your Package

The initial steps in writing an analysis code are presented at: https://twiki.cern.ch/twiki/bin/view/Atlas/WorkBookCreatingNewPackage The initial code presented here does more than just write a message to the screen, so it is quite a bit more complicated. I will attempt to explain its features as I go along.

To begin, go to the appropriate version directory (v 14.1.0 for this jamboree) and use the cmt create command to create a package.

cd testarea/14.1.0
cmt create Tutorial Tutorial-00-00-01

The package created contains the following directories:

  • cmt - contains files for compiling, cleanup, and setup scripts
  • src - contains the source code .cxx files

You will also create two additional directories and the files they contain.

  • Tutorial (or whatever the package name happens to be) - this contains the header files
  • share - contains the TutorialjobOptions.py file, which is run by athena and specifies the characteristics of the job, eg. data input files, variables that are read into source code, etc.

First, we will create the requirements file in the cmt directory. This file tells the package where to look for the athena libraries that are used in the code and usually declared with an include statement in the header file. I have not found a place where the connections between the various header files are connected with the appropriate lines for the requirements file, so usually one must check previous examples to find what works.

cd Tutorial/cmt

Create the file using an editor like emacs or vi (I'll use emacs in the example) and copying the following text into the file.

emacs requirements

#################################################
package Tutorial
author ANL ATLAS Analysis Center

use AtlasPolicy 		AtlasPolicy-01-*
use GaudiInterface 		GaudiInterface-01-* 		External
use AtlasCLHEP                  AtlasCLHEP-00-*                 External

use McParticleEvent             McParticleEvent-*               PhysicsAnalysis/TruthParticleID
use CompositeParticleEvent      CompositeParticleEvent-*        PhysicsAnalysis/AnalysisCommon
use UserAnalysisUtils           UserAnalysisUtils-*             PhysicsAnalysis/AnalysisCommon

# tools
use AnalysisTools               AnalysisTools-*                 PhysicsAnalysis/AnalysisCommon
use StoreGate                   StoreGate-*                     Control
use AtlasAnalysisRunTime        AtlasAnalysisRunTime-*

library Tutorial *.cxx -s=components *.cxx
apply_pattern component_library

apply_tag  ROOTBasicLibs
apply_tag  ROOTMathLibs
apply_tag  ROOTSTLDictLibs
apply_tag  ROOTGraphicsLibs
apply_tag  ROOTTableLibs
apply_pattern declare_joboptions files="TutorialjobOptions.py"
#################################################

Next, we will crate the Tutorial.cxx file in the src directory. This is the file that contains your primary algorithm in which your ntuple branches are defined, the data file read, and variables manipulated and written to the ntuple.

cd ../src
emacs Tutorial.cxx

Now copy the following text into the file you've created.

#include "Tutorial/Tutorial.h"

/////////////////////////////////////////////////////////////////////////////

///Constructor
MyAlg::MyAlg(const std::string& name, ISvcLocator* pSvcLocator) :
  CBNT_AthenaAwareBase(name, pSvcLocator),
  m_analysisTools("AnalysisTools",this)
{
  //Declare properties
  declareProperty("RecoJetContainer", m_RecoJetContName = "Kt6H1TopoJets");
}

//Destructor

MyAlg::~MyAlg() {}

//Initialize
StatusCode MyAlg::CBNT_initialize(){
  StatusCode sc;
  MsgStream log(messageService(), name());
  log << MSG::INFO << "initialize()" << endreq;

  // Initialize Services
  sc = service("StoreGateSvc", m_storeGate);
  if ( sc.isFailure() ) {
    log << MSG::ERROR << "Unable to retrieve pointer to StoreGateSvc"<< endreq;
    return sc;
  }
  // Initialize Tools
  sc = m_analysisTools.retrieve();
  if ( sc.isFailure() ) {
    log << MSG::ERROR << "Could not get handle on analysis tools!" << endreq;
    return sc;
  }

  //Initialize Branches
  addBranch(m_RecoJetContName+"JetPt", m_jet_pt);


  return StatusCode::SUCCESS;
}

//Finalize
StatusCode MyAlg::CBNT_finalize() {
    MsgStream log(messageService(), name());
    log << MSG::INFO << "finalize()" << endreq;
    return StatusCode::SUCCESS;
}

//Clear
StatusCode MyAlg::CBNT_clear() {

  m_jet_pt->clear();

  return StatusCode::SUCCESS;
}

//Execute
StatusCode MyAlg::CBNT_execute() {
    MsgStream log(msgSvc(), name());

    //
    StatusCode sc;
    //Jet Collection
    const JetCollection* m_jetTES;
    m_jetTES = 0;
    sc = m_storeGate->retrieve(m_jetTES, m_RecoJetContName);
    if(sc.isFailure() || !m_jetTES)
      log <<MSG::WARNING << "No Jets Found." << endreq;
    else
      log <<MSG::INFO << "Jets Retrieved." << endreq;

    //Define iterators over jet container
    JetCollection::const_iterator jetItr  = m_jetTES->begin();
    JetCollection::const_iterator jetItrE = m_jetTES->end();

    for (; jetItr != jetItrE; ++jetItr) {
      //Fill Branch
      m_jet_pt->push_back((*jetItr)->pt()/GeV);
    }


    return StatusCode::SUCCESS;
}

Code Walk Through

Here we explain the main features of the code in Tutorial.cxx, which contains the MyAlg algorithm, the principle algorithm of the analysis code. This is the algorithm called by the TutorialjobOptions.py file (I may just call it jobOptions to save space, but this is the same file) that is run by athena and any others methods that one might wish to define will be called from within MyAlg.

All athena code has three primary functions, initialize(), execute(), and finalize() (Often a clear() function is added as well). The function initialize() is run a single time to set up athena tools and variables used in the actual analysis. Execute() is then run for each event (so if you analyze n events you will run execute() n times). It is in this function that the actual analysis is done and data is manipulated and written to the output ntuple. The function clear() is called after each execute() in order to reset any variables that are used in execute(). Finalize() is run once at the end. In our example, not much is done with it, although one might want to use it to perform calculations with global variables after all events have been run over.

Our code's principle parts and their important features are as follows (these are labelbed by comments in the code):

The constructor: Creates an instance of the MyAlg algorithm. Multiple instances of an algorithm with different job options can be called from within the jobOptions.py file. Later, we will see how this is done. Within the constructor there are statements of the form
declareProperty("string", variable = VALUE );
These define variables that can be set in the jobOptions.py file (We will do this later). Should no value be specifed in the jobOptions file, the variable is set to VALUE as given in the line above.
CBNT_initialize: Initialize is run a single time before the analysis of events begins. Here, the various tools needed to access the data file are set up. First, the StoreGate is initialized, which allows access to the contents of the data file. The tool retrieve is then initialized, which allows one to open specific containers (eg. jets or electrons) and access their contents.
The branches of the output ntuple are also defined in initialize with statements of the form
addBranch("string", variable);
CBNT_execute: In execute we have code to access the JetCollection, pull out the transverse momentum, and write it to a variable in the output ntuple.
First, lets consider the line
%purple const JetCollection* m_jetTES;
which defines a pointer that we use to access the jet objects. We can name the pointer m_jetTES anything we want, but it is important that the name after const corresponds to the particular collection that we want to access (in this case defined in m_RecoJetContName). In this example the container name is "Kt6H1TopoJets" which are part of the JetCollection container class. If you want to access a different particle you will have to find the container name and the container class. We will have an example of this later.
The appropriate JetCollection is then obtained from the StoreGate, jetItr and jetItrE are defined and set as the beginning and end of the JetCollection respectively, and then these iterators are used to run over the jets and pull out the transverse momentum in the loop below. The default units for athena are eV, but by using the variable GeV (defined in a standard library included in our header file) we can set it to GeV.
CBNT_clear(): This resets m_jet_pt to be used for the next event.
CBNT_finalize(): Not much is done here.

Having considered the major features of the Tutorial.cxx, we will continue creating the files necessary for our job.

Now we will create an additional directory inside the src directory and create two files: Tutorial_entries.cxx and Tutorial_load.cxx. These files tell athena where to look for the algorithms called in the topOptions.py file.

mkdir components
cd components

Create the file Tutorial_entries.cxx.

emacs Tutorial_entries.cxx

Fill it with the following text.

#include "Tutorial/Tutorial.h"
#include "GaudiKernel/DeclareFactoryEntries.h"
DECLARE_ALGORITHM_FACTORY( MyAlg )
DECLARE_FACTORY_ENTRIES(Tutorial) {
DECLARE_ALGORITHM( MyAlg )
}

Here MyAlg is the main algorithm in the Tutorial.cxx file (or whatever you should happen to name your principle algorithm). Tutorial is the name of the package.

Now create the file Tutorial_load.cxx and include the following text.

#include "GaudiKernel/LoadFactoryEntries.h"
LOAD_FACTORY_ENTRIES(Tutorial)

Next, we will create the header file Tutorial.h. We will first make a new directory.

cd ../..
mkdir Tutorial
cd Tutorial
emacs Tutorial.h

Now, fill Tutorial.h with the following text

#ifndef MYALG_H
#define MYALG_H

//Tools - basic header files needed for most analysis
#include "GaudiKernel/ToolHandle.h"
#include "GaudiKernel/IToolSvc.h"
#include "GaudiKernel/Algorithm.h"
#include "GaudiKernel/MsgStream.h"
#include "GaudiKernel/ObjectVector.h"
#include "StoreGate/StoreGateSvc.h"
#include "StoreGate/DataHandle.h"
#include "CLHEP/Units/SystemOfUnits.h"
#include "AnalysisTools/AnalysisTools.h"
#include "CBNT_Utils/CBNT_AthenaAwareBase.h"
#include "GeneratorObjects/McEventCollection.h"

//Particles
#include "JetEvent/JetCollection.h"

// containers
#include <vector>
#include <string>
#include <map>

/////////////////////////////////////////////////////////////////////////////
class MyAlg:public CBNT_AthenaAwareBase  {
 public:

  MyAlg (const std::string& name, ISvcLocator* pSvcLocator);
  ~MyAlg();
  virtual StatusCode CBNT_initialize();
  virtual StatusCode CBNT_execute();
  virtual StatusCode CBNT_finalize();
  virtual StatusCode CBNT_clear();

 private:

  //Handles to services
  StoreGateSvc * m_storeGate;
  //Handles to tools
  ToolHandle<AnalysisTools>     m_analysisTools;

  //Particle Containers
  //--Jets
  std::string m_RecoJetContName;

  // Variables for TBranches
  //--Jets
  std::vector<double> * m_jet_pt;

};
#endif 

Now we will create the TutorialjobOptions.py file. This is the file that is actually run by athena and then calls the algorithm in the source code. A number of options can be specified within this file, such as the input file, the number of events to run over, and container names for various ntuple objects.

cd ..
mkdir share
cd share
emacs TutorialjobOptions.py

Fill it with this text.

#--------------------------------------------------------------
# TutorialjobOptions.py
#--------------------------------------------------------------


#--------------------------------------------------------------
#Configure job here
EvtMax = 10;
#Input File
InputFile = "/data/nas2/users/ryoshida/fdr08_run2/fdr08_run2.0052293.physics_Egamma.merge.AOD.o3_f8_m10/fdr08_run2.0052293.physics_Egamma.merge.AOD.o3_f8_m10._0001.1"
# File for histograms and ntuple
OutputFile="Tutorial.root"

# Particle Properties
include( "PartPropSvc/PartPropSvc.py" )
from AthenaCommon.Constants import *
from AthenaCommon.AppMgr import theApp
from AthenaCommon.AppMgr import ServiceMgr
import AthenaPoolCnvSvc.ReadAthenaPool

# The AOD input file
ServiceMgr.EventSelector.InputCollections =[ InputFile ]
ServiceMgr.EventSelector.BackNavigation = True

from RecExConfig.RecFlags  import rec
rec.readAOD=True
rec.doWriteAOD=False
rec.doWriteESD=False

from AthenaCommon.AlgSequence import AlgSequence
topSequence = AlgSequence()

# Athena-Aware NTuple making Tools
CBNTAthenaAware = True
include ("CBNT_Athena/CBNT_AthenaAware_jobOptions.py")
include ("CBNT_Athena/CBNT_EventInfo_jobOptions.py")

# Add top algorithms to be run
from Tutorial.TutorialConf import MyAlg

MyAlg1 = MyAlg("MyAlg1")
topSequence.CBNT_AthenaAware +=  MyAlg1
# JetAlg Options
#
#End Options

# Number of events to be processed (default is 10)
theApp.EvtMax = EvtMax

##########################################
# setup TTree registration Service
# save ROOT histograms and Tuple
from GaudiSvc.GaudiSvcConf import THistSvc
ServiceMgr += THistSvc()
exec 'ServiceMgr.THistSvc.Output = ["AANT DATAFILE=\'%(file)s\' OPT=\'RECREATE\'"] ' % { 'file': OutputFile }
from AnalysisTools.AnalysisToolsConf import AANTupleStream
topSequence += AANTupleStream()
AANTupleStream.ExtraRefNames = [ "StreamESD","Stream1" ]
AANTupleStream.OutputName = OutputFile
AANTupleStream.WriteInputDataHeader = True
AANTupleStream.OutputLevel = WARNING

# Set output level threshold (DEBUG, INFO, WARNING, ERROR, FATAL)
MessageSvc = Service( "MessageSvc" )
MessageSvc.OutputLevel = INFO

Part 2: Running the Code

Now that we are done creating the necessary files, we are ready to compile and run the program.

cd ../cmt
cmt config
source setup.sh
cmt broadcast gmake

Compiling will take a few minutes. Once it is done, cd to the share directory and run athena using the TutorialjobOptions.py file.

cd ../share
athena.py TutorialjobOptions.py > output.log

The last part of that command (> output.log) is optional and writes the messages produced while the program runs to a file called output.log. This makes it easier to search through the program output after it has run.

As the code runs, you will likely receive a few warning messages. These can be ignored. After it is done running, open the output.log file and go to the bottom. The final line should include "leaving with code 0: "successful run"". If this is the case, your code has run and produced an ntuple. To read this ntuple start root and open a new TBrowser.

root -l

And at the root prompt type:

new TBrowser

A new TBrowser will open and you should browse to the file Tutorial.root. Click through until you get a folder named "Collecton Tree". This contains the branch you crated. Click to open it.

A number of branches are created by default (eg. BCID, Event), but the one you created is titled Kt6H1TopoJetsPt. Click on this branch to see a histogram of the results.

Part 3: Modifying TutorialjobOptions.py

By modifying the TutorialjobOptions.py file it is possible to change many of the parameters of your athena job without recompiling your code.

Some of these are very simple. For example, the code above only runs over 10 events. If we wanted to run over more, we could change the number after EvtMax to something else. If we change it to -1 athena will run over all events in the file.

We can also change the level of messages that are output to either the screen or our output.log file. These message statements have the form:

log<<MSG::INFO<<"Stuff you want to output"<<endreq;

The levels of message are DEBUG, INFO, WARNING, ERROR, and FATAL. The message output level can be set toward the end of the TutorialjobOptions.py file by changing the line that reads MessageSvc.OutputLevel = INFO. Replace the INFO with whichever output level you desire. Less stringent message output levels include messages of all more stringent types. For example, if you set the level at INFO, messages marked WARNING, ERROR, and FATAL will also be printed, but if you set it at FATAL only those messages will be printed.

The input file can also be changed by changing the path and filename specified after the InputFile. More than one input file can be specified as well. To do this, the proper syntax is:

ServiceMgr.EventSelector.InputCollections =["filename1", "filename2",etc ]

Where "filenameX" gives the whole path to the file. Currently the brackets are filled with the variable InputFile, which is specified toward the top of the options file.

For the moment, we will leave all these options be and run another instance of the algorithm MyAlg over the same 10 events, but this time we will use a different JetCollection. Look at the following lines in the options file.

MyAlg1 = MyAlg("MyAlg1")
topSequence.CBNT_AthenaAware += MyAlg1

These lines tell athena to run the MyAlg algorithm, this particular instance of the algorithm being named "MyAlg1". We can run the MyAlg algorithm another time by adding the following additional code directly below it.

MyAlg2 = MyAlg("MyAlg2")
topSequence.CBNT_AthenaAware += MyAlg2
MyAlg2.RecoJetContainer = "Cone7H1TopoJets"

This runs another instance of MyAlg, but this time, rather than leave the default "Kt6H1TopoJets" as the container we have specified that the "Cone7H1TopoJets" container be used. The syntax used is InstanceName.VariableName = Value. Here InstanceName is the name given to the instance of the algorithm which is specified two lines above. VariableName is the same as the string in brackets in the corresponding declareProperty statement in the file Tutorial.cxx. Value can be whatever the information you want read in. It might be a number if one was specifying a value for a cut of some kind, but in this case it is a string.

Now we will run the program again. There is no need to recompile since we have only changed the job options file, so you need only cd to the share directory and type:

athena.py TutorialjobOptions.py > output.log

After it has run, open a TBrowser and examine the contents of the ntuple. There should now be a branch named "Cone7H1TopoJetsJetPt" in addition to "Kt6H1TopoJetsJetPt".

Part 4: Adding a New Options Variable

By new options variable, I mean one that can be accessed via the jobOptions file as we did with the Jet Container name. Here we will add a variable to perform a cut on transverse momentum, so only those jets with a transverse momentum above our cut value will be written to our ntuple.

First, open the Tutorial.h file and insert the following line under the heading "Variables for TBranches":

double m_jetptcut;

Now, open the Tutorial.cxx file and insert a delclareProperty statement for the variable:

declareProperty("JetEtaCut", m_jetptcut = 50.0);

The default value is set to 50.0 GeV, but we can change this in the jobOptions file if we wish. After the constructor has been called in the jobOptiong file (eg. a line like MyAlg1 = MyAlg("MyAlg1")), you can insert a line like:

MyAlg1.JetPtCut = 60

We must also add code to Tutorial.cxx to implement our cut. The line that writes the Pt needs to be contained within a loop as follows:

if ((*jetItr)->pt()/GeV > m_jetptcut) {
	m_jet_pt->push_back((*jetItr)->pt()/GeV);
      }

Now we are ready to recompile. The procedure will be a bit different since we have changed the header file. To recompile, first cd to the cmt directory and then type the following commands:

make clean
cmt config
source setup.sh
cmt broadcast gmake

Often, if one makes changes to only the .cxx files, simply using gmake will often work. If one has changed the header files this will not always work, so make clean should be used.

After it is done compiling you can run it with the same commands as before. Your output ntuple should have far fewer jets, since all those with low transverse momentum have been eliminated.

Part 5: Adding a New Particle Container

We will now add an additional Collection to our code so that we can access information about electrons. The first thing we must do is find the correct container class and the container name. The place to start is here. Since we are using AOD, the proper container class is ElectronContainer and the proper container name is "ElectronAODContainer".

We must also find the appropriate header file that allows us to access the electron container objects as JetCollection.h allows us to access jets. I have not found a place where the container and the header files are all listed in a nice logical list, so you have to do a bit of hunting. The best place to do this is in the LXR. I searched in version 14 for "ElectronContainer" and eventually I found a file that included in its declarations the header I am looking for:

%green% #include "egammaEvent/ElectronContainer.h" 

Open Tutorial.h and add this line to the include statements near the top.

We will also want to add a variable for the container name, a vector for the transverse momentum values, and a double for a cut on transverse momentum, as we did for the jets. So add the following lines to your Tutorial.h file as well.

std::string m_ElecContName;
std::vector<double> * m_elec_pt;
double m_elecptcut;

Now open Tutorial.cxx. The first thing we will do is add the declareProperty statements in the constructor.

declareProperty("ElectronContainer", m_ElecContName = "ElectronAODCollection");
declareProperty("ElecPtCut", m_elecptcut = 50.0);

Now we must add a branch to the ntuple. In CBNT_initialize(), after statement that declares the branch for jets, add the following:

addBranch(m_ElecContName+"Pt", m_elec_pt);

And to CBNT_clear() add a statement clearing m_elec_pt for run of execute:

m_elec_pt->clear();

Finally, in CBNT_execute() you must add the code that actually accesses the electron containers from StoreGate, pulls out the transverse momentum, and writes it to the vector we created.

const ElectronContainer* m_elecTES;
    m_elecTES = 0;
    sc = m_storeGate->retrieve(m_elecTES, m_ElecContName);
    if(sc.isFailure() || !m_elecTES)
      log <<MSG::WARNING << "No Electrons Found." << endreq;
    else
      log <<MSG::INFO << "Electrons Retrieved." << endreq;

    ElectronContainer::const_iterator elecItr = m_elecTES->begin();
    ElectronContainer::const_iterator elecItrE = m_elecTES->end();

    for (; elecItr != elecItrE; ++elecItr) {
      if((*elecItr)->pt()/GeV > m_elecptcut) {
	m_elec_pt->push_back((*elecItr)->pt()/GeV);
      }
    }

Recompile, using the make clean and cmt broadcast make commands since we have made substantial changes to the header file, and run the athena job. If you pull up a TBrowser and check Tutorial.root you should see a branch for the electrons. If you haven't modified your jobOptions.py file since Part 2, you will have two identical branches, since the MyAlg algorithm ran twice.

Part 6: Histograms

Edit - History - Print - Recent Changes - Search
Page last modified on July 14, 2008, at 01:37 PM