[[community:hepsim:|<< back to HepSim manual]]
====== HepSim Analysis Primer ======
This analysis tutorial covers Python codding on the Java platform (Jython). Look at the [[http://atlaswww.hep.anl.gov/hepsim/description.php|main description]] for other topics.
For C++/ROOT, please refer [[asc:promc| ProMC web page]].
To ensure platform independence and a possibility to run programs using web browsers [(Cross-platform validation and analysis environment for particle physics, S.V. Chekanov, I. Pogrebnyak, D. Wilbern, arXiv:1510.06638 http://arxiv.org/abs/1510.06638)], validation programs are written in Jython, which is an implementation of the Python language on the Java platform.
Look at several links:
* [[http://www.jython.org/jythonbook/en/1.0/| Jython book]]
* [[http://jwork.org/dmelt/wikidoc/doku.php?id=start| DMelt manual (community version)]]
The example codes on this page can be run using several options:
* Using Java web start from the browser.
* "hs-ide file.py" command to launch the editor (from the hs-tools package)
* "hs-run file.py" command to run using a command-line ("batch mode", from the hs-tools package)
* [[http://atlaswww.hep.anl.gov/asc/jas4pp/| Jas4pp]]
* [[http://jwork.org/dmelt/ |DMelt IDE]]
====== Accessing data ======
Here is a simple example to make a list of files using the data URL:
from hepsim import HepSim
url="http://mc.hep.anl.gov/asc/hepsim/events/pp/100tev/higgs_pythia8/"
flist=HepSim.getList(url)
print flist
If you do not know URL, you can use the dataset name or URL of the info page of this dataset:
from hepsim import HepSim
url=HepSim.urlRedirector("tev100_ttbar_mg5")
flist=HepSim.getList(url)
print flist
Note that you can find URL list of datasets using the standard Python.
name="tev100_pythia8_minbias_a14"
hepsim_server="http://atlaswww.hep.anl.gov/hepsim/"
import urllib2
response = urllib2.urlopen(hepsim_server+'/geturlmirrors.php?name='+name)
html = response.read()
response.close()
html=html.split(";")
print html # print a list of URL locations
If ProMC files were downloaded to the local file system, one can build the file list as:
from jhplot.utils import FileList
flist=FileList.get("./data","promc")
print flist
Here "./data" is a directory with files, while "promc" indicated that we build the file list using the file extensions ".promc".
Often, the output list contains multiple files generated with different pT or mass ranges. For example, a group of files generated with minimum pT=1000 GeV have names *_pt1000_*.py, while files generated with pt=5000 GeV have names *_pt5000_*.py. One can build a list with pT ranges using this method:
xmin=HepSim.getRanges(flist,"_pt","_");
This methods takes file list, and identify pT ranges given between the strings "_pt" and "_". In the above example, the output list will be [1000,5000].
====== Reading the file header ======
To read data from a ProMC file, you will need these statements:
from proto import FileMC
f=FileMC("data.promc") # get input file from current directory (or URL)
des=f.getDescription() # get description
header = f.getHeader() # to access information on this file
un=float(header.getMomentumUnit()) # conversion energy units
lunit=float(header.getLengthUnit()) # conversion length units
The latter 2 values are needed to convert from integer presentation of 4-momenta, typically used for variable byte encoding to save space,
to double values. The values are typically around 1000-10000, depending on importance of representing low-momentum with a better precision.
The header file also stores some additional information. Look at Java API
* [[http://atlaswww.hep.anl.gov/asc/promc/hepsim/doc/api/ | HepSim API]] for FileMC
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMCHeaderFile.ProMCHeader.html | ProMCHeader]] for MC records
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMCHeaderFile.ProMCHeader.html | ProMCHeader]] for NLO records
====== Reading the event records ======
from proto import FileMC
f=FileMC("data.promc") # get input file from current directory (or URL)
header = f.getHeader() # to access information on this file
un=float(header.getMomentumUnit()) # conversion energy units
lunit=float(header.getLengthUnit()) # conversion length units
for i in range(f.size()): # run over all events in the file
eve = f.read(i) # ProMCEvent record for event "i"
pa = eve.getParticles() # particle information
Look at Java API of the ProMCEvent class:
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMC.ProMCEvent.Event.html | Event ]] for MC
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMC.ProMCEvent.Event.html | Event ]] for NLO
Here is the Java API for "Particles":
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMC.ProMCEvent.Particles.html | Particles]] for MC
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMC.ProMCEvent.Particles.html | Particles]] for NLO
In the above example, "pa" is an object [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMC.ProMCEvent.Particles.html | Particles]] with particle information. We can create a typical Lorentz particle using the class [[http://jwork.org/dmelt/api/doc.php/hephysics/particle/LParticle|LParticle]] as:
from proto import FileMC
from hephysics.particle import LParticle
f=FileMC("data.promc") # get input file from current directory (or URL)
header = f.getHeader() # to access information on this file
un=float(header.getMomentumUnit()) # conversion energy units
lunit=float(header.getLengthUnit()) # conversion length units
for i in range(f.size()): # run over all events in the file
eve = f.read(i) # ProMCEvent record for event "i"
pa = eve.getParticles() # particle information
for j in range(pa.getPxCount()):
px=pa.getPx(j)/un # pX
py=pa.getPy(j)/un # pY
pz=pa.getPz(j)/un # pZ
e= pa.getEnergy(j)/un # energy
m= pa.getMass(j)/un # mass
p=LParticle(px,py,pz,e,m) # fill it
To access the PDG status code, us "pa.getPdgId(j) method. For stable particles, use the HEPEVT status code (the method "pa.getStatus(j)").
Here are the methods to retrieve information from the event records on a particle at a position j:
* **pa.getPdgId(j)** - get PDG id (int)
* **pa.getStatus(j)** - get status information (int)
* **pa.getPx(j), pa.getPy(j), pa.getPz(j), pa.getEnergy(j)** - get Px, Py, Pz, Energy of a particle (in int64 varint)
* **pa.getX(j), pa.getY(j), pa.getZ(j), pa.getT(j)** - get (x, y, z, t) position of a particle (in int64 varint)
* **pa.getMass(j)** - get mass (in int64 varint)
* **pa.getMother1(j), pa.getMother2(j)** - get positions of first and second parent of the particle (int)
* **pa.getDaughter1(j), pa.getDaughter1(j)** - get position of 1st and 2nd daughter (int)
* **pa.getBarcode(j)** - get barcode (int, in filled)
Look at the API of the Lorentz-particle class:
* [[http://jwork.org/dmelt/api/doc.php/hephysics/particle/LParticle|LParticle]] - a simple Lorentz particle with transformations
* [[http://jwork.org/dmelt/api/doc.php/hephysics/particle/HEParticle|HEParticle]] - a typical HEP particle based on the Lorentz particle with transformations
In addition, you can build an ArrayList with particles using a convenient class [[http://jwork.org/dmelt/api/doc.php/hephysics/hepsim/PromcUtil|PromcUtil]].
Also, when possible, use the method "getParticle()" that returns a [[http://jwork.org/dmelt/api/doc.php/hephysics/particle/HEParticle|HEParticle]] particle with 4-momenta and postilion.
from proto import FileMC
from hephysics.hepsim import PromcUtil
f=FileMC("data.promc") # get input file from current directory (or URL)
header = f.getHeader() # to access information on this file
un=float(header.getMomentumUnit()) # conversion energy units
lunit=float(header.getLengthUnit()) # conversion length units
for i in range(f.size()): # run over all events in the file
eve = f.read(i) # ProMCEvent record for event "i"
pa = eve.getParticles() # particle information
par=PromcUtil.getParticleDList(f.getHeader(), pa, 1, -1, 1000) # List with particle 4-momenta
print "Nr of final-state particles=",par.size()
The above example fills a list with all stable particles, without any cuts on pT and Eta (2 second arguments).
The next example creates [[http://jwork.org/dmelt/api/doc.php/hephysics/particle/HEParticle|HEParticle]] using [[http://jwork.org/dmelt/api/doc.php/hephysics/hepsim/PromcUtil|PromcUtil]]:
from proto import FileMC
from hephysics.hepsim import PromcUtil
f=FileMC("data.promc") # get input file from current directory (or URL)
header = f.getHeader() # to access information on this file
un=float(header.getMomentumUnit()) # conversion energy units
lunit=float(header.getLengthUnit()) # conversion length units
for i in range(f.size()): # run over all events in the file
eve = f.read(i) # ProMCEvent record for event "i"
pa = eve.getParticles() # particle information
for j in range(pa.getPxCount()):
particle=PromcUtil.getParticle(pa, un, lunit, j) # a HEP particle
In this example, we fill a list with all stable particles (status "1"), without any pT cut ("-1") and maximum absolute value of pseudo-rapidity 1000.
The object "par" is a list with [[ http://jwork.org/dmelt/api/doc.php/hephysics/jet/ParticleD|ParticleD]].
Finality, let us extend the above example: we will create anti-kT jets using the list "par":
from proto import FileMC
from hephysics.hepsim import PromcUtil
from hephysics.jet import JetN2
ktjet=JetN2(0.5,"antikt",20) # initialize jets with R=0.5, E-mode, anti-KT,min pT=20
print ktjet.info() # print its settings
f=FileMC("data.promc") # get input file from current directory (or URL)
header = f.getHeader() # to access information on this file
un=float(header.getMomentumUnit()) # conversion energy units
lunit=float(header.getLengthUnit()) # conversion length units
for i in range(f.size()): # run over all events in the file
eve = f.read(i) # ProMCEvent record for event "i"
pa = eve.getParticles() # particle information
par=PromcUtil.getParticleDList(f.getHeader(), pa, 1, -1, 1000) # List with particle 4-momenta
ktjet.buildJets(par) # build anti-kT jets
jets=ktjet.getJetsSorted() # get a list with sorted jets
if (len(jets)>0):
print "pT of a leading jet =",jets[0].perp()," GeV"
To build anti-kT jets, we use [[http://jwork.org/dmelt/api/doc.php/hephysics/jet/JetN2|JetN2 clustering algorithm]] implemented in Java.
The description of this algorithm is given in [[http://arxiv.org/abs/1510.06638]].
It is a similar to FastJet N*N algorithm and typically shows
a performance similar to the C++ analogue. Tests indicate that the JetN2 clustering algorithm is about twice slower that C++, but by a factor 20 faster than the traditional N^3 jet clustering algorithms. You can find benchmarking results in [[https://github.com/chekanov/hepjet/|HepJet]] github repository.
Look at a typical example of jet clustering in the [[http://atlaswww.hep.anl.gov/hepsim/code.php?item=88&code=qcd_pythia8_ptbins_jets.py| qcd_pythia8_ptbins_jets.py code]].
====== Event statistics ======
Before closing ProMC file, you may need to access event statistics, cross sections and errors on cross sections.
from proto import FileMC
f=FileMC("data.promc") # get input file from current directory (or URL)
stat = f.getStat() # get statistics
cross=stat.getCrossSectionAccumulated() # accumulated cross section in pb
error=stat.getCrossSectionErrorAccumulated() # error on accumulated cross section in pb
print "Cross section:", cross, "+/-", error, " pb"
f.close()
Look at the API:
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMCStatFile.ProMCStat.html | ProMCStat]] for MC
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMCStatFile.ProMCStat.html | ProMCStat]] for NLO
====== Plots and histograms ======
HepSim uses histogram packages supported by DataMelt (and implemented using JAIDA and FreeHep).
Here are a few most common classes:
* [[http://jwork.org/dmelt/api/doc.php/jhplot/HPlot|HPlot]] - simple canvas to show graphs
* [[http://jwork.org/dmelt/api/doc.php/jhplot/H1D|H1D]] - 1D histogram
* [[http://jwork.org/dmelt/api/doc.php/jhplot/H1D|H2D]] - 2D histogram
* [[http://jwork.org/dmelt/api/doc.php/jhplot/P1D|P1D]] - data container with support of errors
The main canvas to show histograms [[http://jwork.org/dmelt/api/doc.php/jhplot/H1D|H1D]] and data points [[http://jwork.org/dmelt/api/doc.php/jhplot/P1D|P1D]] is [[http://jwork.org/dmelt/api/doc.php/jhplot/HPlot|HPlot]]. To process scripts in a background without a pop-up [[http://jwork.org/dmelt/api/doc.php/jhplot/HPlot|HPlot]], use method "visible(False)", and set sys.exit(0) at the end of the scripts.
Usually, plots are saved in the vector-graphics SVG format. They can be converted on Linux to EPS as:
inkscape --without-gui --export-eps output.eps input.svg
However, one can also save images in PDF or EPS using the corresponding file extensions for the "export" command.
In addition, data are saved in the form of XML (with the extension ".jdat") files. Look the manual [[http://jwork.org/dmelt/wikidoc/doku.php?id=man:io:crossplatform|DatMelt IO]].
For example, one can read such files and create histograms or data objects as:
from jhplot.io import HBook
hb = HBook("http://mc.hep.anl.gov/asc/hepsim/events/pp/100tev/ttbar_mg5/macros/ttbar_mg5.jdat")
print hb.getKeys() # print all the keys
p1=hb.get("1")
print p1.toString()
The object p1 belongs to the class [[http://jwork.org/dmelt/api/doc.php/jhplot/P1D|P1D]]. Analogously, one can store histograms etc.
====== HepSim API ======
All HepSim Jython/Java validation scripts are based on a few classes. With such classes, you can make any observable from the particle lists
stored inside ProMC data files:
===== Histograms and plots =====
* [[http://jwork.org/dmelt/api/doc.php/jhplot/HPlot|HPlot]] - canvas to show X-Y data and histograms in 2D
* [[http://jwork.org/dmelt/api/doc.php/jhplot/HPlot|HPlot3D]] - canvas to show X-Y-Z data and histograms in 3D
* [[http://jwork.org/dmelt/api/doc.php/jhplot/H1D|H1D]] - 1D histogram
* [[http://jwork.org/dmelt/api/doc.php/jhplot/H1D|H2D]] - 2D histogram
* [[http://jwork.org/dmelt/api/doc.php/jhplot/P1D|P1D]] - X-Y container with support of 2-level errors
See also other math and graphics classes in the [[http://jwork.org/dmelt/api/doc.php/jhplot/package-summary | jhplot package]].
===== Lorentz particles and Jets =====
* [[http://jwork.org/dmelt/api/doc.php/hephysics/hepsim/PromcUtil|PromcUtil]] convenient method to fill arrays with particles from ProMC files
* [[http://jwork.org/dmelt/api/doc.php/hephysics/particle/LParticle|LParticle]] a HEP particle with the Lorentz transformations
* [[http://jwork.org/dmelt/api/doc.php/hephysics/jet/FastParticle|FastParticle]] a HEP particles with precomputed Et2,Eta,Phi for jet algorithms.
* [[http://java.freehep.org/freehep-physics/apidocs/hep/physics/vec/package-summary.html|Physics vectors]] typical HEP physics vectors the Lorentz transformations
* [[http://jwork.org/dmelt/api/doc.php/hephysics/jet/JetN2|JetN2]] recommended kt-type jet clustering ng algorithms (kT, anti-kT, CA) implemented in Java using N^2 approach, similar to the FastJet algorithm. Recommended for hadron collisions
* [[http://jwork.org/dmelt/api/doc.php/hephysics/jet/SCJet|SCJet]] traditional kt-type jet clustering algorithms (kT, anti-kT) for pp implemented in Java using N^3 approach (slow)
* [[http://java.freehep.org/freehep-physics/apidocs/hep/physics/jet/package-summary.html|Jets and event shapes]] traditional jet algorithms and event shapes for e+e- (Geneva, Jade, Durham jets)
Java implementation of the longitudinally invariant kT and anti-kT clustering algorithms uses the E-scheme to combine particles (p1+p2) and Eta-Phi space (not Rapidity-Phi). Also, Cambridge/Aachen jets are supported. [[https://github.com/chekanov/hepjet/|HepJet on github]] shows benchmarks of the Java implementation of the anti-KT with the original FastJet package.
You can build the standard kt-jets using [[http://jwork.org/dmelt/api/doc.php/hephysics/hepsim/PromcUtil | PromcUtil]], followed by the
[[http://jwork.org/dmelt/api/doc.php/hephysics/jet/KTjet | KTjet]] class.
===== Working with ProMC files =====
Also, look at the ProMC Java API that is used to store data:
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMCHeaderFile.ProMCHeader.html | ProMCHeader]] for MC and [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMCHeaderFile.ProMCHeader.html | ProMCHeader]] for NLO
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMC.ProMCEvent.Event.html | Event ]] for MC and [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMC.ProMCEvent.Event.html | Event ]] for NLO
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMC.ProMCEvent.Particles.html | Particles]] for MC and [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMC.ProMCEvent.Particles.html | Particles]] for NLO
* [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMCStatFile.ProMCStat.html | ProMCStat]] for MC and [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMCStatFile.ProMCStat.html | ProMCStat]] for NLO
It is simplified compared to full MC simulations, and also includes additional arrays "idata" and "fdata" to store
information on event weights and uncertainties.
The project uses the community edition of [[http://jwork.org/dmelt/|DMelt]] project and the FreeHEP Java Libraries.
--- //[[chekanov@anl.gov|Sergei Chekanov]] 2015/03/06 08:49//