User Tools

Site Tools


hepsim:usage_analysis

Differences

This shows you the differences between two versions of the page.


hepsim:usage_analysis [2024/07/01 21:25] (current) – created - external edit 127.0.0.1
Line 1: Line 1:
 +{{indexmenu_n>4}}
  
 +[[:|<< back to HepSim manual]]
 +
 +====== Analysis Primer ======
 + 
 +
 +This analysis tutorial covers Python codding on the Java platform (Jython). Look at the [[https://atlaswww.hep.anl.gov/hepsim/description.php|main description]] for other topics.
 +For C++/ROOT, please refer [[https://atlaswww.hep.anl.gov/asc/promc/| ProMC web page]].
 +
 +To ensure platform independence and a possibility to run programs using web browsers [(Cross-platform validation and analysis environment for particle physics, S.V. Chekanov, I. Pogrebnyak, D. Wilbern, arXiv:1510.06638 http://arxiv.org/abs/1510.06638)], validation programs are written in Jython, which is an implementation of the Python language on the Java platform.
 +Look at several links:
 +
 +  * [[http://www.jython.org/jythonbook/en/1.0/| Jython book]]
 +  * [[https://datamelt.org/| DataMelt (community version)]]
 +  * [[https://github.com/gavalian/groot/wiki| GRoot]]
 +
 +See [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/|Java code API]].
 +
 +The example codes on this page can be run using several options:
 +
 +  *  Using Java web start from the browser.
 +  * "hs-ide file.py" command to launch the editor (from [[https://atlaswww.hep.anl.gov/hepsim/doc/doku.php?id=hepsim:hs-tools|hs-tools]] package)
 +  * "hs-run file.py" command to run using a command-line ("batch mode", from the [[https://atlaswww.hep.anl.gov/hepsim/doc/doku.php?id=hepsim:hs-tools|hs-tools]] package)
 +  * [[https://atlaswww.hep.anl.gov/asc/jas4pp/| Jas4pp]]
 +  * [[https://datamelt.org/ |DataMelt  IDE]] 
 +
 +
 +====== Accessing data ======
 +
 +Here is a simple example to make a list of files using the data URL:
 +
 +<code python>
 +from hepsim import HepSim
 +url="https://mc.hep.anl.gov/asc/hepsim/events/pp/100tev/higgs_pythia8/"
 +flist=HepSim.getList(url)
 +print flist
 +</code>
 +
 +If you do not know URL, you can use the dataset name or URL of the info page of this dataset:
 +
 +<code python>
 +from hepsim import HepSim
 +url=HepSim.urlRedirector("tev100_ttbar_mg5")
 +flist=HepSim.getList(url)
 +print flist
 +</code>
 +
 +Note that you can find URL list of datasets using the standard Python. 
 +
 +<code python>
 +name="tev100pp_pythia8_minbias_a14"
 +hepsim_server="http://atlaswww.hep.anl.gov/hepsim/"
 +import urllib2
 +response = urllib2.urlopen(hepsim_server+'/geturlmirrors.php?name='+name)
 +html = response.read()
 +response.close()
 +html=html.split(";")
 +print html # print a list of URL locations 
 +</code>
 +
 +If ProMC files were downloaded to the local file system, one can build the file list as:
 +
 +<code python>
 +from jhplot.utils import FileList
 +flist=FileList.get("./data","promc")
 +print flist
 +</code>
 +
 +Here "./data" is a directory with files, while "promc" indicated that we build the file list using the file extensions ".promc".
 +
 +Often, the output list contains multiple files generated with different pT or mass ranges. For example, a group of files generated with minimum pT=1000 GeV have names *_pt1000_*.py, while files generated with pt=5000 GeV have names *_pt5000_*.py. One can build a list with pT ranges using this method:
 +
 +<code python>
 +xmin=HepSim.getRanges(flist,"_pt","_");
 +</code>
 +This methods takes file list, and identify pT ranges given between the strings "_pt" and "_". In the above example, the output list will be [1000,5000].
 +
 +====== Using ProMC files ======
 +
 +
 +===== Reading the file header =====
 +
 +To read data from a ProMC file, you will need these statements:
 +
 +<code python>
 +from proto import FileMC        
 +f=FileMC("data.promc"             # get input file from current directory (or URL)
 +des=f.getDescription()              # get description
 +header = f.getHeader()              # to access information on this file
 +un=float(header.getMomentumUnit())  # conversion energy units
 +lunit=float(header.getLengthUnit()) # conversion length units
 +</code>
 +
 +The latter  2 values are needed to convert from integer presentation of 4-momenta, typically used for variable byte encoding to save space,
 +to double values. The values are typically around 1000-10000, depending on importance of representing low-momentum with a better precision. 
 +The header file also stores some additional information. Look at Java API
 +
 +  * [[https://atlaswww.hep.anl.gov/asc/promc/hepsim/doc/api/ | HepSim API]] for FileMC
 +  * [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMCHeaderFile.ProMCHeader.html | ProMCHeader]] for MC records
 +  * [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMCHeaderFile.ProMCHeader.html | ProMCHeader]] for NLO records
 +
 +
 +===== Reading the event records =====
 +
 +
 +<code python>
 +from proto import FileMC    
 +f=FileMC("data.promc"             # get input file from current directory (or URL)
 +header = f.getHeader()              # to access information on this file
 +un=float(header.getMomentumUnit())  # conversion energy units
 +lunit=float(header.getLengthUnit()) # conversion length units
 +for i in range(f.size()):           # run over all events in the file 
 +    eve = f.read(i)                 # ProMCEvent record for event "i"
 +    pa = eve.getParticles()         # particle information
 +</code>
 +
 +Look at Java API of the  ProMCEvent class:
 +
 +  * [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMC.ProMCEvent.Event.html  | Event ]] for MC
 +  * [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMC.ProMCEvent.Event.html  | Event ]] for NLO
 +
 +Here is the Java API for "Particles":
 +  * [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMC.ProMCEvent.Particles.html | Particles]] for MC
 +  * [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMC.ProMCEvent.Particles.html | Particles]] for NLO
 +
 +In the above example, "pa" is an object [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMC.ProMCEvent.Particles.html | Particles]] with particle information. We can create a typical Lorentz particle using the class [[https://datamelt.org/api/doc.php/hephysics/particle/LParticle|LParticle]] as:
 +
 +<code python>
 +from proto import FileMC  
 +from hephysics.particle import LParticle 
 +f=FileMC("data.promc"             # get input file from current directory (or URL)
 +header = f.getHeader()              # to access information on this file
 +un=float(header.getMomentumUnit())  # conversion energy units
 +lunit=float(header.getLengthUnit()) # conversion length units
 +for i in range(f.size()):           # run over all events in the file 
 +    eve = f.read(i)                 # ProMCEvent record for event "i"
 +    pa = eve.getParticles()         # particle information
 +    for j in range(pa.getPxCount()): 
 +           px=pa.getPx(j)/un         # pX
 +           py=pa.getPy(j)/un         # pY
 +           pz=pa.getPz(j)/un         # pZ
 +           e= pa.getEnergy(j)/un     # energy
 +           m= pa.getMass(j)/un       # mass
 +           p=LParticle(px,py,pz,e,m) # fill it
 +</code>
 +To access the PDG status code, us "pa.getPdgId(j) method. For stable particles, use the HEPEVT status code (the method "pa.getStatus(j)").
 +Here are the methods to retrieve information from the event records on a particle at a position j:
 +
 +  * **pa.getPdgId(j)** - get PDG id (int)
 +  * **pa.getStatus(j)** - get status information (int)
 +  * **pa.getPx(j), pa.getPy(j), pa.getPz(j), pa.getEnergy(j)** - get Px, Py, Pz, Energy of a particle (in int64 varint)
 +  * **pa.getX(j), pa.getY(j), pa.getZ(j), pa.getT(j)** - get (x, y, z, t)  position of  a particle (in int64 varint)
 +  * **pa.getMass(j)** - get mass (in int64 varint)
 +  * **pa.getMother1(j), pa.getMother2(j)** - get positions of first and second parent of the particle (int)
 +  * **pa.getDaughter1(j), pa.getDaughter1(j)** - get position of 1st and 2nd daughter (int)
 +  * **pa.getBarcode(j)** - get barcode (int, in filled)
 +
 +
 +
 +
 +Look at the API of the Lorentz-particle class:
 +
 +  * [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/particle/LParticle.html|LParticle]] - a simple Lorentz particle with transformations
 +  * [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/particle/HEParticle.html|HEParticle]] - a typical HEP particle based on the Lorentz particle with transformations
 +
 +In addition, you can build an ArrayList with particles using a convenient class [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api//hephysics/hepsim/PromcUtil.html|PromcUtil]].
 +Also, when possible, use the method "getParticle()" that returns a [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/particle/HEParticle.html|HEParticle]] particle with 4-momenta and postilion.
 +
 +<code python>
 +from proto import FileMC
 +from hephysics.hepsim import PromcUtil
 +f=FileMC("data.promc"             # get input file from current directory (or URL)
 +header = f.getHeader()              # to access information on this file
 +un=float(header.getMomentumUnit())  # conversion energy units
 +lunit=float(header.getLengthUnit()) # conversion length units
 +for i in range(f.size()):           # run over all events in the file 
 +    eve = f.read(i)                 # ProMCEvent record for event "i"
 +    pa = eve.getParticles()         # particle information
 +    par=PromcUtil.getParticleDList(f.getHeader(), pa, 1, -1, 1000) # List with particle 4-momenta
 +    print "Nr of final-state particles=",par.size()
 +</code>
 +The above example fills a list with all stable particles, without any cuts on pT and Eta (2 second arguments).
 +
 +The next example creates [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/particle/HEParticle.html|HEParticle]] using [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/hepsim/PromcUtil.htnl|PromcUtil]]:
 +<code python>
 +from proto import FileMC
 +from hephysics.hepsim import PromcUtil
 +f=FileMC("data.promc"             # get input file from current directory (or URL)
 +header = f.getHeader()              # to access information on this file
 +un=float(header.getMomentumUnit())  # conversion energy units
 +lunit=float(header.getLengthUnit()) # conversion length units
 +for i in range(f.size()):           # run over all events in the file 
 +    eve = f.read(i)                 # ProMCEvent record for event "i"
 +    pa = eve.getParticles()         # particle information
 +    for j in range(pa.getPxCount()): 
 +           particle=PromcUtil.getParticle(pa, un, lunit, j) # a HEP particle
 +    
 +</code>
 +
 +
 +
 +In this example, we fill a list with all stable particles (status "1"), without any pT cut ("-1") and maximum absolute value of pseudo-rapidity 1000.
 +The object "par" is a list with [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/jet/ParticleD.html|ParticleD]].
 +
 +Finality, let us extend the above example: we will create anti-kT jets using the list "par":
 +<code python>
 +from proto import FileMC
 +from hephysics.hepsim import PromcUtil
 +from hephysics.jet import JetN2 
 +
 +ktjet=JetN2(0.5,"antikt",20)        # initialize jets with R=0.5, E-mode, anti-KT,min pT=20
 +print ktjet.info()                  # print its settings
 +
 +f=FileMC("data.promc"             # get input file from current directory (or URL)
 +header = f.getHeader()              # to access information on this file
 +un=float(header.getMomentumUnit())  # conversion energy units
 +lunit=float(header.getLengthUnit()) # conversion length units
 +for i in range(f.size()):           # run over all events in the file 
 +    eve = f.read(i)                 # ProMCEvent record for event "i"
 +    pa = eve.getParticles()         # particle information
 +    par=PromcUtil.getParticleDList(f.getHeader(), pa, 1, -1, 1000) # List with particle 4-momenta
 +    ktjet.buildJets(par)            # build anti-kT jets              
 +    jets=ktjet.getJetsSorted()      # get a list with sorted jets
 +    if (len(jets)>0):
 +                 print "pT of a leading jet =",jets[0].perp()," GeV" 
 +</code>
 +To build anti-kT jets, we use [[http://jwork.org/dmelt/api/doc.php/hephysics/jet/JetN2.html|JetN2 clustering algorithm]] implemented in Java.
 +The description of this algorithm is given in [[http://arxiv.org/abs/1510.06638]].
 + It is a similar to FastJet N*N algorithm and typically shows 
 +a performance similar to the C++ analogue. Tests indicate that the JetN2 clustering algorithm is about twice slower that C++, but by a factor 20 faster than the traditional N^3 jet clustering algorithms. You can find benchmarking results in [[https://github.com/chekanov/hepjet/|HepJet]] github repository.
 +Look at a typical example of jet clustering in the [[http://atlaswww.hep.anl.gov/hepsim/code.php?item=88&code=qcd_pythia8_ptbins_jets.py| qcd_pythia8_ptbins_jets.py code]].
 +
 +===== Event statistics =====
 +
 +Before closing ProMC file, you may need to access event statistics, cross sections and errors on cross sections. 
 +
 +<code python>
 +from proto import FileMC    
 +f=FileMC("data.promc"                      # get input file from current directory (or URL)
 +stat = f.getStat()                           # get statistics
 +cross=stat.getCrossSectionAccumulated()      # accumulated cross section in pb
 +error=stat.getCrossSectionErrorAccumulated() # error on accumulated cross section in pb 
 +print "Cross section:", cross, "+/-", error, " pb"
 +f.close()
 +</code>
 +
 +Look at the API:
 +
 +  * [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMCStatFile.ProMCStat.html | ProMCStat]] for MC
 +  * [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMCStatFile.ProMCStat.html | ProMCStat]] for NLO
 +
 +
 +
 +====== Using ProIO files ======
 +
 +ProIO files have their own API (https://github.com/decibelcooper/proio).
 +It is at early stage of development. You can look at example of how to use ProIO here [http://atlaswww.hep.anl.gov/hepsim/info.php?item=325)
 +
 +====== Using ROOT files ======
 +
 +ROOT files are used to store Delphes fast simulations.
 +To analyses ROOT files, either use ROOT (see Delphes manual) or the [[https://atlaswww.hep.anl.gov/asc/jas4pp/|Jas4pp program]].
 +
 +
 +
 +
 +
 +====== Plots and histograms ======
 +
 +HepSim uses histogram packages supported by DataMelt (and implemented using JAIDA and FreeHep).
 +
 +Here are a few most common classes:
 +
 +  * [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/HPlot.html|HPlot]] - simple canvas to show graphs
 +  * [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/H1D.html|H1D]] - 1D histogram
 +  * [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/H1D.html|H2D]] - 2D histogram
 +  * [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/P1D.html|P1D]] - data container with support of errors
 +
 +The main canvas to show histograms [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/H1D.html|H1D]] and data points  [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/P1D.html|P1D]]  is [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/HPlot.html|HPlot]]. To process scripts in a background without a pop-up [[http://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/HPlot.html|HPlot]], use method "visible(False)", and set sys.exit(0) at the end of the scripts. 
 +
 +Usually, plots are saved in the vector-graphics SVG format. They can be converted on Linux to EPS as:
 +<code bash>
 +inkscape --without-gui --export-eps output.eps input.svg
 +</code>
 +However, one can also save images in PDF or EPS using the corresponding file extensions for the "export" command.
 +
 +In addition, data are saved in the form of XML (with the extension ".jdat") files. 
 +For example, one can read such files and create histograms or data objects as:
 +
 +<code python>
 +from jhplot.io import HBook
 +hb = HBook("http://mc.hep.anl.gov/asc/hepsim/events/pp/100tev/ttbar_mg5/macros/ttbar_mg5.jdat"
 +print hb.getKeys()         # print all the keys
 +p1=hb.get("1")
 +print p1.toString()
 +</code>
 +The object p1 belongs to the class [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/P1D.html|P1D]]. Analogously, one can store histograms etc.
 +
 +
 +====== HepSim API ======
 +
 +
 +All HepSim  Jython/Java validation scripts are based on a few classes. With such classes, you can make any observable from the particle lists
 +stored inside ProMC data files:
 +
 +===== Histograms and plots =====
 +
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/HPlot.html|HPlot]] - canvas to show X-Y data and histograms in 2D
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/HPlot.html|HPlot3D]] - canvas to show X-Y-Z data and histograms in 3D
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/H1D.html|H1D]] - 1D histogram
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/H1D.html|H2D]] - 2D histogram
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/P1D.html|P1D]] - X-Y container with support of 2-level errors
 +
 +See also other math and graphics classes in the [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/jhplot/package-summary.html | jhplot package]]. 
 +
 +===== Lorentz particles and Jets =====
 +
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/hepsim/PromcUtil.html|PromcUtil]] convenient method to fill arrays with particles from ProMC files
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/particle/LParticle.html|LParticle]] a HEP particle with the Lorentz transformations
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/jet/FastParticle.html|FastParticle]] a HEP particles with precomputed Et2,Eta,Phi for jet algorithms. 
 +  * [[https://java.freehep.org/freehep-physics/apidocs/hep/physics/vec/package-summary.html|Physics vectors]] typical HEP physics vectors the Lorentz transformations
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/jet/JetN2.html|JetN2]] recommended kt-type jet clustering ng algorithms (kT, anti-kT, CA) implemented in Java using N^2 approach, similar to the FastJet algorithm. Recommended for hadron collisions
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hepsimjar/api/hephysics/jet/SCJet.html|SCJet]] traditional kt-type jet clustering algorithms (kT, anti-kT) for pp implemented in Java using N^3 approach (slow) 
 +  * [[https://java.freehep.org/freehep-physics/apidocs/hep/physics/jet/package-summary.html|Jets and event shapes]] traditional jet algorithms and event shapes for e+e- (Geneva, Jade, Durham jets) 
 +  
 +  
 +Java implementation of the longitudinally invariant kT and anti-kT clustering algorithms uses the E-scheme to combine particles (p1+p2) and Eta-Phi space (not Rapidity-Phi). Also, Cambridge/Aachen jets are supported.   [[https://github.com/chekanov/hepjet/|HepJet on github]] shows  benchmarks of the Java implementation of the anti-KT with the original FastJet package.
 +
 +
 +You can build the standard kt-jets using [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/hepsim/PromcUtil.html | PromcUtil]], followed by the 
 +[[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/hephysics/jet/KTjet.html | KTjet]] class. 
 +
 +===== Working with ProMC files =====
 +
 +Also, look at the ProMC Java API that is used to store data:
 +
 +  * [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMCHeaderFile.ProMCHeader.html | ProMCHeader]] for MC and [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMCHeaderFile.ProMCHeader.html | ProMCHeader]] for NLO
 +  * [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMC.ProMCEvent.Event.html  | Event ]] for MC and [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMC.ProMCEvent.Event.html  | Event ]] for NLO
 +   * [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMC.ProMCEvent.Particles.html | Particles]] for MC and  [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMC.ProMCEvent.Particles.html | Particles]] for NLO
 +  * [[https://atlaswww.hep.anl.gov/asc/promc/doc/api/promc/io/ProMCStatFile.ProMCStat.html | ProMCStat]] for MC and [[http://atlaswww.hep.anl.gov/asc/promc/doc/api/pronlo/io/ProMCStatFile.ProMCStat.html | ProMCStat]] for NLO
 +
 +
 +It is simplified compared to full MC simulations, and also includes additional arrays "idata" and "fdata" to store
 +information on event weights and uncertainties.
 +
 +The project uses the community edition of [[https://datamelt.org/|DataMelt]] project and  the FreeHEP Java Libraries.
 +
 +
 +
 +====== GROOT API ======
 +
 +In addition, this tool supports GRoot histograms and canvases (see [[https://github.com/gavalian/groot/wiki| GRoot poject]]). The syntax for histogram creation and plotting  is similar to the PyROOT commands, thus it is easier
 +to use this package for PyROOT users.
 +You can find the Java API of this package in [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/|here]].
 +
 +
 +The following histograms are available:
 +
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/org/jlab/groot/data/H1F.html|H1F 1D histogram]]
 +  * [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/org/jlab/groot/data/H1F.html|H2F 2D histogram]]
 +
 +They can be plotted on the [[https://atlaswww.hep.anl.gov/hepsim/hepsimjar/api/org/jlab/groot/graphics/EmbeddedCanvas.html|EmbeddedCanvas]].
 +
 +
 + --- //[[[email protected]|Sergei Chekanov]] 2015/03/06 08:49//
hepsim/usage_analysis.txt · Last modified: 2024/07/01 21:25 by 127.0.0.1