ProMC Examples

(written by S.Chekanov, ANL)

ProMC binary files are compact and self-describing, typically 30-50% smaller than ROOT files or gzipped HEPMC files. The examples below show how to write, read and browse ProMC files, and how to convert HEPMC files to ProMC files. More information is in the Introduction.

After the installation, all examples are located in the directory:

$PROMC/examples

File browser

This is a very simple example. You can work with ProMC files without installing the ProMC package. You need any Linux/Windows/Mac machine with Java 7 installed (check this with “java -version”; it should report version 1.7.X). This is the example for Linux/Mac:

wget  http://atlaswww.hep.anl.gov/asc/promc/download/Pythia8.promc      # get example file
wget  http://atlaswww.hep.anl.gov/asc/promc/download/browser_promc.jar  # get browser
java -jar browser_promc.jar  Pythia8.promc                              # launch the browser

This will bring up a GUI window so one can look at separate events and the data layout (see the example below).

On Windows: click “browser_promc.jar” and open this file in the browser as: [File]→[Open file]. Read more details in the manual.

If ProMC is installed, simply run the command promc_browser <file>.promc.

Reading a ProMC file with unknown data layout

ProMC files are self-describing. If you have a ProMC file but do not know how the data are organized inside it, you can generate analysis code in C++, Java or Python. For this you need to install ProMC.

Let us assume we want to read Monte Carlo events generated for the Snowmass 2013 studies. Such events were processed by the Delphes fast simulation, so in addition to the truth particle record they contain reconstructed jets, muons, photons, etc. Here are the steps to read such data in C++/Java/Python:

mkdir test    # create a test directory
cd test
wget http://atlaswww1.hep.anl.gov/asc/snowmass2013/delphes36/promc/MadGraph5Pythia_wjets/MadGraph5Pythia_wjets_mu0.promc
promc_info  MadGraph5Pythia_wjets_mu0.promc # check information about this file
promc_proto MadGraph5Pythia_wjets_mu0.promc # extracts data layouts into the directory "proto"
promc_code                                  # generate analysis code in src/, java/, python/
make                                        # compiles C++ code reader.cc
./reader MadGraph5Pythia_wjets_mu0.promc    # runs the C++ analysis code

The command “promc_proto” is important: it extracts a platform-neutral layout of the stored data. The command “promc_code” then generates language-specific source code in C++ (src/), Java (java/) and Python (python/). One can also use the Java browser to look at detailed information about this file (see later).

Now you can modify the program “reader.cc”. Look at the data layout in “proto/ProMC.proto”. The corresponding C++ source code is in “src” (see the src/ProMC.* files). See the Protocol Buffers language guide for the syntax used in such files.

You can also run over this file using Java (without C++-dependent libraries). Go to the directory “java” and run the example “run.sh”:

cd java
run.sh ../MadGraph5Pythia_wjets_mu0.promc

You can modify the example code “ReadProMC.java” and check the available methods in “src/promc/io/ProMC.java”

The command “promc_code” also generates a code example in Python. Go to the directory “python” and run the script:

cd python
python reader.py ../MadGraph5Pythia_wjets_mu0.promc

Modify the analysis code as needed. You can look at Python modules in the directory “modules”.

Writing ProMC files

Let us write “fake” Monte Carlo events (created using random numbers) and read them back. Copy the “examples” directory from the installation directory and run the examples. We assume that you have set up ProMC by running “source setup.sh” (see the Installation section).

cp -r $PROMC/examples .
cd examples/random/
ln -s $PROMC/proto/promc  proto  # link the default data-layout files
promc_code                       # create analysis source code (C++ in src/, Java in java/, Python in python/)
make                             # compile the code
./writer                         # write random events to out/; it also writes a TXT file (similar to HEPMC)

The example generates the file “out/output.promc” with “fake” ProMC event records. In addition, it dumps the same information to the ASCII file “out/output.txt” for comparison.

It is good practice to keep a directory “proto” with the ProtoBuffers files used to create the event records, so that one can later generate C++, Java or Python code for reading the data stored inside the ProMC files. This is what makes the output file “self-describing”. You can either create the “proto” directory yourself and put the ProtoBuffers files into it, or link the existing default “proto” directory. The command “promc_code” generates the C++, Java and Python code from the existing “proto” directory with the data description.

Even better, one can embed the log file in the ProMC file, provided it has the name “logfile.txt”. This removes the need to keep separate log files alongside the ProMC files. To attach the log file, just make sure that “logfile.txt” is located in the same directory:

./writer > logfile.txt 2>&1         # logfile.txt will be attached to the ProMC file

In this case, “logfile.txt” is created and automatically embedded (and compressed) inside the “out/output.promc” file. You can extract this file later as “promc_log file.promc”.

Now, try to read the “out/output.promc” file as:

./reader  # read the file in out/output.promc

For 1000 events with 5000 particles (using a record similar to HEPMC), the output is

174 K   event.proto     # single event record as ProtoBuf message
105 MB  output.promc    # all events in one ProMC file
325 MB  output.txt      # all events as a text file

The ProMC file is a factor 3.1 smaller than the TXT file. After gzipping the text file, you can compare the gzipped version with the ProMC output:

136 MB  output.txt.gz

so the gzip-compressed text file (136 MB) is about 30% larger than the ProMC file (105 MB), which indicates an improvement in the compression. The size of the improvement depends on the pT distribution. In the above example, we used exponential random numbers to mimic the Px, Py, Pz spectra. If the pT spectrum falls faster (which is normally the case for pileup), the compression will be better. If the momenta follow a flat distribution, there is not much improvement compared to the gzipped version of the TXT file.
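One way to see why the spectrum matters is to count the bytes needed per value before any zip compression. The sketch below assumes that each momentum component is stored as a signed integer, round(value × unit), written with the Protocol Buffers zigzag varint encoding. The unit of 100000 and the three toy spectra are illustrative assumptions, not values taken from the ProMC examples.

```python
import random

def zigzag(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned one (protobuf sint64 rule)."""
    return (n << 1) ^ (n >> 63)

def varint_size(u: int) -> int:
    """Number of bytes a non-negative integer occupies as a base-128 varint."""
    size = 1
    while u >= 0x80:
        u >>= 7
        size += 1
    return size

def encoded_size(value: float, unit: int = 100000) -> int:
    """Bytes for one momentum component stored as round(value * unit).

    The default unit is a hypothetical choice made for this illustration.
    """
    return varint_size(zigzag(int(round(value * unit))))

def mean_size(samples) -> float:
    """Average encoded size in bytes over a list of values."""
    return sum(encoded_size(x) for x in samples) / len(samples)

rng = random.Random(1)  # fixed seed so the comparison is reproducible
n = 10000
soft = [rng.expovariate(1.0 / 0.5) for _ in range(n)]   # steeply falling, mean 0.5
hard = [rng.expovariate(1.0 / 50.0) for _ in range(n)]  # slowly falling, mean 50
flat = [rng.uniform(0.0, 1000.0) for _ in range(n)]     # flat between 0 and 1000

for label, samples in (("soft", soft), ("hard", hard), ("flat", flat)):
    print(label, "mean bytes per value:", round(mean_size(samples), 2))
```

Small integers need fewer varint bytes, so a spectrum concentrated at low values starts from a smaller raw size than a flat one. This only illustrates the trend; the final file size also depends on the zip compression applied afterwards.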

This example still stores a mass for each particle. Using the built-in map in the header record, we can set the masses to 0 for the most common particles. This should reduce the file size further (by an estimated 10%).
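The idea can be sketched as a lookup table kept once per file. The dictionary, the function names and the rule "write 0 when the PID is in the table" are hypothetical illustrations rather than the ProMC API, and the sketch assumes that the generated mass of a listed particle equals its nominal mass.

```python
# Hypothetical header map: PDG ID -> nominal mass in GeV, stored once per file.
HEADER_MASS = {
    22: 0.0,                            # photon
    11: 0.000511, -11: 0.000511,        # electron / positron
    211: 0.13957, -211: 0.13957,        # charged pions
    2212: 0.93827, -2212: 0.93827,      # proton / antiproton
}

def mass_to_store(pid: int, mass: float) -> float:
    """Value written per particle: 0 when the header map already covers this PID."""
    return 0.0 if pid in HEADER_MASS else mass

def mass_on_read(pid: int, stored: float) -> float:
    """Recover the mass: take it from the header map if present, else use the stored value."""
    return HEADER_MASS.get(pid, stored)

# Toy event: (PID, generated mass in GeV); the Z boson (23) is not in the map.
particles = [(211, 0.13957), (22, 0.0), (23, 90.7)]
stored = [mass_to_store(pid, m) for pid, m in particles]
restored = [mass_on_read(pid, s) for (pid, _), s in zip(particles, stored)]
print(stored)
print(restored)
```

Zeros are cheap to encode and compress well, which is where a saving would come from.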

You can access the metadata (description, proto files used to generate the event records, and log file) as explained in the Introduction section.

Reading data using Java

The above example also generates the code in Java. See the directory “java”. Make sure that Java7 is installed (type java -version). Go to this directory and run:

./run.sh ../out/output.promc

You can modify the example code “ReadProMC.java” and check the available methods in “src/promc/io/ProMC.java”

Reading data using Python

The above example also generates the code in Python. See the directory “python”. Go to this directory and run:

python reader.py ../out/output.promc

You can modify the example code “reader.py” and check the available methods in “modules/”

C++ examples. Filling ROOT file from ProMC

The example in the directory “examples/root” shows how to fill a ROOT tree from a ProMC record. We assume that the file “output.promc” from the previous example was already created. Go to “examples/root” and type “make” (ROOT should be installed). Then dump the ProMC record to the ROOT tree file. We assume that the 4-momenta are written as “Double32_t” (stored as 4-byte floats). The output file is “out/output.root”.

Benchmark summary

Here is a comparison of file sizes for the same event records as written to “examples/random/out/output.promc”. The benchmark was done using the examples from the “examples” directory.

File format                   Size (MB)
ASCII TXT file (“HEPMC”)      346
gzipped ASCII TXT file        138
ROOT tree (Double32_t)        158
ProMC                         112

As one can see, the equivalent ROOT file is about 40% larger than the ProMC file, and the gzipped ASCII TXT (“HEPMC”) file is about 23% larger.
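A size comparison can be quoted in two ways: how much larger the other file is than the ProMC file, or how much smaller the ProMC file is than the other file. The short sketch below computes both from the sizes in the table. The quoted 40% and 23% correspond to the first reading; the helper names are just for this illustration.

```python
# File sizes in MB, copied from the benchmark table.
sizes_mb = {
    "ASCII TXT file": 346,
    "gzipped ASCII TXT file": 138,
    "ROOT tree (Double32_t)": 158,
    "ProMC": 112,
}

def percent_larger(size: float, reference: float) -> float:
    """How much larger `size` is than `reference`, in percent of `reference`."""
    return 100.0 * (size / reference - 1.0)

def percent_smaller(small: float, large: float) -> float:
    """How much smaller `small` is than `large`, in percent of `large`."""
    return 100.0 * (1.0 - small / large)

promc = sizes_mb["ProMC"]
for name, size in sizes_mb.items():
    if name == "ProMC":
        continue
    print("%-24s %5.1f%% larger than ProMC; ProMC is %5.1f%% smaller"
          % (name, percent_larger(size, promc), percent_smaller(promc, size)))
```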

Writing Pythia8 event record

Here is a small example of how to write a Pythia8 event record using the ProMC data format. It is located in the directory:

$PROMC/examples/pythia

You should set the location of PYTHIA and HEPMC (this example also writes HEPMC events for comparison). In the Makefile, make sure that the ProMC include files needed for compilation are on the include path (this assumes that the PROMC variable was set by running setup.sh):

-I${PROMC}/include

You will need to link 2 libraries:

 -L${PROMC}/lib -lprotoc -lcbook

An example program that fills the ProMC file record is given in writer_pythia.cc. The makefile is given in Makefile.

An example that shows how to read such a ProMC file can be found in “examples/random/reader.cc”.

ProMC File Browser

You can browse events and other information stored in ProMC files using a browser (implemented in Java; it runs on Linux/Windows/Mac). First, get the browser:

wget  http://atlaswww.hep.anl.gov/asc/promc/download/browser_promc.jar

Then run it (this assumes Java 7 or above):

java -jar browser_promc.jar

Now we can open a ProMC file. Let's get an example ProMC file which keeps 1,000 events generated by Pythia8. We download this file and run several commands to check what is inside:

wget  http://atlaswww.hep.anl.gov/asc/promc/download/Pythia8.promc

Open this file in the browser as: [File]→[Open file]. Or you can open it using the prompt:

java -jar browser_promc.jar Pythia8.promc

This opens the file and shows the metadata (i.e. information stored in the header and statistics records):

Note: on Linux/Mac, you can open the browser as:

promc_browser Pythia8.promc

On the left, you will see event numbers. Double click on any number. The browser will display the event record with all stored particles for this event (PID, Status, Px, Py, Pz, etc.).

You can access metadata on the stored particle data, such as particle types, PIDs and masses, using the [Metadata]→[Particle data] menu. This information is common to all events: to save space, ProMC does not store particle names and masses for each event.

If the ProMC file was made “self-describing” and stores templates for data layouts used to generate analysis code, you can open the “Data layout” menu:

You can look at event information (process ID, PDF, alphaS, weight) if you navigate with the mouse to the event number on the left, and click on the right button. You will see a pop-up menu. Select “Event information”.

ProMC vs HepMC

The event record in Pythia8.promc can be compared with the standard HEPMC file (in gzipped form). Get the equivalent HEPMC file:

wget  http://atlaswww.hep.anl.gov/asc/promc/download/Pythia8.hepmc.gz

You can see that the equivalent HEPMC file after standard compression (gzip) is about 107 MB, while the corresponding information fits into 29 MB in the ProMC record.

Reading and writing in Java

One can read event records generated by ProMC in Java (natively), without external C++ libraries. Look at the example in “examples/random/java”. Assuming that “output.promc” was already generated by the writer example, you can execute this example from inside “examples/random/java/” as

./run.sh  ../out/output.promc  # compiles Java code and reads this file

This compiles the Java class ReadProMC which reads the “output.promc” file.

Reading data using Jython

You can make histograms on any platform (Windows/Linux/Mac) using SCaVis. Copy the “browser_promc.jar” file from “example/browser/” in the installation directory to the directory “lib/user” of the SCaVis installation. To avoid a clash with the library shipped with SCaVis, remove “promc-protobuf.jar” inside the SCaVis installation directory:

rm   lib/system/promc-protobuf.jar

(on Windows, go to this directory and remove this file) and restart SCaVis. Then you can write a small Jython script like this:

# Reading a Pythia8 file in the ProMC format using SCaVis http://jwork.org/scavis
# S.Chekanov (ANL)
from java.io import *
from java.awt import *
from promc.io import *
from proto import *   # import FileMC
from jhplot import *  # import ScaVis graphics
 
file = FileMC("Pythia8.promc")
print "ProMC version=",file.getVersion()
print "Last Modified=",file.getLastModified()
header = file.getHeader()                     # get header file
unit=float(header.getMomentumUnit())
lunit=float(header.getLengthUnit())
print "Momentum unit=",unit
print "Length unit=",lunit
 
for j in range(header.getParticleDataCount()): # look at PDG info stored in the header
  d = header.getParticleData(j)
  pid = d.getId(); mass = d.getMass(); name = d.getName();
  print name, pid, mass
 
h1= H1D("Px",100,0,10)   # create a histogram
print "File size=",file.size()
 
for i in range(file.size()):  # run over all events
      if (i%100==0): print "Event=",i
      entry = file.read(i)
      p = entry.getParticles()                 # get particles
      for j in range( p.getPxCount() ):
            h1.fill(p.getPx(j)/unit)
c1 = HPlot("Canvas",600,400) # plot histogram
c1.visible()
c1.setAutoRange()
c1.draw(h1)
c1.export("px.pdf")                          # create PDF file

Now get a ProMC file:

wget  http://atlaswww.hep.anl.gov/asc/promc/download/Pythia8.promc

Start SCaVis and run this script. It will show the Px spectrum for all stored particles. (You can load this script as “scavis.sh promc.py”.)
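The script divides each stored value by the momentum unit before filling the histogram. Those two steps, scaling and binning, can be sketched in plain Python without SCaVis. The unit value and the stored integers below are made up for illustration, and the function is only a stand-in that assumes 100 equal bins between 0 and 10, matching the arguments passed to H1D in the script.

```python
def fill_histogram(stored_values, unit, nbins=100, lo=0.0, hi=10.0):
    """Scale stored values by 1/unit and count them in nbins equal bins over [lo, hi).

    Returns (counts, outside), where `outside` counts values beyond the range.
    """
    counts = [0] * nbins
    outside = 0
    for stored in stored_values:
        x = stored / unit                       # physical value, as in p.getPx(j)/unit
        if lo <= x < hi:
            index = int((x - lo) * nbins / (hi - lo))
            counts[min(index, nbins - 1)] += 1  # guard against rounding at the upper edge
        else:
            outside += 1
    return counts, outside

unit = 100000.0                                 # hypothetical momentum unit
stored_px = [55000, 152000, 995000, 1200000, -30000]
counts, outside = fill_histogram(stored_px, unit)
print("filled bins:", [i for i, c in enumerate(counts) if c])
print("values outside the range:", outside)
```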

Random access

You can extract a given record/event using the random-access capabilities of this format. Check the example in “examples/random_access”. Type “make” to compile it, then run the code. You can see that the needed event is extracted using the method “event(index)”.

Reading data remotely (with random access)

You can read or stream data from a remote server without saving files locally. The easiest way is to use the Python reader (see the example in examples/python). Below we show how to read one single event (event=100) remotely using Python:

# Chekanov. Shows how to read an event from a remote file with MC events
import urllib2, cStringIO, zipfile

url = "http://atlaswww1.hep.anl.gov/asc/snowmass2013/delphes36/TruthRecords/higgs14tev/pythia8/pythia8_higgs_1.promc"

try:
    remotezip = urllib2.urlopen(url)
    zipinmemory = cStringIO.StringIO(remotezip.read())  # keep the remote file in memory
    archive = zipfile.ZipFile(zipinmemory)
    for fn in archive.namelist():
        # print fn
        if fn == "100":                                 # entry named after the event number
            data = archive.read(fn)
            print "Read event=100"
except urllib2.HTTPError:
    print "no file"

In this example, “data” represents a ProMC event record. Look at the example in examples/python to see how to print such information.
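The modules urllib2 and cStringIO exist only in Python 2. On Python 3 the same idea can be sketched with the standard-library modules urllib.request, io and zipfile. The helper names below are hypothetical, and the offline demonstration uses a stand-in archive with placeholder payloads rather than real ProMC messages.

```python
import io
import urllib.request
import zipfile

def fetch(url: str) -> bytes:
    """Download the whole remote file into memory (no local copy is written)."""
    with urllib.request.urlopen(url) as response:
        return response.read()

def read_entry(archive_bytes: bytes, name: str) -> bytes:
    """Return the raw bytes of one named entry from a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(name)  # direct lookup by name; raises KeyError if absent

# Offline demonstration: build a stand-in archive whose entries are named
# "0", "1", ... like the event entries in the example above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    for i in range(200):
        archive.writestr(str(i), b"event-%d" % i)

data = read_entry(buffer.getvalue(), "100")
print(len(data), "bytes read for entry 100")
```

For the remote file, one would call read_entry(fetch(url), "100") with the URL given above. Looking the entry up by name avoids the loop over all entry names, but the whole file is still transferred before the entry is read.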

HEPMC to PROMC converter

A prototype converter, hepmc2promc (written in C++), converts a HEPMC file into a ProMC file. You should specify the HEPMC directory during the installation, as described in Installation. Look also at the example code in examples/hepmc2promc in the installation directory. Assuming that the converter is installed and you have run the setup.sh script, the syntax for conversion is:

hepmc2promc [input HEPMC file] [Output ProMC file] [Description]

You may have “logfile.txt” in the same directory with additional information (it will be attached to the ProMC file automatically). You can also do the opposite: convert a ProMC file to a HepMC file with the command “promc2hepmc”. Here is a small example:

wget  http://atlaswww.hep.anl.gov/asc/promc/download/HiggsTTbar.hepmc.gz
gunzip HiggsTTbar.hepmc.gz
hepmc2promc HiggsTTbar.hepmc HiggsTTbar.promc "Higgs plus ttbar at 14 TeV"

This will create the PROMC file HiggsTTbar.promc. To look at the events, run the browser:

java -jar browser_promc.jar HiggsTTbar.promc

Double click the event number (on the left) and you will see what a single event looks like.

If you have a problem with the converter, just get this ProMC file as:

wget  http://atlaswww.hep.anl.gov/asc/promc/download/HiggsTTbar.promc
java -jar browser_promc.jar HiggsTTbar.promc

Examples of data layouts

You can keep data using any complex layout. Here are a few examples of event layouts:

  1. ProMC.proto - This is the simplest data layout, which keeps only truth particle information. It is shipped with the default ProMC installation.
  2. ProMC.proto for Delphes - This is a more complicated data layout suitable for data and reconstructed MC. It shows how to include reconstructed objects (jets, photons, muons).
  3. ProMC.proto for Delphes plus jet constituents - This is an even more complicated layout. It shows how to include clusters (jet constituents) for each jet.

ProMC file manipulation

See the manual to learn about the ProMC commands. One can dump events, get information, and extract and save a smaller number of events into a new file. Example:

promc_info file.promc                        # get info
promc_dump file.promc                        # dump events
promc_extract   file.promc  new.promc 10     # save 10 events in a new file  new.promc
promc_log file.promc                         # extract log file (if attached)

A ProMC file is a simple zip file with ProtoBuffer messages. One can break a ProMC file into pieces as:

unzip file.promc

Then you can assemble the files back using the “zip” command. You can also merge ProMC files, add entries to them, etc.
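Since the container is a zip archive, the same round trip can be sketched with Python's standard zipfile module. The entry names “version” and “promc_description” and the numbered event entries follow the names that appear in the examples on this page; the payloads are placeholders, and the sketch does not check whether ProMC tools require a particular entry order or compression setting.

```python
import io
import zipfile

def split(archive_bytes: bytes):
    """Return the archive as an ordered list of (entry name, raw bytes) pairs."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return [(name, archive.read(name)) for name in archive.namelist()]

def assemble(entries) -> bytes:
    """Pack (entry name, raw bytes) pairs back into a zip archive, keeping their order."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, payload in entries:
            archive.writestr(name, payload)
    return buffer.getvalue()

# Stand-in content; real entries would hold ProtoBuffer messages.
original = [
    ("version", b"placeholder"),
    ("promc_description", b"toy file"),
    ("0", b"event 0"),
    ("1", b"event 1"),
]
rebuilt = split(assemble(original))
print([name for name, _ in rebuilt])
```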

Accessing data in PHP

You can access entries of ProMC files in PHP. Currently, you can read the list of entries and the version and description entries. This is an example; run it as “php test.php”:

<?php
$zip = zip_open("MadGraph5Pythia_wjets_mu0.promc");
if ($zip) {
    while ($zip_entry = zip_read($zip)) {
        echo "Name:               " . zip_entry_name($zip_entry) . "\n";
        echo "Actual Filesize:    " . zip_entry_filesize($zip_entry) . "\n";
        echo "Compressed Size:    " . zip_entry_compressedsize($zip_entry) . "\n";
        echo "Compression Method: " . zip_entry_compressionmethod($zip_entry) . "\n";
        if (zip_entry_name($zip_entry) == "promc_description" || zip_entry_name($zip_entry) == "version") {
            if (zip_entry_open($zip, $zip_entry, "r")) {
                echo "File Contents:\n";
                $buf = zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));
                $buf = preg_replace('/[^(\x20-\x7F)]*/','', $buf);
                echo "$buf\n";

                zip_entry_close($zip_entry);
            }
        }
        echo "\n";
 
    }
    zip_close($zip);
}
?>

Where and how one can use ProMC

The default version of ProMC has 2 records per event: “Event” (event information) and “Particles” (truth particle information).

You can also use a modified version which keeps reconstructed objects, such as “Jets”, “Electrons”, “Photons” and “Muons”. ProMC is used for Snowmass 2013 to keep Delphes fast simulation files (see the Snowmass web page). However, it is still at a prototype stage.
