Sorting Data

From HELIOS Digital DAQ
 
Latest revision as of 19:19, January 21, 2021

Overview

This is a guide to help sort large amounts of DigiOS data using the GEBMerge and GEBSort_nogeb procedures.

  • GEBMerge is used to time-stamp sort all digitizer / ioc data files of type gtd
  • EventBuilder is used to create events in the data and process the data if desired. Presently, it passes all of the header information to a ROOT TTree file. Trace data can also be passed if so chosen.
  • EventBuilder_trace is the same as EventBuilder (GEBSort_nogeb), but with trace data included


Both programs have *.chat files that define the parameters of the sort: for example, the number of pieces of data to sort, or the coincidence window defining an event.
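For orientation, a chat file is a plain-text list of keyword/value directives, one per line. The sketch below is purely illustrative: the keyword names are hypothetical placeholders, not the actual GEBSort directives; consult the working *.chat files in /music/helios2/DigiOS/working for the real ones.

```
# hypothetical chat-file sketch (keyword names are NOT the real GEBSort ones)
nevents      1000000    # number of pieces of data to sort
timewindow   1000       # coincidence window (timestamp ticks) defining an 'event'
```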

LCRC Computing

www.lcrc.anl.gov

i) Request an account
ii) Get an SSH key
iii) Log in to blues.lcrc -> puts you on the login node
iv) Set up the "soft" environment (loads software into the env)
v) Compile code in your home directory

Useful commands

[crhoffman@blogin1 rcnp]$ lcrc-qbank -q balance
Project                   Balance (corehours)
------------------------- -------------------
*HELIOS                                 50000
startup-crhoffman                        8000

For example, to change your default project from startup-joeuser to anl-mcs-support, you would type the following command on any of the login nodes:

[joeuser]$ lcrc-qbank -s default anl-mcs-support
Previous Default Project: startup-joeuser
New Default Project: anl-mcs-support

Transfer Data

- Using globusconnectpersonal
- Create a Globus account online (with ANL)
- Log in
- Run globusconnectpersonal locally
- Endpoint set up for Sonata: 022a3522-00d9-4fee-9235-7fe797e56736
- Downloaded on Sonata (requires Tcl/Tk)
- From the online Globus endpoint, log in to LCRC
- A script can be set up on the 'local' machine to 'push' data to LCRC
- Go to Transfer after logging in to globus.org
- Find the "local connect" endpoint, then choose the LCRC DTN
- There are "local connects" on both DigiOS [DigiOS DAQ] and Sonata
- Select the proper directories (you may have to log in to LCRC)
- Set options for the transfer, then click the arrow to start the transfer

Sorting Data on LCRC

- Using PROOF with ROOT selectors, e.g., on RCNP data
- How to run an interactive job

If you need to use a remote X11 display from within your job, add the -v DISPLAY option to qsub as well (this example requests 1 minute of time):

$ qsub -I -l walltime=00:01:00 -v DISPLAY
waiting for job 101.blues.lcrc.anl.gov to start
job 101.blues.lcrc.anl.gov ready

To quit your interactive job, run:

$ logout

Submitting a bash script with qsub and PBS. Here is an example script that I got working (rcnp_selector.pbs); a better description is in File:PBS Script 0.pdf:

#!/bin/sh

# Job name, resources (1 node x 16 cores), and a 5-minute wall-time limit
#PBS -N rcnp
#PBS -l nodes=1:ppn=16
#PBS -l walltime=0:05:00

# Start from the directory the job was submitted from, then move to the sort area
cd $PBS_O_WORKDIR
cd /home/crhoffman/rcnp_benchmark/rcnp
pwd
# Run ROOT in batch mode (-b) on the selector macro
root -l -b /home/crhoffman/rcnp_benchmark/rcnp/RunSelector_LCRC.C

The run

qsub rcnp_selector.pbs

This submits the job to the queue and assigns it an ID number as well; you must wait for it to run. There are various commands to check on it:

1664586.bmgt1.lcrc.anl.gov
showq [-i] [-u uname]

When finished, you get two files in your home directory:

rcnp.e1664448  rcnp.o1664448

To do: set up multiple instances of the same process ('swift'?) to run multiple run numbers at once.

Transferring Data

When a computer is directly connected to the LAN of the NAT box, simply use the rsync command with the IP address of the digios computer, 192.168.1.2:

rsync -avh --progress [email protected]:/location/of/data/* .

This reached ~100 MB/s with computer directly connected to either the NAT Box or the switch
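The rsync step can be wrapped in a tiny helper so that only the run number changes between transfers. This is just a sketch: the host, user, and remote path are the placeholders from the example above, the data_run file pattern follows the Example section, and the function prints the command (a dry run) rather than executing it.

```shell
# Sketch: build the rsync command for one run number (dry run: prints, does not execute).
# Host, user, and remote path are the placeholders from the example above.
pull_run() {
    run=$1
    src="[email protected]:/location/of/data/data_run_${run}.gtd_*"
    # -a archive mode, -v verbose, -h human-readable sizes, --progress per-file progress
    echo "rsync -avh --progress $src ."
}

pull_run 004
```

Dropping the echo would actually execute the transfer; the glob in the remote path is expanded on the remote side.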

To set up a globusconnect endpoint on the DigiOS DAQ computer directly:

www.globus.org
e3fe9032-34fc-49f1-a9a7-ea98bc99330b

Execute

globusconnect

After running globusconnect and connecting on DigiOS, you then need to log in to LCRC to be able to transfer the data.

Example

  • Go to /music/helios2/DigiOS/working
  • Check that the data file(s) are in the data directory
ls -ltr ../data
-r--r--r-- 1 crhoffman helios 21250344 Oct  6 11:12 data_run_004.gtd_000_0103
-r--r--r-- 1 crhoffman helios 1801997700 Oct  6 11:15 data_run_004.gtd_000_0102
  • Merge run 004
./gebmerge 004
RUN 004: GEBMerge started at Tue Oct 11 16:07:03 CDT 2016
RUN 004: GEBMerge DONE at Tue Oct 11 16:07:40 CDT 2016

If you get an error about a missing GEBMerge, go to the GEBSort directory and type

make GEBMerge
  • Next, run GEBSort_nogeb to generate the root file
./gebsort 004
GEBSort started sorting run 004 at Tue Oct 11 16:11:16 CDT 2016
from GEBSort.cxx on Oct  7 2016 15:45:02
...tons of other crap...
done saving rootfile "/music/helios2/DigiOS/root_data/run004.root
read 611072 events; Pars.beta=0.0000; time since last update 0 minutes
CommandFileName="GEBSort.command"

sorted timestamp range 594573853956-->7526345374265: 6931771520309
that is 69317.7 seconds or 19h15m17s
^^^^^ any timestamp jumps will upset this accounting

hit statistics per type
14 GEB_TYPE_DGS             1957499 ;     28.24 Hz 
read a total of              1957499 ; header/payloads

Duration of sort in real time 45.02 s
Sort time ratio (real t / dTS) SMALLER IS BETTER:    0.00065

boniva sancta! ...GEBSort (unexpectedly) did not crash!

** GEBSort is done at Tue Oct 11 16:12:03 2016

GEBSort DONE at Tue Oct 11 16:12:03 CDT 2016
  • Now you can do what you want with the root file (note: you may have to run ssetup root_v5.32.00 for root to work), e.g.,
root -l ../root_data/run004.root 
tree->Process("../codes/Energy.C+")

This will run a small selector to generate E vs. Z [figure below]. Note that in this case E was determined from the difference between the pre-/post-rise sums as computed by the firmware, i.e., not using the trace data.

Energy vs. Z as generated by scripts in code directory using rootfile from GEBSort_nogeb.

Locations of Working Sort Codes

As of now, I (CRH) have been using a set of codes on malaguena.onenet. I log in using my own username (not helios@phy), and I recommend this for everyone. If we need to change some permissions, or get people into the same group, we can do that, but it would be confusing to have everyone log in as helios.

The current working directory for the GEBMerge and GEBSort_nogeb codes is /music/helios2/DigiOS/GEBSort

Other directories in music/helios2/DigiOS include

codes  data  GEBSort  merged_data  root_data  working

As you will see below, running GEBMerge and GEBSort_nogeb can be involved; hence, the working directory contains a few *.sh scripts to help run them. For example, to merge run 004 in the data directory, type in working:

gebmerge.sh 004

Please note that the file locations are hardcoded at this point so change them if you intend to play around with things!

Running GEBMerge

It is run in the following way (NOTE: a map.dat file is required in the execution directory, even if it is empty??):

GEBMerge merge_chat_file.chat merged_output_file.gtd data_file(s).gtd

The merge_chat_file.chat includes a number of parameters defining the sort. My (CRH) current working one is in /music/helios2/DigiOS/working

Also, a simpler way to run the merge is to use the gebmerge.sh script:

gebmerge.sh 004

to merge data run 004 in the data directory.
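The contents of gebmerge.sh are not shown on this page; the sketch below is a guess at its general shape, following the GEBMerge invocation above. The chat-file and output names are assumptions, and the command is echoed rather than executed.

```shell
# Hypothetical sketch of a gebmerge.sh-style wrapper (NOT the actual script).
# Chat file and output path are assumed names; locations are hardcoded, as noted above.
gebmerge_run() {
    run=$1
    chat="GEBMerge.chat"                              # assumed merge chat file
    out="../merged_data/GEBMerged_run${run}.gtd"      # assumed merged-output name
    # Usage pattern: GEBMerge merge_chat_file.chat merged_output_file.gtd data_file(s).gtd
    echo GEBMerge "$chat" "$out" "../data/data_run_${run}.gtd_*"
}

gebmerge_run 004
```

Removing the echo (and unquoting the data-file glob) would run the real merge.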

Running GEBSort_nogeb

Please note: if you modify GEBSort, you want to make GEBSort_nogeb (NO Global Event Builder); otherwise it will fail, because plain GEBSort is meant to receive real-time data from Gretina (I think).

GEBSort_nogeb is run by

GEBSort_nogeb -input input_type [input_data_file.gtd] -rootfile output_root_file.root rootfile_option -chat chat_file.chat
  • input_type -> disk : to indicate from file (not online)
  • input_data_file.gtd : can be a merged or single-IOC file (I believe)
  • rootfile_option -> RECREATE or UPDATE : whether to create new or add to the root file specified
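Putting the pieces together, a concrete invocation for run 004 might look like the line below. The file names are assumptions that follow the template above; the command is echoed here as a dry run.

```shell
# Illustrative GEBSort_nogeb invocation for run 004 (file names are assumptions):
# read the merged file from disk, write a fresh ROOT file, use the local chat file.
echo GEBSort_nogeb -input disk GEBMerged_run004.gtd \
     -rootfile run004.root RECREATE -chat GEBSort.chat
```

Drop the echo to actually run the sort.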

Other files required by GEBSort_nogeb (all to be placed in working directory)

  • map.dat : gives the IDs and types of channels (it can be an empty file, but it must be present, it seems)

GEBSort_nogeb OUTPUT

The present version of the code uses the bin_rcnp.* files to generate a ROOT TTree with the information from the header. There are still some issues with the full readout of the header, but all information for timestamps and energy is there now (the extra items are for better P/Z corrections, etc.). The ROOT TTree should be nearly identical to the parameters given in the typedef struct DGSEVENT, which is found in GTMerge.h.