Sorting Data
Overview
This is a guide to help sort large amounts of DigiOS data using the GEBMerge and GEBSort_nogeb procedures.
- GEBMerge sorts all digitizer / IOC data files of type gtd by timestamp.
- GEBSort_nogeb builds events from the data and, if desired, processes them. Presently, it passes all of the header information to a ROOT TTree file. Trace data can also be passed if so chosen.
- GEBSort_nogeb_trace is the same as GEBSort_nogeb, but also passes the trace data.
Both programs have *.chat files that define the parameters of the sort, for example the number of pieces of data to sort or the coincidence window defining an event.
LCRC Computing
www.lcrc.anl.gov
i) Request an account
ii) Get an SSH key
iii) Log in to blues.lcrc (this puts you on the login node)
iv) Set up the "soft" environment (loads software into the environment)
v) Compile the code in your home directory
Useful commands
[crhoffman@blogin1 rcnp]$ lcrc-qbank -q balance
Project                   Balance (corehours)
------------------------- -------------------
*HELIOS                   50000
startup-crhoffman         8000
For example, to change your default project from startup-joeuser to anl-mcs-support, you would type the following command on any of the login nodes:
[joeuser]$ lcrc-qbank -s default anl-mcs-support
Previous Default Project: startup-joeuser
New Default Project: anl-mcs-support
transfer data
- using globusconnectpersonal
- create globus account online w/ ANL LOGIN
- run globusconnectpersonal locally
- setting up for Sonata 022a3522-00d9-4fee-9235-7fe797e56736
- downloaded on Sonata, requires Tcl/Tk
- from online globus point and login to LCRC
- can setup script on 'local' machine to 'push' data to LCRC
- go to transfer after logging into globus.org
- Find "local connect" then choose LCRC DTN
- "local connects" on both DigiOS [DigiOS DAQ] and Sonata
- select proper directories (may have to login to LCRC)
- click options for transfer, click arrow to start transfer
sorting data on LCRC
- using PROOF with ROOT selectors, e.g., rcnp data
- how to run an interactive job
If you need to use a remote X11 display from within your job, add the -v DISPLAY option to qsub as well (this is for 1 minute of time):
$ qsub -I -l walltime=00:01:00 -v DISPLAY
waiting for job 101.blues.lcrc.anl.gov to start
job 101.blues.lcrc.anl.gov ready
To quit your interactive job, run:
$ logout
To submit a batch script with qsub, here is the example PBS script that I got working (rcnp_selector.pbs); a better description is linked at File:PBS Script 0.pdf:
#!/bin/sh
#PBS -N rcnp
#PBS -l nodes=1:ppn=16
#PBS -l walltime=0:05:00
cd $PBS_O_WORKDIR
cd /home/crhoffman/rcnp_benchmark/rcnp
pwd
root -l -b /home/crhoffman/rcnp_benchmark/rcnp/RunSelector_LCRC.C
then run
qsub rcnp_selector.pbs
This submits the job to the queue and prints a job ID like the one below. The job must then wait for its turn to run; commands such as showq can be used to check on it.
1664586.bmgt1.lcrc.anl.gov
showq [-i] [-u uname]
When the job has finished, you get two files in the home directory (the job's stderr and stdout):
rcnp.e1664448 rcnp.o1664448
to do ... setting up multiple instances of the same process (swift), multiple run numbers at once
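Until the Swift setup is worked out, a plain shell loop is one simple stand-in for submitting several run numbers at once. The sketch below is only a dry run that prints the qsub commands. It assumes (hypothetically) that rcnp_selector.pbs has been modified to read the run number from a RUN environment variable; the present script does not do this. The -v and -N options are standard qsub options.

```shell
#!/bin/bash
# Sketch: one qsub submission per run number (plain loop, NOT Swift).
# ASSUMPTION: rcnp_selector.pbs would be changed to pick up the run number
# from the RUN environment variable passed with qsub -v.
submit_runs() {
    for run in "$@"; do
        # Dry run: print each command. Remove the "echo" to really submit.
        echo qsub -v "RUN=${run}" -N "rcnp_${run}" rcnp_selector.pbs
    done
}

# Example: three runs, three independent jobs in the queue.
submit_runs 004 005 006
```

Each job then gets its own ID and its own rcnp_NNN.e / rcnp_NNN.o output files, so the runs can be checked independently with showq.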
to do...
Transferring Data
When a computer is directly connected to the LAN of the NAT Box, simply use the rsync command with the IP address of the DigiOS computer, 192.168.1.2 (replace user with your account name on that machine):
rsync -avh --progress user@192.168.1.2:/location/of/data/* .
This reached ~100 MB/s with the computer directly connected to either the NAT Box or the switch.
Trying to setup a globusconnect point on the DigiOS DAQ computer directly
www.globus.org
e3fe9032-34fc-49f1-a9a7-ea98bc99330b
Execute
globusconnect
This runs globusconnect and connects the DigiOS endpoint. You then need to log in to the LCRC to be able to transfer the data.
Example
- Go to /music/helios2/DigiOS/working
- Check that the data file(s) are in the data directory
ls -ltr ../data
-r--r--r-- 1 crhoffman helios   21250344 Oct 6 11:12 data_run_004.gtd_000_0103
-r--r--r-- 1 crhoffman helios 1801997700 Oct 6 11:15 data_run_004.gtd_000_0102
- Merge run 004
./gebmerge 004
RUN 004: GEBMerge started at Tue Oct 11 16:07:03 CDT 2016
RUN 004: GEBMerge DONE at Tue Oct 11 16:07:40 CDT 2016
If you get an error about no GEBMerge, go to the GEBSort directory and type
make GEBMerge
- Next run GEBSort_nogeb to generate root file
./gebsort 004
GEBSort started sorting run 004 at Tue Oct 11 16:11:16 CDT 2016
from GEBSort.cxx on Oct 7 2016 15:45:02
...tons of other crap...
done saving rootfile "/music/helios2/DigiOS/root_data/run004.root
read 611072 events; Pars.beta=0.0000; time since last update 0 minutes
CommandFileName="GEBSort.command"
sorted timestamp range 594573853956-->7526345374265: 6931771520309
that is 69317.7 seconds or 19h15m17s
^^^^^ any timestamp jumps will upset this accounting
hit statistics per type
14 GEB_TYPE_DGS 1957499 ; 28.24 Hz
read a total of 1957499 ; header/payloads
Duration of sort in real time 45.02 s
Sort time ratio (real t / dTS) SMALLER IS BETTER: 0.00065
boniva sancta! ...GEBSort (unexpectedly) did not crash!
** GEBSort is done at Tue Oct 11 16:12:03 2016
GEBSort DONE at Tue Oct 11 16:12:03 CDT 2016
- Now you can do what you want with the root file, (note: may have to run ssetup root_v5.32.00 for root to work), e.g.,
root -l ../root_data/run004.root
tree->Process("../codes/Energy.C+")
This runs a small selector to generate E vs Z [figure below]. Note that in this case E was determined from the difference between the pre- and post-rise sums as computed by the firmware, i.e., not using the trace data.
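As a cross-check, the timing summary printed by GEBSort in the example above can be redone by hand. The sketch below assumes one timestamp tick is 10 ns; this is inferred from the log itself (6931771520309 ticks reported as 69317.7 seconds), not taken from the GEBSort source. All input numbers are copied from the log.

```shell
#!/bin/bash
# Values copied from the GEBSort log for run 004.
ts_first=594573853956
ts_last=7526345374265
hits=1957499        # GEB_TYPE_DGS hits
real_s=45.02        # duration of the sort in real time, seconds

# ASSUMPTION: 10 ns per timestamp tick, i.e. 1e8 ticks per second.
ticks_per_s=100000000

dts=$(( ts_last - ts_first ))      # timestamp span in ticks
secs=$(( dts / ticks_per_s ))      # whole seconds (integer division)
hms=$(printf '%dh%02dm%02ds' $(( secs / 3600 )) $(( secs % 3600 / 60 )) $(( secs % 60 )))

# Floating-point quantities are computed with awk.
rate=$(awk -v n="$hits" -v d="$dts" -v t="$ticks_per_s" 'BEGIN { printf "%.2f", n / (d / t) }')
ratio=$(awk -v r="$real_s" -v d="$dts" -v t="$ticks_per_s" 'BEGIN { printf "%.5f", r / (d / t) }')

echo "dTS = $dts ticks = $secs s = $hms"
echo "hit rate = $rate Hz, sort time ratio = $ratio"
```

This reproduces the 19h15m17s, 28.24 Hz and 0.00065 quoted in the log. As the log itself warns, any timestamp jump in the data will upset this accounting.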
Locations of Working Sort Codes
As of now, I (CRH) have been using a set of codes on malaguena.onenet. I log in using my own username (not helios@phy), and I recommend this for everyone. If we need to change some permissions, or get people into the same group, we can do it. But it will be confusing to have everyone as helios.
The current working directory for the GEBMerge and GEBSort_nogeb codes is /music/helios2/DigiOS/GEBSort
Other directories in /music/helios2/DigiOS include
codes data GEBSort merged_data root_data working
Running GEBMerge and GEBSort_nogeb can be involved (see below), so the working directory contains a few *.sh scripts to help run them. For example, to merge run 004 from the data directory, in working type
gebmerge.sh 004
Please note that the file locations are hardcoded at this point, so change them if you intend to play around with things!
Running GEBMerge
GEBMerge is run in the following way (NOTE: a map.dat file appears to be required in the execution directory, even if it is empty):
GEBMerge merge_chat_file.chat merged_output_file.gtd data_file(s).gtd
The merge_chat_file.chat includes a number of parameters defining the sort. My (CRH) current working one is in /music/helios2/DigiOS/working
A simpler way to run the merge is to use the gebmerge.sh script, e.g.,
gebmerge.sh 004
to merge run 004 from the data directory.
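For orientation, here is a rough sketch of what a gebmerge.sh-style wrapper has to do: turn a run number into the GEBMerge command line given above. This is not the actual script. The chat file name (GEBMerge.chat), the merged output name (merged_run_NNN.gtd) and the relative paths are assumptions for illustration; only the data file pattern data_run_NNN.gtd_* follows the listing in the Example section. It prints the command rather than running it.

```shell
#!/bin/bash
# Sketch of a gebmerge.sh-style wrapper (NOT the real script).
# ASSUMPTIONS (hypothetical): GEBMerge.chat, merged_run_NNN.gtd, and the
# relative layout ../data, ../merged_data, ../GEBSort.
build_merge_cmd() {
    local run data_dir out_dir
    # Zero-pad the run number to three digits (4 -> 004, 008 stays 008).
    run=$(awk -v r="$1" 'BEGIN { printf "%03d", r }')
    data_dir=${DATA_DIR:-../data}
    out_dir=${MERGED_DIR:-../merged_data}
    # Argument order: chat file, merged output file, then the input file(s).
    echo "../GEBSort/GEBMerge GEBMerge.chat ${out_dir}/merged_run_${run}.gtd ${data_dir}/data_run_${run}.gtd_*"
}

# Dry run for run 004: prints the command instead of executing it.
build_merge_cmd 004
```

To run for real, execute the printed command from a directory that contains map.dat. Keeping the directories in variables (DATA_DIR, MERGED_DIR) is one way around the hardcoded file locations mentioned earlier.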
Running GEBSort_nogeb
Please note: if you modify GEBSort, rebuild GEBSort_nogeb (NO Global Event Builder). The plain GEBSort will fail here because it is meant to get real-time data from GRETINA (I think).
GEBSort_nogeb is run by
GEBSort_nogeb -input input_type [input_data_file.gtd] -rootfile output_root_file.root rootfile_option -chat chat_file.chat
- input_type -> disk : to indicate from file (not online)
- input_data_file.gtd : can be merged or single-IOC file (I believe)
- rootfile_option -> RECREATE or UPDATE : whether to create new or add to the root file specified
Other files required by GEBSort_nogeb (all to be placed in working directory)
- map.dat : gives IDs and types of channels (can be an empty file, but it seems it must be there)
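The pieces above can be put together in a small helper. This is a sketch only: it assembles the GEBSort_nogeb command line from the options listed above and makes sure an (empty) map.dat exists in the working directory. The chat file name GEBSort.chat and the example file names are assumptions, and the command is printed rather than executed.

```shell
#!/bin/bash
# Sketch: assemble a GEBSort_nogeb command line (dry run).
# ASSUMPTION: the chat file is called GEBSort.chat unless CHAT_FILE is set.
build_sort_cmd() {
    # $1 = input .gtd file, $2 = output ROOT file, $3 = RECREATE (default) or UPDATE
    local mode=${3:-RECREATE}
    case "$mode" in
        RECREATE|UPDATE) ;;
        *) echo "rootfile_option must be RECREATE or UPDATE, got '$mode'" >&2
           return 1 ;;
    esac
    echo "GEBSort_nogeb -input disk $1 -rootfile $2 $mode -chat ${CHAT_FILE:-GEBSort.chat}"
}

# Create an empty map.dat in the given directory if none exists yet.
ensure_map_dat() {
    [ -e "$1/map.dat" ] || : > "$1/map.dat"
}

# Example in a scratch directory (hypothetical file names).
workdir=$(mktemp -d)
ensure_map_dat "$workdir"
build_sort_cmd merged_run_004.gtd run004.root RECREATE
```

Checking rootfile_option up front avoids starting a long sort with a mistyped mode.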
GEBSort_nogeb OUTPUT
The present version of the code uses the bin_rcnp.* files to generate a ROOT TTree with the information from the header. There are still some issues with the full readout of the header, but all information for timestamps and energy is there now (the extra stuff is for better P/Z corrections, etc.). The contents of the ROOT TTree should be nearly identical to the parameters given in the typedef struct DGSEVENT, which is found in GTMerge.h