HPC/Applications/g09

Revision as of 18:27, August 12, 2013

Introduction

Gaussian is an electronic structure program used by chemists, chemical engineers, biochemists, physicists and others for research in established and emerging areas of chemical interest.

Starting from the basic laws of quantum mechanics, Gaussian predicts the energies, molecular structures, and vibrational frequencies of molecular systems, along with numerous molecular properties derived from these basic computation types. It can be used to study molecules and reactions under a wide range of conditions, including both stable species and compounds that are difficult or impossible to observe experimentally, such as short-lived intermediates and transition structures. This article introduces several of its new and enhanced features.

Version note

The version on Carbon is a 64-bit build with full support for shared-memory and Linda parallelization. At the time of writing, it is Gaussian 09: EM64L-G09RevC.01 from 23-Sep-2011.

Status

Our license agreement with Gaussian, Inc. restricts the use of Gaussian applications to Argonne employees.

Job script

  • Job script template: $G09_HOME/g09.job
  • This script shows how to use the preprocessing script g09preprocess.

Usage

Copy the template script and use it in one of two ways:

  • edit for each job as needed, or
  • adapt the script for basic needs of several jobs, and individualize it by the PBS job name.

The job name can be up to 15 characters long and should not contain unusual characters. Set it on the qsub command line, along with your project name and the walltime limit:

qsub -N foo [-l walltime=hhh:mm:ss]  g09.job
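The job-name rule above can be checked before submission. The following is a minimal sketch, not part of the g09 template: the function name valid_job_name is hypothetical, and it assumes that "unusual characters" means anything other than letters, digits, underscore, dot, and hyphen, and that the name should start with a letter.

```shell
#!/bin/bash
# Hypothetical helper (not part of $G09_HOME/g09.job): check a PBS job name
# before passing it to "qsub -N".
# Assumptions: allowed characters are letters, digits, "_", "." and "-",
# and the first character is a letter.
valid_job_name() {
    local name=$1
    # Limit stated on this page: at most 15 characters (and not empty).
    [ "${#name}" -ge 1 ] && [ "${#name}" -le 15 ] || return 1
    # First character must be a letter (assumption).
    case $name in
        [A-Za-z]*) ;;
        *) return 1 ;;
    esac
    # Reject any character outside the assumed safe set.
    case $name in
        *[!A-Za-z0-9_.-]*) return 1 ;;
    esac
    return 0
}

# Example use (commented out so that sourcing this file submits nothing):
#   name=foo
#   valid_job_name "$name" && qsub -N "$name" -l walltime=1:00:00 g09.job
```

The check only guards against names that are too long or contain characters likely to cause trouble; it does not replace the scheduler's own validation.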

Operation

Upon job execution, the script reads the input file, and the g09preprocess script interprets and modifies your input for use under PBS on Carbon, as follows:

  • Link0-commands %NProcShared=ppn and %LindaWorkers=n123,n124,n125,... are inserted automatically.
  • Checkpoint files named in %chk directives are identified and copied into the compute node's job-specific $TMPDIR. The %chk specification is changed to include $TMPDIR. Note that this will be echoed in the Gaussian output as follows:
 %chk=/tmp/12345.sched1.carboncluster/test0420.chk
Chk-file processing is performed only for files named without a path, i.e., files in the current directory; to skip processing, specify a relative or absolute path, such as %chk=./name.chk
  • At the end of the job, chk-files in $TMPDIR are moved to $PBS_O_WORKDIR.
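To make the preprocessing steps above more concrete, here is a rough sketch of the two input transformations. It is not the actual g09preprocess script, whose source is not shown on this page. The function names link0_header and rewrite_chk are hypothetical, and the sketch assumes that the node list comes from a PBS-style node file (one line per core, as in $PBS_NODEFILE), that host names are shortened at the first dot, and that %LindaWorkers is only emitted for multi-node jobs.

```shell
#!/bin/bash
# Sketch only; NOT the real g09preprocess.

# Print Link0 lines derived from a PBS-style node file (one line per core).
# Assumptions: ppn is the number of entries for the first host; the domain
# part of each host name is dropped; %LindaWorkers is printed only when
# more than one node is present.
link0_header() {
    local nodefile=$1
    awk '
        NF == 0 { next }
        {
            host = $1
            sub(/\..*$/, "", host)
            if (!(host in count)) order[++n] = host
            count[host]++
        }
        END {
            if (n == 0) exit 1
            printf "%%NProcShared=%d\n", count[order[1]]
            if (n > 1) {
                list = order[1]
                for (i = 2; i <= n; i++) list = list "," order[i]
                printf "%%LindaWorkers=%s\n", list
            }
        }
    ' "$nodefile"
}

# Filter stdin to stdout, prefixing path-less %chk file names with a
# scratch directory. Names containing "/" (e.g. ./name.chk) pass through
# unchanged, matching the skip rule described above.
# Assumption: the directory name contains no "|" or "&" characters.
rewrite_chk() {
    local tmpdir=$1
    sed -E "s|^(%[Cc][Hh][Kk]=)([^/]+)\$|\1${tmpdir}/\2|"
}

# Example use inside a job (commented out):
#   link0_header "$PBS_NODEFILE"
#   rewrite_chk "$TMPDIR" < input.com
```

Copying existing chk-files into $TMPDIR before the run and moving them back to $PBS_O_WORKDIR afterwards is left out of the sketch.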

Notes on Linda

Dynamic use

Gaussian will echo its input line by line in its log file. This will look like:

input
%mem=20mw
#p MP2/6-311G(2df,p) force symm=loose MaxDisk=250000000 test
…
log file (standard output)
Entering Gaussian System, Link 0=g09
Initial command:
/opt/soft/g09-D.01.x86_64-1/g09/l1.exe "/tmp/341710.sched1.carboncluster/Gau-32350.inp" -scrdir="/tmp/341710.sched1.carboncluster/"
Entering Link 1 = /opt/soft/g09-D.01.x86_64-1/g09/l1.exe PID=     32351.
 
Copyright (c) 1988,1990,1992,1993,1995,1998,2003,2009,2013,
           Gaussian, Inc.  All Rights Reserved.
 
This is part of the Gaussian(R) 09 program.  It is based on
…
Cite this work as:
Gaussian 09, Revision D.01,
…
 ******************************************
Gaussian 09:  EM64L-G09RevD.01 24-Apr-2013
                1-Aug-2013 
******************************************
%NProcShared=8
Will use up to    8 processors via shared memory.
%LindaWorkers=n994,n992
%mem=20mw
SetLPE:  input flags="-v -opt "Tsnet.Node.lindarsharg: ssh" "
SetLPE:    new flags="-v -opt "Tsnet.Node.lindarsharg: ssh"  -nodelist 'n994.carboncluster n992.carboncluster'"
Will use up to    2 processors via Linda.
--------------------------------------------------------------------
#p MP2/6-311G force …
--------------------------------------------------------------------
The last line will be specific to your Gaussian route section. So far, so good.
  • I highly recommend using the #p ("profuse") flag to better track progress through Gaussian's link stages.
  • The term processors is used inconsistently here. For shared memory, it means cores, but for Linda it means nodes.
  • You can track the use of Linda
  • g09 may decide to not use Linda after all. A note like this will appear in the log file:
PrsmSu:  requested number of processors reduced to:   6 ShMem   1 Linda.
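One way to check how a job actually used its processors is to pull the relevant lines out of the log file. The sketch below relies only on the log wording shown above; the function names parallel_summary and linda_dropped are hypothetical, and treating "1 Linda" in a PrsmSu line as "Linda not used" follows the note above rather than any Gaussian documentation.

```shell
#!/bin/bash
# Hypothetical helpers for inspecting a Gaussian log file.

# Print the lines that report the requested and the reduced parallelism.
parallel_summary() {
    local log=$1
    grep -E 'Will use up to +[0-9]+ processors via|PrsmSu: +requested number of processors reduced to' "$log"
}

# Succeed if some link reduced the Linda count to 1, which this page
# describes as g09 deciding not to use Linda after all.
linda_dropped() {
    local log=$1
    grep -Eq 'PrsmSu:.*reduced to: +[0-9]+ ShMem +1 Linda' "$log"
}

# Example use (commented out; job.log is a placeholder file name):
#   parallel_summary job.log
#   linda_dropped job.log && echo "Linda was dropped in at least one link"
```

Both functions return a non-zero status when nothing matches, so they can be used directly in shell conditionals.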

Error messages

When running jobs on more than one node (i.e., using Linda), you may see messages like these in a job's error stream:

eval server 0 on n019.carboncluster has dropped it's connection.
eval server 0 on n019.carboncluster has dropped it's connection.
subprocess pid = 11505 has exited. status = 0x0000, id = 0, state = 17. command was /opt/soft/g09-D.01.x86_64-1/g09/linda8.2/opteron …
died after signing in successfully
eval server 0 on n019.carboncluster has dropped it's connection.
…

These messages are not indicative of a problem. They indicate that the Linda work in a Gaussian link is finished, and that Gaussian is continuing with a new link.
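Because these messages can crowd out real problems, it may help to filter them when reading a job's error stream. The following is a minimal sketch: the function name filter_linda_noise is hypothetical, and the patterns are taken only from the sample messages above, so other harmless variants may still get through.

```shell
#!/bin/bash
# Hypothetical filter: copy stdin to stdout, dropping the benign Linda
# messages quoted on this page. Only subprocess exits with status 0x0000
# are dropped, so non-zero exit statuses remain visible.
filter_linda_noise() {
    grep -Ev \
        -e '^ *eval server [0-9]+ on .* has dropped it.s connection\. *$' \
        -e '^ *subprocess pid = [0-9]+ has exited\. status = 0x0000,' \
        -e '^ *died after signing in successfully *$'
}

# Example use (commented out; the file name is a placeholder):
#   filter_linda_noise < foo.e341710
```

Note that grep exits with status 1 when every line was filtered out, which matters in scripts run with set -e.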

  • The environment variable GAUSS_FLAGS (as set in the g09 modulefile) is now configured to include the flag -v, to better trace Linda operations on worker nodes. Additional lines like the following will show up in a job's error stream:
ntsnet: using executable file /opt/soft/g09-D.01.x86_64-1/g09/linda-exe/l302.exel
ntsnet: trying to schedule 1 worker
ntsnet: scheduled a total of 1 worker
ntsnet: starting master process on n994.carboncluster
ntsnet: starting 1 worker on n992.carboncluster
This example shows how one link within a job submitted with #PBS -l nodes=2:ppn=8 is executed under Linda.
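The ntsnet trace can also be condensed into the list of hosts that took part in Linda links. This is a sketch based solely on the sample lines above: the function name ntsnet_hosts is hypothetical, and the plural form "workers" is an assumption about how ntsnet words the message for more than one worker.

```shell
#!/bin/bash
# Hypothetical helper: list the distinct hosts on which ntsnet started a
# master or worker process. Reads the named files, or stdin if none given.
ntsnet_hosts() {
    sed -nE 's/^ *ntsnet: starting (master process|[0-9]+ workers?) on ([^ ]+) *$/\2/p' "$@" | sort -u
}

# Example use (commented out; the file name is a placeholder):
#   ntsnet_hosts foo.e341710
```

For a two-node job, the output should name both nodes if Linda was actually used in at least one link.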