HPC/Applications/gaussian: Difference between revisions
Latest revision as of 21:11, January 22, 2025
Introduction
Gaussian is an electronic structure program used by chemists, chemical engineers, biochemists, physicists and others for research in established and emerging areas of chemical interest.
Starting from the basic laws of quantum mechanics, Gaussian predicts the energies, molecular structures, and vibrational frequencies of molecular systems, along with numerous molecular properties derived from these basic computation types. It can be used to study molecules and reactions under a wide range of conditions, including both stable species and compounds which are difficult or impossible to observe experimentally such as short-lived intermediates and transition structures. This article introduces several of its new and enhanced features.
User access
Argonne's license agreement with Gaussian, Inc. restricts the use of Gaussian applications to Argonne employees.
Versions
The Gaussian versions on Carbon are 64-bit versions with full support for shared memory and Linda parallelization.
module -l avail gaussian
gaussian/09/09.D.01.x86_64-3 2015/12/02 17:25:23
gaussian/16/16-A.03-1 2017/10/16 12:42:41
gaussian/16/16-A.03-2 2017/10/16 12:35:29
gaussian/16/16-B.01-1 2019/05/28 16:56:37
gaussian/16/16-C.01-1 2019/08/20 10:03:31
The last version is the default at present, obtained by the (recommended) version-free module command:
module load gaussian
Job script
- Inspect the job script template (after having loaded the module) as follows:
$GAUSSIAN_HOME/g16.job
Usage
Copy the template script and use it in one of two ways:
- edit for each job as needed, or
- adapt the script for basic needs of several jobs, and individualize it by the PBS job name.
The job name can be up to 15 characters long and should not contain unusual characters. Set it, along with your project name and the walltime limit, on the qsub command line:
qsub -N foo [-l walltime=hhh:mm:ss] g16.job
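The job-name rule above can be checked before submission. The following is a minimal sketch, not part of the g16.job template: the function names valid_job_name and submit_g16 are hypothetical, and "unusual characters" is interpreted here as anything other than letters, digits, underscore, dot, or hyphen, which is an assumption.

```bash
#!/bin/bash
# Hypothetical helpers, not shipped with the gaussian module.

# Succeeds if the name is 1 to 15 characters long and conservative in content.
# The allowed character set is an assumption about what counts as "unusual".
valid_job_name() {
    local name=$1
    # Reject empty names and names longer than 15 characters.
    [ -n "$name" ] && [ "${#name}" -le 15 ] || return 1
    # Require a leading letter, then letters, digits, underscore, dot, hyphen.
    [[ $name =~ ^[A-Za-z][A-Za-z0-9_.-]*$ ]]
}

# Submit the shared g16.job under the given job name.
# The default walltime is only a placeholder.
submit_g16() {
    local name=$1 walltime=${2:-1:00:00}
    if ! valid_job_name "$name"; then
        echo "invalid job name: $name" >&2
        return 1
    fi
    qsub -N "$name" -l walltime="$walltime" g16.job
}
```

A call such as submit_g16 foo 12:00:00 would then run the same qsub command as shown above, but only after the name has passed the check.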
Operation
Upon job execution, the script reads your Gaussian input file and modifies it for use under PBS on Carbon, as follows:
- Link0-commands for parallelization are inserted automatically.
Tool for parsing Gaussian output
Much of the output from Gaussian is available in plaintext format in the log file. Many users write grep-like utilities on their own to extract such information.
I wrote one as well, and it is available alongside each gaussian module.
The tool grew out of my own needs.
Contact me if you encounter a bug, need additional functionality, or simply find it useful.
Example
gauss-parse *out
# d3h_dec.out
# Energies (H)
#it E (H) dE (eV) dE/step (eV)
1 -1421.2018695 0.00000000 0.00000000
2 -1421.28679099 -2.31083263 -2.31083263
3 -1421.38689495 -5.03480153 -2.72396890
…
# job end
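For comparison, here is a minimal sketch of the kind of grep-like utility mentioned earlier. It is not the gauss-parse implementation. It assumes that the log file contains the usual "SCF Done:" lines with the energy following the "=" sign, and it uses a rounded conversion factor of 27.211386 eV per Hartree. The function name scf_energies is hypothetical.

```bash
#!/bin/bash
# Hypothetical stand-in for a hand-written extraction script; NOT gauss-parse.
# Prints: step number, SCF energy (Hartree), energy change since step 1 (eV).
scf_energies() {
    awk '
        /SCF Done:/ {
            # The energy is assumed to be the field right after the "=" sign.
            for (i = 1; i < NF; i++)
                if ($i == "=") { e = $(i + 1); break }
            n++
            if (n == 1) e0 = e
            # Rounded conversion factor: 1 Hartree = 27.211386 eV.
            printf "%d %s %.6f\n", n, e, (e - e0) * 27.211386
        }
    ' "$@"
}
```

Calling scf_energies on one log file prints one line per SCF energy found. Unlike gauss-parse, this sketch does not handle multiple input files separately, forces, or geometries.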
Documentation
To inspect the tool documentation, run:
gauss-parse --help
gauss-parse --man
To give an idea of scope, here is the output upon --help:
Usage:
gauss-parse [options] [gaussian-output-file ...]
Options:
--energy Output energies (default).
--spectrum Output eigenvalues.
--diis Print DIIS errors.
--geometry Output geometries (into separate files unless --join is given).
--charge Output Mulliken charges. When --Spin is used, output spin densities instead.
--force Output forces.
--freq Output frequencies.
--archive Expand archive entry [into plain text, with line breaks]
--last Print output for last frame only.
For --energy, --geometry, or --force modes only:
--pick frame_number
Output only the given frame number (counted from 1).
--basename name
Select file name for output (an embedded %d will be increased for each input file)
--join Output geometry into a multi-frame xyz-format file.
--input Pick input orientation
--std Pick standard orientation
--Spin Use spin occupancy as scalar quantity (default: charge)
--units unit
Convert energy and force units (default: no conversion).
…
Notes on Linda
Gaussian is parallelized in two complementary ways:
- on a single node across its processors using shared memory, and
- across different nodes via a subset of the Linda parallelization language – http://gaussian.com/link0/?tabid=1#Linda
Incomplete node use
Gaussian/Linda may decide not to use the entirety of the nodes made available to it by the preprocess stage in the PBS job file. This section describes how to determine when this is the case and what to do about it.
Consider the following input:
%mem=20mw
#p MP2/6-311G(2df,p) force symm=loose MaxDisk=250000000 test
…
Running this through the preprocessor in the job file produces the input that Gaussian will see, with the node names allocated by PBS inserted:
%NProcShared=32
%LindaWorkers=n456,n457
%mem=20mw
#p MP2/6-311G(2df,p) force symm=loose MaxDisk=250000000 test
…
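The preprocessing step can be pictured roughly as follows. This is only a sketch under stated assumptions, and the actual logic in g16.job may differ: the function name add_link0 is hypothetical, the node list is assumed to come from a PBS-style node file with one line per allocated core, and the choice to omit %LindaWorkers on a single node is an assumption.

```bash
#!/bin/bash
# Hypothetical preprocessor sketch; the real g16.job may work differently.
# Usage: add_link0 NODEFILE PPN < input > preprocessed_input
add_link0() {
    local nodefile=$1 ppn=$2 workers
    # A PBS node file repeats each node once per core; keep the first
    # occurrence of each node, in order, and join them with commas.
    workers=$(awk 'NF && !seen[$1]++ { print $1 }' "$nodefile" | paste -sd, -)
    echo "%NProcShared=$ppn"
    # Assumption: Linda is only needed when more than one node is allocated.
    case $workers in
        *,*) echo "%LindaWorkers=$workers" ;;
    esac
    # Pass the user's input through unchanged.
    cat
}
```

Inside a job, a call like add_link0 "$PBS_NODEFILE" 32 < foo.inp would prepend the two Link0 lines shown above to the input; foo.inp is a placeholder file name.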
Gaussian will echo its input line by line in its log file:
Entering Gaussian System, Link 0=/opt/apps/gaussian/16-C.01-1/libexec/avx2/g16/g16
Initial command:
/opt/apps/gaussian/16-C.01-1/libexec/avx2/g16/l1.exe "/tmp/3174927.sched5.carboncluster/Gau-38174.inp" -scrdir="/tmp/3174927.sched5.carboncluster/"
Default linda workers: n532,n518
Default is to use 16 SMP processors on each worker.
Entering Link 1 = /opt/apps/gaussian/16-C.01-1/libexec/avx2/g16/l1.exe PID= 38178.
Copyright (c) 1988-2019, Gaussian, Inc. All Rights Reserved.
This is part of the Gaussian(R) 16 program. It is based on
the Gaussian(R) 09 system (copyright 2009, Gaussian, Inc.),
the Gaussian(R) 03 system (copyright 2003, Gaussian, Inc.),
…
Cite this work as:
Gaussian 16, Revision C.01,
…
******************************************
Gaussian 16: ES64L-G16RevC.01 3-Jul-2019
22-Jan-2025
******************************************
%mem=1GB
%chk=d3h_dec.chk
SetLPE: new flags="-opt 'Tsnet.Node.lindarsharg: ssh' -nodelist 'n532.carboncluster n518.carboncluster' -env GAUSS_MDEF=134217728 -env GAUSS_EXEDIR="/opt/apps/gaussian/16-C.01-1/libexec/avx2/g16:/opt/apps/gaussian/16-C.01-1/lib/bsd""
Will use up to 2 processors via Linda.
-------------------------------------------
#P ub3lyp/6-31g(d) opt freq scf=nosymm test
-------------------------------------------
…
- The last line will be specific to your Gaussian route section.
- I highly recommend using the #P (= "profuse") option to better track progress through Gaussian's link stages.
- The term processors is used inconsistently here. For shared memory, it means cores, but for Linda it means nodes.
- You can track the use of Linda by:
grep exel file.out
- Gaussian may decide to not use Linda after all. A note like this will appear in the log file:
PrsmSu: requested number of processors reduced to: 6 ShMem 1 Linda.
- In this case, decide whether it would be better to resubmit the job rather than leave nodes allocated by PBS idle. Gaussian's decision may differ between Linda links; only the longest-running link is relevant when you evaluate resubmission.
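The checks described in this list can be combined into one small helper. This is a sketch, not a tool provided with the module: the function name linda_report is hypothetical, and it relies only on the two log messages quoted above appearing with that wording.

```bash
#!/bin/bash
# Hypothetical helper: summarize Linda use from a Gaussian log file.
# Returns 1 if Gaussian reduced the processor request anywhere, 0 otherwise.
linda_report() {
    local log=$1
    # Upper bound announced at startup ("Will use up to N processors via Linda.").
    grep 'processors via Linda' "$log"
    # Links where Gaussian scaled back its request ("PrsmSu: ..." lines).
    if grep -q 'number of processors reduced to' "$log"; then
        grep 'number of processors reduced to' "$log"
        return 1
    fi
    return 0
}
```

A nonzero return status only flags that a reduction occurred somewhere in the job. Whether it affects the longest-running link still has to be judged from the log itself.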
Error messages
When running jobs on more than one node (i.e., using Linda), you may see messages like these in a job's error stream:
eval server 0 on n456.carboncluster has dropped it's connection.
eval server 0 on n456.carboncluster has dropped it's connection.
subprocess pid = 11505 has exited. status = 0x0000, id = 0, state = 17. command was /opt/..../linda.../opteron …
died after signing in successfully
eval server 0 on n456.carboncluster has dropped it's connection.
…
- These messages can be ignored, according to http://umbc.rnet.missouri.edu/resources/How2RunGAUSSIAN.html :
These messages are not indicative of a problem. They indicate that the Linda work in a Gaussian link is finished, and that Gaussian is continuing with a new link.
- The environment variable GAUSS_FLAGS (as set in the "gaussian" modulefile) is now configured to include the flag -v, to better trace Linda operations on worker nodes. Additional lines like the following will show up in a job's error stream:
ntsnet: using executable file /opt/apps/gaussian/16-C.01-1/libexec/avx2/g16/l302.exel
ntsnet: trying to schedule 1 worker
ntsnet: scheduled a total of 1 worker
ntsnet: starting master process on n456.carboncluster
ntsnet: starting 1 worker on n456.carboncluster
- This example shows how one link within a job of #PBS -l nodes=2:ppn=16 gets executed under Linda.