HPC/Module naming scheme 2016: Difference between revisions

From CNM Wiki
== Introduction ==
= Properties =
{| class="wikitable" style="width: 20%; float: right"
| (*) ''Environment modules'' are the means by which software applications are made available to users. Using modules allows users and administrators to pick, by user or even by compute job, a desired or default version of an application from typically several current and historic versions persisting on the system. Older versions are kept to improve reproducibility of results, an important characteristic of scientific computing.
|}
On Carbon, the environment modules* system has changed in the following aspects:
* the naming scheme,
* default and dependent modules no longer being loaded,
* minor associated command behaviors.
The changes were necessary because of increasing diversity and interdependence of applications, their modules, and the underlying operating system.
The goal of the changed naming scheme has been to:
* expand the schemes for (a) installation directories and (b) module names to enable addressing and choosing versions by differing:
** OS release,
** compilers,
** MPI flavors, and,
** for future evolution: different aspects of the machine architecture like CPU generation, capabilities, and coprocessor facilities,
* accommodate and enable ''existing'' application versions to run on the newer OS release,
* install and make available ''new'' application versions where suitable, either on both old and newer OS releases, or only one of them.
The naming scheme and usage conventions for environment modules on Carbon are as follows.


== Naming scheme (Nomenclature) ==
The full name of a module has several name components that are separated by a slash <code>/</code>. The application's main name and author-provided version, along with a local build number, form the first and last component of the module name. Other name components, if present, make apparent to the user which set of major tools was used to produce the application locally,
=== Overview ===
which usually translates to which modules are needed to run the application.
The full name of a module has two or more components separated by slashes <code>/</code>, in one of the following forms:
 
In detail, module names have one of the following forms:
  ''name/api/version-build'' # binary packages, compilers
  ''name/api/compiler/version-build'' # compiled applications
  ''name/api/mpi/compiler/version-build'' # compiled applications that use MPI
* <code>''name''</code> is the package's name as chosen on ''Carbon''. It is usually all lowercase and may contain a number if so named by the software author, e.g. <code>fftw3</code>.
* The first component is the application's ''main name'', usually as chosen by its author.
* <code>''api''</code> is the leading part of the package version, which typically signifies the [https://en.m.wikipedia.org/wiki/Application_programming_interface API] level across which different versions are expected to be compatible. It is typically a sole ''major'' version number, or has the form ''major.minor''. Loading a module that has the full name <code>foo/m.n/compiler/version-build</code> by the abbreviated name <code>foo/m.n</code> enables you to select the features and binary compatibility level that you need without having to give a complete module name all the way down to a build number.
* The last component is the ''version number'', also usually chosen by the author, followed by a ''build identifier'' chosen on ''Carbon''.
* <code>''compiler''</code> is a component that is present only ''as needed'' when an application was ''compiled'' here and needs runtime libraries associated with the compiler used. The compiler name component is typically not present for applications installed from binary packages, notably commercial applications and, naturally, compilers themselves.
* Other name components may be present and indicate which set of major tools were used to produce the application locally, which usually implies which modules are required to be loaded to run the application.
* <code>''mpi''</code>, also ''as needed'', is present to denote an [https://en.m.wikipedia.org/wiki/Message_Passing_Interface MPI] flavor in use for parallel computations.
For reference, the previous scheme has been as follows:
''name/version-build''
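To make the three new-scheme name forms concrete, here is a minimal sketch that splits a full module name at the slashes and labels each component by its position. The function name <code>split_module_name</code> is hypothetical and not part of the modules system; the sketch assumes a complete new-scheme name with three to five components.

```shell
# Hypothetical helper (not part of the modules system): label the
# components of a full new-scheme module name by position.
split_module_name() {
    # Split the argument on "/" into the function's positional parameters.
    # Assumes the name contains no whitespace or glob characters.
    old_ifs=$IFS
    IFS=/
    set -- $1
    IFS=$old_ifs
    case $# in
        3) echo "name=$1 api=$2 version-build=$3" ;;
        4) echo "name=$1 api=$2 compiler=$3 version-build=$4" ;;
        5) echo "name=$1 api=$2 mpi=$3 compiler=$4 version-build=$5" ;;
        *) echo "unrecognized form: $# components" >&2; return 1 ;;
    esac
}

split_module_name fftw3/3.3/impi-5/intel/3.3.4-10
# prints: name=fftw3 api=3.3 mpi=impi-5 compiler=intel version-build=3.3.4-10
```

Abbreviated names such as <code>fftw3/3.3</code> are deliberately rejected by this sketch, since their remaining components are only resolved by the module system at load time.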


'''''Example:''''' Names for the FFTW3 library module in the old and new naming schemes, queried by the <code>module avail</code> shell command:
=== Details ===
{| class="wikitable" <!-- style="width: 50%;" -->
* <code>''name''</code> is the package's name as chosen on ''Carbon''. It is the name given by the software's author, lowercased for consistency. It may contain numbers if they are customarily part of the name, <code>fftw3</code> being a prime example.
! style="width: 50%;" | Scheme 2 (current)
* <code>''api''</code> is the leading part or parts of the package's version number which indicates to suitable precision the [https://en.m.wikipedia.org/wiki/Application_programming_interface API] level across which different package versions are expected to be compatible (interchangeable in terms of features). The <code>''api''</code> component typically has one of these forms:
! style="width: 50%;" | Scheme 1 (being retired)
*: <code>''major''</code>
|-
*: <code>''major.minor''</code>
| style="vertical-align:top;" |
: The specificity is subject to an administrator's intuition and understanding of the intentions of the package's author, and may well turn out to be incorrect in the future after unexpected turns in a package's development process (''caveat emptor'').
$ '''module -t avail fftw3'''
* <code>''compiler''</code> is a name component that is present when an application was ''compiled'' here and thus usually needs runtime libraries associated with the compiler used. The <code>''compiler''</code> name component is not strictly needed for applications that are statically linked, or come with all their own libraries, but can be present even then for informative purposes. The name component is typically ''absent'' for applications installed as binaries, notably commercial applications and, naturally, compilers themselves. The name component typically has sub-components of the form:
<span style="color:#888;">/opt/apps/M/x86_64/EL6:</span>
*: <code>''compilerNAME''-''compilerAPI''</code>
<span style="color:#999;">/opt/apps/M/x86_64/EL:</span>
* <code>''mpi''</code>, present ''when needed'', denotes the [https://en.m.wikipedia.org/wiki/Message_Passing_Interface MPI] flavor in use for parallel computations. This name component typically also has sub-components of the form:
fftw3/3.3/'''''impi-5'''''/intel/3.3.4-10 # uses Intel-MPI
*: <code>''mpiNAME''-''mpiAPI''</code>
fftw3/3.3/intel/3.3.2-1 # legacy, serial only
<!-- == Changes requiring your attention ==
fftw3/3.3/'''''openmpi-1.10'''''/intel/3.3.4-11 # uses OpenMPI
The nitty-gritty [[HPC/Module changes 2016 – Details|'''details of the changes''']] are listed separately.
fftw3/3.3/'''''openmpi-1.4'''''/intel/3.3.2-4 # legacy, OpenMPI
-->
<span style="color:#777;">/usr/share/Modules/modulefiles:</span>
<span style="color:#777;">/etc/modulefiles:</span>
|
$ '''module -t avail fftw3'''
<span style="color:#888;">/opt/soft/modulefiles:</span>
fftw3/3.2.1-1
fftw3/3.2.2-1
fftw3/3.3-1
fftw3/3.3.2-1
  fftw3/3.3.2-4
<span style="color:#777;">/opt/intel/modulefiles:</span>
<span style="color:#777;">/usr/share/Modules/modulefiles:</span>
<span style="color:#777;">/etc/modulefiles:</span>
|-
|}
: The <code>-t</code> option of <code>module avail</code> shows the output in "terse" form, one entry per line. Lines ending in ":" are the search directories traversed, as defined <!--at the system level -->by <code>module use ''dirname''</code> statements.


== Module loading ==
<!-- {| class="wikitable" style="width: 20%; float: right"
| (**) Technically, the system ''does load'' the module <code>profile/user</code> for you. This module only contains the instructions to select and read the appropriate <code>.modules-*</code> file.
|} -->
=== No pre-loading ===
No compiler or MPI modules are pre-loaded by the system.
The system only loads the "meta" module <code>profile/user</code> for you,
which merely looks for and loads your own <code>.modules-*</code> files.


You may need to adapt the module names that you placed in your shell startup and job files to the new more hierarchical scheme.
=== No recursive loading ===
For most modules, with exceptions shown below, the leading name component (the part before any "/") is the same in the old and new naming schemes.
A module in general does not load other modules that it might depend on,
What always differs are the name parts after the first slash.
such as modules for compilers, an MPI flavor, or specialized libraries.
<!-- for <code>module load ''name''</code> commands  -->


=== Name change rules for most modules ===
== System-specific command files ==
* To use the latest or automatically selected version of a package, remove version numbers from old-style module names of the form <code>''packagename/version''</code>, leaving only <code>''packagename''</code>. This is the recommended approach, as you will automatically benefit from future updates and maintenance builds.
<!-- How to customize your module selection by OS release)
* To insist on a specific version for a package in new style names:
Previously, modules were loaded from the <code>~/.bashrc</code> file and, with some caveats, from PBS job files.
** Inspect the available flavors and versions (some older modules were not migrated):
With different OS releases active in the cluster, it is now recommended to place module commands into OS-specific files.
**: <code>module avail ''packagename''</code>.
-->
** Choose the new-style name up to the desired specificity. You may leave out trailing name or directory parts.
For a time, nodes with different operating systems and therefore more or less
**: For instance, instead of <code>vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-3</code> you may write <code>vasp5/5.3/openmpi-1.4</code> or <code>vasp5/5.3</code>, letting the system choose the versions for MPI and compiler that are chosen as defaults at a given time.
''different module catalogs'' will coexist in the cluster.
Since you will always have the same home directory on each node, most of your
script files would have to be written so they run on either operating system.
This would mean having to code <code>if</code> statements in your scripts,
which can be difficult and fragile to keep up.
To simplify conditional module selection, each node on ''Carbon'' now looks for
''specific file names'' in your home directory. Depending on which files exist, a node:
# determines whether to use the legacy or new naming scheme, and then
# loads the file that is appropriate for the operating system running on that node, in the scheme determined.


=== Name change exceptions ===
Specifically:
For the following modules the newer naming convention allowed for and thus uses more consistent names:
* The mere existence of files of the form <code>~/.modules-el''x''</code>, with ''x'' = 5,6,..., activates the new naming scheme. Subsequently, the file will be loaded and interpreted (in Tcl language) on the corresponding OS.
<source lang="bash">
* On CentOS-5 nodes, a file <code>~/.modules-el5-legacy</code> will trigger the legacy scheme, but only if <code>~/.modules-el5</code> is not present. In other words, a <code>~/.modules-el5</code> file has priority and causes <code>~/.modules-el5-legacy</code> to be ignored.
OLD NEW
* Without any <code>~/.modules-*</code> files, CentOS-5 nodes will use the legacy scheme; CentOS-6 nodes always use the new scheme.
-------------------------------------
* Your <code>.bashrc</code> file will always be read, expecting the naming scheme that was determined by the presence or absence of <code>.modules-*</code> files.
asap3 asap/3
* When running a PBS job, module commands in the job file will be read, in the same naming scheme as <code>.bashrc</code>.


ase2 ase/2
ase3 ase/3
g09 gaussian/09
GaussView gaussview  (lowercase)
</source>
<!--
<!--
fftw3 fftw/3.3
Place your desired  <code>module load</code> commands in ''specific files'', as follows:
vasp vasp/4
vasp5 vasp/5
: Note: Licensees of Vasp-4 only ''must'' specify vasp'''/4'''. The default module for "vasp" is under vasp/5.
-->
: The modules <code>fftw3</code> and <code>vasp5</code> did ''not'' change name due to more entrenched usage in the package itself, Unix group names, and compilation dependencies.
 
 
== Active naming scheme varies by ''operating system'' ==
During a transition period, nodes with different operating systems will coexist in the cluster.
Each user, nonetheless, will see the same networked home directory on each node regardless of the OS that the node runs.
Therefore, the user's shell and module configuration files (like <code>.bashrc</code> and others)
will be interpreted by system utilities, end user applications, and in runtime environments that all ''differ by operating system'', to a varying degree.
 
=== Solution ===
The module scheme that will be used on a node is primarily determined by its operating system, secondarily by the user:
* On nodes running CentOS-6, both login and compute, scheme 2 is used always.
* On CentOS-5 nodes, scheme 1 is normally used. A user can customize or upgrade to scheme 2 by creating configuration files which will be read within the scheme they activate.
 
=== Caveats ===
* The home directory is the same across nodes, and therefore a user's configuration scripts are sensitive to differences between OS releases.
* Applications will typically be compiled and installed on a node that runs the more recent OS release. Some compiled applications are backwards-compatible with a previous OS release, and will be made available there. If we find in practice that this fails, the respective module will be moved such that it is visible only on the suitable OS release. Scheme 2 will appropriately offer applications that run on ''both'' or only a ''specific'' release of the operating system.


=== Scheme selection rules ===
In summary, the files will be detected and read as follows:
{| class="wikitable" style="text-align:center;  margin: 1em auto 1em auto;"
: {| class="wikitable"
! rowspan=2 colspan=2 | You have files:<br/>.bashrc and …
! rowspan=2 colspan=2 | You have files:<br/>.bashrc and …
! style="color:#888;" rowspan=2 | Remark
! style="color:#888;" rowspan=2 | Remark
! colspan=2 | CentOS-6 uses:
! colspan=2 | CentOS-5 reads:
! colspan=2 | CentOS-5 uses:
! colspan=2 | CentOS-6 reads:
! colspan=2 | CentOS-7 reads:
|-
|-
! files
! module names
! files
! files
! module names
! module names
! module names
! module names
|-
|-
| – || – || style="color:#888; text-align: left;" | Starting situation. || style="background:#ddf;" | .bashrc only || style="background:#ddf;" rowspan=4 | scheme 2 || style="background:#ffd;" | .bashrc only || style="background:#ffd;" | scheme 1
| – || – || style="color:#888; text-align: left;" | Starting situation || style="background:#ffd;" | only .bashrc || style="background:#ffd;" | legacy scheme || style="background:#ddf;" | only .bashrc || style="background:#ddf;" rowspan=3 | new scheme || style="color:#888; background:#ddd; text-align:center;" rowspan=3 colspan=2 | To be determined.
|-                                                                                                                                                                                                                             
|-
| – || .modules-2 || style="color:#888; text-align: left;" | Switch over, ''recommended.'' || style="background:#ddf;" | .modules-2 and .bashrc || style="background:#ddf;" | .modules-'''2''' and .bashrc || style="background:#ddf;" | scheme 2
| '''.modules-el5-legacy''' || '''.modules-el6''' || style="color:#888; text-align: left;" | Compatibility scheme || style="background:#ffd;" | .modules-el5-legacy and .bashrc || style="background:#ffd;" | scheme '''1''' || style="background:#ddf;" | .modules-el6 and .bashrc
|-                                                                                                                                                                                                                             
|-
| .modules-1 || – || style="color:#888; text-align: left;" | Only recommended during transition. || style="background:#ddf;" | .bashrc only || style="background:#ffd;" | .modules-1 and .bashrc || style="background:#ffd;" | scheme 1
| '''.modules-el5''' || '''.modules-el6''' || style="color:#888; text-align: left;" | Full switch-over, ''recommended'' || style="background:#ddf;" | .modules-'''el5''' and .bashrc || style="background:#ddf;" | scheme '''2''' || style="background:#ddf;" | .modules-el6 and .bashrc
|-                                                                                                                                                                                                                              
|-
| .modules-1 || .modules-2 || style="color:#888; text-align: left;" | For advanced users. || style="background:#ddf;" | .modules-2 and .bashrc || style="background:#ffd;" | .modules-1 and .bashrc || style="background:#ffd;" | scheme 1
|}
|}
-->
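The file-detection rules for <code>~/.modules-el''x''</code> and <code>~/.modules-el5-legacy</code> can be summarized in a short sketch. The function <code>select_modules_config</code> is hypothetical and only illustrates the decision logic; it is not the code that ''Carbon'' actually runs. It assumes that a CentOS-5 node without an <code>el5</code>-specific file falls back to the legacy scheme, and it prints the scheme followed by the file that would be read (or <code>-</code> if only <code>.bashrc</code> applies).

```shell
# Hypothetical sketch of the scheme selection rules; not the actual
# system implementation.
#   $1: home directory to inspect
#   $2: OS major release of the node, e.g. 5 or 6
select_modules_config() {
    home=$1
    el=$2
    if [ -e "$home/.modules-el$el" ]; then
        # An OS-specific file activates the new scheme and is read.
        echo "new $home/.modules-el$el"
    elif [ "$el" = 5 ] && [ -e "$home/.modules-el5-legacy" ]; then
        # Considered only when .modules-el5 is absent.
        echo "legacy $home/.modules-el5-legacy"
    elif [ "$el" = 5 ]; then
        # Assumption: CentOS-5 falls back to the legacy scheme.
        echo "legacy -"
    else
        # CentOS-6 nodes always use the new scheme.
        echo "new -"
    fi
}

# Example: report what the current home directory yields on either release.
select_modules_config "$HOME" 5
select_modules_config "$HOME" 6
```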
; Tips:
* To load the ''same'' modules on any node, in the new scheme:
** Place your configuration in <code>~/.modules-el6</code>, then create a symbolic link:
*: <code>cd; ln -s .modules-el6  .modules-el5</code>
** It is possible but not recommended (because it is less future-proof) to keep your module selection in <code>~/.bashrc</code>, and activate the new scheme on CentOS-5 nodes by simply creating an empty configuration: <code>touch ~/.modules-el5</code>
<!--
* '''Caution:''' Avoid the following file constellations because they can easily become confusing:
:* Only one of these files present: <code>.modules-el5-legacy</code>, or <code>.modules-el5</code>, or <code>.modules-el6</code>.
:* Both <code>.modules-el5-legacy</code> and <code>.modules-el5</code> present. While helpful for transitioning, remember that the former file will be ignored as soon as the latter exists.
: You may get errors on nodes that do not read these files, or you might find that you need conditional logic in your .bashrc file.  Recall that your home directory is the same across nodes, and therefore your configuration scripts are sensitive to differences between OS releases.
-->


== Available modules ''differ between naming schemes'' ==
== Workflow to determine module names to load ==
* Newer modules will only be made available in scheme 2.
{| class="wikitable" style="width: 20%; float: right"
* A subset of modules from scheme 1 have been ported to scheme 2, typically only modules representing the most recent version of an application.
| (***) '''Caution:''' Module names are typically sorted as text strings the same way as Unix ls(1) does it. The resulting order may be counter-intuitive when the same part of several version numbers has a different number of digits. In the following example, version 8.16.x is the most recent release and needed to be administrator-designated as default because character for character, the string 8.7.x would be sorted highest.<br><code>$ module avail lumerical<br>...<br>lumerical/7.5.7-1<br>lumerical/8.0.5-1<br>lumerical/8.15.736-1<br>lumerical/'''8.16.931-1(default)'''<br>lumerical/8.5.4-1<br>lumerical/8.7.1-1<br>...</code>
* The previous module naming scheme 1 is being retired, along with some of its attendant conventions.
|}
To determine a suitable module name for a desired package:
# On a ''Carbon'' command line, list the available flavors and versions, keeping in mind that some older modules were not migrated:
#: <code>module avail</code>
#: <code>module avail ''packagename''</code>
#: (When upgrading from the previous naming scheme, remove version numbers from names of the form <code>''packagename/version''</code>, leaving only <code>''packagename''</code>.)
# Use the <code>module show</code> command to inspect details of a module, particularly its full name:
#: <code>module show ''name/api''</code>
#: The command will use the name you gave to determine a suitable subset of all available modules, pick either a designated default or the highest-sorting version(***) from that subset, and finally show the details for that single version only.
# Complete the desired package's name by appending API, MPI, and/or compiler specifications as needed, and repeat the previous step.
# [[#Understanding dependency errors|Determine a package's dependencies]].
#: Inspect the output of <code>module show …</code>, look for any <code>prereq</code> statements, and load those on a previous line.
'''Only give full versions if:''' you require a build with a specific feature or behavior (such as to reproduce prior results with numeric consistency). To do so:
:5. Append version and build specifications, as shown by the <code>module avail ''packagename''</code> command.
<!--
** Choose the new-scheme name up to the desired specificity. You may leave out trailing name or directory parts.
-->
 
; Example:
:* Let's load a vasp5 module that uses the Intel-MPI flavor ("impi"):
$ '''module avail vasp5'''
------------------------------------------------------------ /opt/apps/M/x86_64/EL ------------------------------------------------------------
vasp5/5.3/openmpi-1.4/intel/5.3.2-mkl-beef-1    '''''vasp5/5.4/impi-5/intel-16/5.4.1.3-6'''''
vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-3      vasp5/5.4/openmpi-1.10/intel-16/5.4.1.3-6
vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-cellz-1
:* Let's see which version would be loaded using an abbreviated name:
$ '''module show vasp5/5.4'''
-------------------------------------------------------------------
/opt/apps/M/x86_64/EL/vasp5/5.4/'''''openmpi-1.10'''''/intel-16/5.4.1.3-6:
:* Careful: The first output line shows the full file name of the module that would get loaded by the short name. In this case, the abbreviated module name, having no MPI name component, yields a module that uses a different MPI flavor than you want.
:* You will need to be more explicit:
$ '''module show vasp5/5.4/impi-5'''
-------------------------------------------------------------------
/opt/apps/M/x86_64/EL/'''''vasp5/5.4/impi-5/intel-16'''''/5.4.1.3-6:
module-whatis VASP - Vienna Ab-initio Simulation Package
conflict vasp
conflict vasp-vtst
'''''prereq intel/16'''''
'''''prereq impi/5'''''
setenv VASP5_HOME /opt/apps/vasp5/5.4.1.3-6-impi-5-intel-16
prepend-path PATH /opt/apps/vasp5/5.4.1.3-6-impi-5-intel-16/bin
setenv VASP_COMMAND vasp-ase
setenv VASP_PP_PATH /opt/soft/vasp-pot/ase
-------------------------------------------------------------------
:* Therefore, you'd need to add the following lines to your <code>.modules-el6</code> file:
module load '''''intel/16'''''
module load '''''impi/5'''''
module load '''''vasp5/5.4/impi-5'''''
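The sorting caveat for version numbers mentioned in this section can be reproduced outside the module system. The following sketch compares plain text sorting with version-aware sorting for the lumerical names quoted in the caution. It only illustrates the ordering problem; it does not show how the module system itself picks a default, and it assumes GNU <code>sort</code>, whose <code>-V</code> option is a GNU extension. The function names are hypothetical.

```shell
# Module names taken from the caution about text-based sorting.
versions='lumerical/7.5.7-1
lumerical/8.0.5-1
lumerical/8.15.736-1
lumerical/8.16.931-1
lumerical/8.5.4-1
lumerical/8.7.1-1'

# Highest entry when names are compared character by character.
highest_by_text() { printf '%s\n' "$1" | LC_ALL=C sort | tail -n 1; }

# Highest entry when numeric fields are compared as numbers (GNU sort -V).
highest_by_version() { printf '%s\n' "$1" | sort -V | tail -n 1; }

highest_by_text "$versions"      # prints: lumerical/8.7.1-1
highest_by_version "$versions"   # prints: lumerical/8.16.931-1
```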


== No default modules loaded by the system ==
== Best Practices ==
* Users must load all applicable modules in their shell setup files or job files, in suitable order.
=== Use migration utility ===
: This is born of necessity because still useful older applications were compiled with older MPI versions (typically OpenMPI-1.4) which partially interfere with newer versions (OpenMPI-1.8, 1.10, or Intel-MPI-5.x). In particular, commands like <code>mpirun</code> and <code>mpifort</code> are typically provided by each MPI flavor.
Use a helper utility to get started on splitting off and diversifying your existing module selection from .bashrc into .modules-* files:
'''modules-migrate'''
: The utility will manage the following files:
.bashrc
.modules-el5-legacy
.modules-el6
: See an [[Sandbox/Migration example|'''example output''']] of running the utility.
* Please note that the utility is fairly basic and cannot transform or choose versions and their dependencies as described here.
* The utility will give you the opportunity to inspect and edit the resulting files. Use a text editor of your choice, such as <code>nano</code> or <code>vi</code> to re-examine or edit these files further.
* To switch to the new scheme on all nodes, create or copy from <code>.modules-el6</code>:
.modules-el5
* [[#Test your module choices|Test your module choices]], as shown below.


== Modules do not load dependencies ==
=== Omit detailed versions from module names ===
* A module does not implicitly load other modules that it might depend on, such as modules for compilers, an MPI flavor, or specialized libraries. While a minor burden for the user to specify, this will make operations more explicit and predictable.
When constructing a <code>module load</code> command, try to omit detailed version and build numbers from the end, i.e., load a module that has the full name <code>foo/m.n/compiler/version-build</code> by an abbreviated name <code>foo/m.n/compiler</code>.
* Learn to recognize error messages issued by <code>module load</code> when a dependent module is found missing. For instance:
: Module names that are abbreviated in this manner will be completed at the time of loading to select a default, which is either a version designated as such by an administrator or simply the version with the highest version number. In any case, with abbreviated module names you will benefit from newer modules that have been installed since you last looked. Version numbers are generally chosen by package authors so that packages with the same major version number are binary-compatible.
 
For instance, instead of:
module load intel/16/16.0.0-3
module load openmpi/1.10/intel-16/1.10.0-4
Write:
module load intel'''/16'''
module load openmpi'''/1.10/intel-16'''
 
It is preferable to supply the compiler name part of MPI modules (here <code>…/intel-16</code>) because they usually both (a) need compiler libraries and (b) implicitly use their native compiler for further compilations.
 
=== Modules load order ===
To meet module dependencies, edit your configuration or job files to load required modules first, in this order:
# compilers
# MPI flavor
# other libraries that are dynamically loaded.
# your desired application(s).
 
=== Understanding dependency errors ===
Learn to recognize error messages from <code>module load</code> when a required module has not been loaded:
 
'''''Example:''''' A typical error message will look like:
  $ '''module load openmpi/1.10'''
  openmpi/1.10/intel-16/1.10.2-1(27):ERROR:151: Module '<span style="color:brown;">openmpi/1.10/intel-16/1.10.2-1</span>' '''''depends on one of the module(s)''''' '<span style="color:blue;">intel/16/16.0.2 intel/16/16.0.1-2 intel/16/16.0.0-3 intel/16/16.0.0-1 intel/16/16.0.0-0</span>'
  openmpi/1.10/intel-16/1.10.2-1(27):ERROR:102: '''''Tcl command execution failed: <span style="color:red;">prereq intel/16</span>'''''
: Colors do not appear in the original terminal output but were added here for clarity. The <span style="color:red;">red item</span> in the last line shows the prerequisite as stated in the text of the module attempted to be loaded. The currently available modules that would satisfy the requirement are listed in <span style="color:blue;">blue</span>. The full name of the offending module that was located from the abbreviated name given in the command appears in <span style="color:brown;">brown</span>.
: Colors do not appear in the original terminal output but were added here for clarity:
* You must load prerequisite modules yourself before modules that depend on them, by full or (recommended) abbreviated name. The typical order is: compiler, MPI, other (dynamic) libraries, and finally your intended module. In the example above, a suitable command would be:
:* The missing prerequisite is the <span style="color:red;">red item</span> on the last line. <!-- as stated in the programming code of the module that you attempted to load -->
  '''module load  intel/16  openmpi/1.10/intel-16'''
:* The modules that would currently satisfy the requirement are shown on the preceding line, indicated here in <span style="color:blue;">blue</span>.
: Note how in this case it is preferable to explicitly supply the compiler name component <code>…/intel-16</code> of the openmpi module to ensure it is consonant with the compiler.
:* The full name of the "offending module", deduced from a possibly abbreviated name on the command line, appears in <span style="color:brown;">brown</span>.
* Different MPI flavors can (and may have to) be loaded at the same time. In this situation, MPI commands will have to be called by absolute path, e.g. <code> $OPENMPI_HOME/bin/mpirun …</code>
* You can inspect the prerequisites of a module in a more succinct manner:
: Again, this is because the system is less homogeneous than in the past: it is impractical or even impracticable to upgrade and maintain applications at a single "one true" MPI implementation.
  $ '''module show openmpi/1.10''' <span style="color:blue;">2>&1</span> '''| grep req'''
<span style="color:red;">prereq intel/16</span>
: The sequence <code><span style="color:blue;">2>&1</span></code> is necessary so the pipe <code>|</code> captures the ''entire'' output of the <code>module show</code> command, i.e., combining its stdout and stderr streams.
<!--
$ '''module show''' vasp5/5.4/impi-5 <font color="red">|&</font> '''grep''' prereq
prereq intel/16
prereq impi/5
-->
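Building on the <code>module show … 2>&1 | grep req</code> idiom, the following sketch turns the <code>prereq</code> lines of a module description into <code>module load</code> lines that could be pasted ahead of the module itself. The helper <code>prereqs_to_loads</code> is hypothetical. It assumes one <code>prereq</code> statement per line and uses only the first module named on each line, since additional names on the same line are alternatives.

```shell
# Hypothetical helper: read "module show" output on stdin and print one
# "module load" line per prereq statement (first alternative only).
prereqs_to_loads() {
    awk '$1 == "prereq" { print "module load " $2 }'
}

# On Carbon the input would come from the module command, for instance:
#   module show vasp5/5.4/impi-5 2>&1 | prereqs_to_loads
# Here, sample lines from such output are fed in directly:
printf 'conflict vasp\nprereq intel/16\nprereq impi/5\n' | prereqs_to_loads
# prints:
#   module load intel/16
#   module load impi/5
```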


== Migration guide ==
=== "module purge" command ===
Or: How to customize your module selection by OS release.
Previously on ''Carbon'' it was difficult to reset the module selection during an interactive terminal session, <!--, or to make and adjust a custom set of module choices. -->
 
because the commands for the job queueing system, like <code>qsub</code>, were provided via a module.
=== Configure in .bashrc ===
You may now safely use the module "purge" command for its intended purpose, as
To switch over to new-style module names ''entirely'', on both CentOS releases, you could continue using only <code>~/.bashrc</code>.
To do this:
* Tell CentOS-5 nodes to offer you the new-style module catalog instead of the old one. To do so, simply create an empty customization file:
<source lang="bash">
<source lang="bash">
touch ~/.modules-2
module purge
</source>
</source>
* Then, apply the changes shown in the [[#Name change rules]] section above:
followed by <code>module load …</code> to choose compilers, MPI flavors, and applications.
 
; Expert Tip – Purge and reload:
To re-load the customizations from your <code>.modules-*</code> files using the module <code>profile</code>:
<source lang="bash">
<source lang="bash">
vi .bashrc
module purge
# or:
module load profile
nano .bashrc
</source>
</source>
: Use a text editor of your choice, such as <code>nano</code> or <code>vi</code>.
: ''Note:'' This does not reload any modules seen in the <code>.bashrc</code> file.
<!--
 
# Create a customization file for new names, but leave it empty (you'll configure your choices in the next step). Use the command: <source lang="bash">touch ~/.modules-2</source>
=== Test your module choices ===
# Edit your <code>.bashrc</code> file and update your module selections as follows:
==== Automated test ====
-->
Use the test built into the migration utility:
modules-migrate '''-t'''
modules-migrate '''--test'''
This will simulate loading your existing <code>.modules-*</code> files under the available operating systems.
Review the output. To correct any errors, edit the respective files manually or use the migration utility again:
modules-migrate '''-e'''
modules-migrate '''--edit'''
 
==== Manual test ====
{| class="wikitable" style="float:right; margin-left: 10px;"
! Same node
! EL5 node
! EL6 node
|-
| <code>bash -l</code>
| <code>ssh clogin5</code>
| <code>ssh clogin8</code>
|-
| colspan=3 | <code>module list</code>
|-
| colspan=3 | <code>exit</code>
|-
|}
To test your new module configuration in your actual environment:
# Open another login shell on the current or another node.
# Review error messages that might appear before your prompt.
# Inspect which modules are loaded.
# Edit your <code>.modules-*</code> files and address any errors.
# Close the test shell and repeat until your desired modules are loaded without errors.


=== Configure in dedicated files ===
==== Test in a job file ====
It is perhaps cleaner to perform the module selection in files dedicated for each CentOS release,
Your module selection is likely most important in a PBS job file.
so that you can make adjustments independently.
To avoid the hassle of extended wait times for production jobs,
use test jobs with a ''short walltime'' limit and place just diagnostic commands in the job script.


To migrate your existing configuration, manage the following files:
Use the <code>module list</code> and <code>type</code> shell commands to verify that all your modules are loaded
<pre>
and that an application is properly callable without full paths.
.bashrc
; Example:
.modules-1
Consider the following job file <code>modtest.sh</code>:
.modules-2
<source lang="bash">
</pre>
#!/bin/bash
Use a helper application to get you started:
#PBS -l nodes=1:ppn=1
'''modules-migrate'''
#PBS -l walltime=0:1:00
#PBS -N modtest


[[../Migration example]]
module list


<!--
type vasp_gam
=== Differentiate module selection by OS release ===
</source>
If you encounter difficulties with making your module selection work simultaneously for CentOS-5 and CentOS-6, use ''separate'' configurations instead.
Submit the job:
qsub modtest.sh
Alternatively, do the whole thing on the command line, without the need for a separate file:
<source lang="bash">
echo "module list; type vasp_gam" | qsub -l nodes=1:ppn=1,walltime=0:1:00 -N modtest
</source>
In either case, wait for the job to finish, then inspect the output files:
qstat ''jobnumber''
head modtest.[eo0-9]*
You should see something like:
'''vasp_gam is''' /opt/apps/vasp5/5.4.1.3-6-openmpi-1.10-intel-16/bin/vasp_gam
An error looks like:
-bash: type: vasp_gam: '''not found'''
 
=== Effect on PBS job submissions ===
==== Loading modules in job files ====
* You may now safely load modules in PBS job files when using recent MPI modules, both in the legacy and new schemes. Previously, this was not recommended.
: Recent builds of OpenMPI (1.4 and 1.10) and Intel MPI now have support compiled in to properly start processes on remote nodes.
* However, best practice is still to load all modules in dotfiles under your home directory.
: This will always give you the same applications on both login and compute nodes. Place module commands in job files only when conflicts arise, such as when two of your regularly-used applications require different MPI flavors.


# Move all your previous module commands from <code>.bashrc</code> to <code>~/.modules-1</code>, where they will apply only on CentOS-5.
==== Job routing by operating system ====
# Place all your module selections for CentOS-6 in <code>~/.modules-2</code>. Get started using the contents of the .*-1 file.
* TORQUE/PBS jobs that are ''submitted'' from a node running CentOS-5 or CentOS-6 will normally be routed to run only on nodes that run the ''same'' OS release.
-->
* Find the eligible OS in the <code>qstat -f</code> output:
* Test – same as in [[#Configure in dedicated files|previous section]].
$ '''qstat -f ''jobnumber'' | grep opsys'''
    Resource_List.opsys = el5
* You may override the automatic selection prior to submission by adding an <code>opsys</code> job resource:
#PBS -l '''opsys=el5'''
or:
#PBS -l '''opsys=el6'''
* In a pinch, you may even change the OS request of a queued job by using the <code>qalter</code> command, e.g.:
qalter -l '''opsys=el6''' ''jobnumber''


<!--
=== Using multiple MPI flavors ===
== Dependent modules to be user-loaded ==
<!-- Flavors can coexist if commands are called by full path name, but this is bad practice. -->
New-style modules are less implicit (less automatic and less rigid) in loading modules that they depend on.
* Different MPI flavors can, with caution, be loaded at the same time. This may be necessary because the system is less homogeneous than in the past and no longer uses a single "one true" MPI implementation.
* When modules of multiple MPI flavors are loaded, call the appropriate MPI commands by a ''full path'' specified via the <code>''MODULENAME''_HOME</code> environment variable that is set (by ''Carbon'' convention) in each module.
'''''Example:''''' In a job file that is to run 2 applications that were compiled with different MPI flavors, write:
<source lang="bash">
$OPENMPI_HOME/bin/mpirun app1_name
$IMPI_HOME/bin/mpirun app2_name
</source>


* Less automatic means that prior to loading a more advanced module you must load all its prerequisites, chosen from the same MPI and (usually) compiler flavor as the advanced module. A missing prerequisite will give errors of the form
== Minor changes for the module command ==
… ERROR:151: '''Module''' 'troubled_name' '''depends on one of the module(s)''' 'other_name1 other_name2' …
=== Determining default module versions ===
To resolve this error, edit your <code>~/.bashrc</code> or <code>.modules-1</code> file and add <code>module load …</code> commands for the needed module(s) ''other_names'' before loading "troubled_name".
To determine which module will be loaded when an abbreviated name is used, inspect the first relevant line in the output of one of these commands:
* Less rigid means that a module does not load a ''specific'' version of a prerequisite, which gives you, the user, more flexibility in combining modules.
module show ''name''
-->
module help ''name''


== Minor changes for the ''module'' command ==
The reason is twofold:
* The <code>module avail</code> command under CentOS-6 no longer issues the marker <code>"(default)"</code> when set for a particular module (which is done administratively using a <code>.version</code> file). I am not sure if this is a bug or by design, but the change makes the output more consistent.
* On the older CentOS-5 system the <code>module</code> command honors <code>.version</code> files ''only for the last component'' of the module. This may lead to different module versions being selected on different systems even when the list of available modules is identical. (Side note: This is a possibly fortuitous bug since openmpi-1.4, used on CentOS-5, sorts after openmpi-1.10.)


=== Name completion on command line ===
When working interactively in a terminal, you can use the "Tab completion" feature of the Bash shell to complete a partially typed module name and show all names available for the name typed so far.

The feature works as follows. At a shell prompt (shown as "$"), type:
  $ '''module load fft'''
Press the <code><TAB></code> key and the name will be expanded to <code>fftw3/3.3/</code>, and you'll see all possible completing names, with the cursor waiting at the end of the longest common prefix:
  $ '''module load fftw3/3.3/'''_
  fftw3/3.3/impi-5/intel-16/3.3.4-10        fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
  fftw3/3.3/intel/3.3.2-1                  fftw3/3.3/openmpi-1.4/intel/3.3.2-4
Type the letter <code>o</code> and hit the <code><TAB></code> key again. The choices will be narrowed down to OpenMPI:
  $ '''module load fftw3/3.3/openmpi-1.'''<TAB>
  fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11  fftw3/3.3/openmpi-1.4/intel/3.3.2-4
Typing the digit <code>1</code> will pick the <code>1.'''1'''0</code> version, at which point the single remaining module name will be completed in full, with the cursor waiting after an additional space character:
  $ '''module load fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11''' _


=== "module purge" command ===
 
Previously, the Carbon queueing system was made available to users by modules,
= Changes from previous scheme (2008) =
which meant that it was difficult for a user to start with a clean slate, or to make and adjust a custom set of module choices.
== Introduction ==
You may now safely use the module "purge" command for its intended purpose:
On Carbon, the environment modules system has changed in the following aspects,
explained further in this document:
* The naming scheme is [[#Details|more developed and more versatile]].
* The system [[#No modules are pre-loaded|does not preload]] compiler and MPI modules - you must specify all modules yourself.
* [[#No recursive loading|Dependent modules]] are no longer being loaded.
<!-- * modules do not implicitly load other modules they depend on, -->
* The [[#Minor changes for the module command|<code>module</code> command]] behaves in slightly different ways.
 
== Motivation ==
The changes were necessary because of increasing diversity and dependencies of applications, libraries,
and the underlying operating system.
The goal was to accommodate different compilers, MPI flavors, and (in the future)
different aspects of the machine architecture like CPU generation, capabilities, and coprocessor facilities.
 
For different releases of the operating system, the new scheme allows ''existing'' application versions to continue being offered
where possible, and ''new'' application versions to be made available where suitable,
either on both old and newer OS releases, or only on one.
<!-- installation directories and (b) module names to enable addressing and chosing versions by differing …
The naming scheme and usage conventions for environment modules on Carbon are as follows.
-->
 
== Nomenclature ==
Where the previous scheme used a relatively simple name form:
<font color="#888">''name/version-build''</font>
the new scheme includes additional name components like <code>''api''</code>, <code>''mpi''</code>, and <code>''compiler''</code>.
 
== Extent of module catalog ==
* The legacy naming scheme is being retired, along with some of its attendant conventions.
* Newer applications will primarily be compiled and installed on the newer OS release and in the new naming scheme. Some applications may turn out to be backwards-compatible with a previous OS release, and will be made available there as well, in the new scheme, to appropriately offer applications that run on ''both'' or only a ''specific'' release of the operating system.
* Only a subset of modules from the legacy scheme has been carried over into the new scheme, typically the modules representing the most recent version of an application.
 
== Name changes for most modules ==
For most modules the leading name component (the part before any <code>/</code>) will be the same in the previous and new schemes.
What will always differ are the name parts after the first slash, which is relevant if you deliberately (and hopefully with good reason) chose a specific version.
 
; Example:
Here are the names for the FFTW3 library module in the legacy and new naming schemes,
as queried by the <code>module avail</code> shell command:
{| class="wikitable" <!-- style="width: 50%;" -->
! style="width: 50%;" | Current scheme
! style="width: 50%;" | Legacy scheme
|-
| style="vertical-align:top;" |
$ '''module -t avail fftw3'''
<span style="color:#888;">/opt/apps/M/x86_64/EL6:</span>
<span style="color:#999;">/opt/apps/M/x86_64/EL:</span>
fftw3/3.3/'''''impi-5'''''/'''''intel-16'''''/3.3.4-10 # uses Intel-MPI
fftw3/3.3/'''''intel'''''/3.3.2-1 # older serial version
fftw3/3.3/'''''openmpi-1.10'''''/'''''intel-16'''''/3.3.4-11 # uses OpenMPI
fftw3/3.3/'''''openmpi-1.4'''''/'''''intel'''''/3.3.2-4 # older MPI version
<span style="color:#777;">/usr/share/Modules/modulefiles:</span>
<span style="color:#777;">/etc/modulefiles:</span>
|
$ '''module -t avail fftw3'''
<span style="color:#888;">/opt/soft/modulefiles:</span>
fftw3/3.2.1-1
fftw3/3.2.2-1
fftw3/3.3-1
fftw3/3.3.2-1
fftw3/3.3.2-4
<span style="color:#777;">/opt/intel/modulefiles:</span>
<span style="color:#777;">/usr/share/Modules/modulefiles:</span>
<span style="color:#777;">/etc/modulefiles:</span>
|-
|}
:* Note the MPI flavor and the compiler name components compared to the legacy naming scheme. ('''Bold''' is used here for illustration only; your output will appear entirely as regular text.)
:* The <code>-t</code> option of <code>module avail</code> shows the output in "terse" form, one entry per line.
:* Lines ending in <code>:</code> indicate file system directories in which modules are being located on the current node. <!-- The means by which applications on different OS releases are accommodated is by tailoring the set of module search directories offered to users on a given node. (This is done at the system level through <code>module use ''dirname''</code> statements.) -->
 
== Name change exceptions ==
The names of the following modules changed to make them more consistent:
<source lang="bash">
<source lang="bash">
module purge
legacy scheme new scheme
</source>
-------------------------------------
followed by <code>module load …</code> to choose compilers, MPI flavors, and applications.
asap3 asap/3.x
 
ase2 ase/2 - deprecated
ase3 ase/3 - not needed as separate module, instead, is installed within each of the new "python-env" modules


<!--
g09 gaussian/09
Load the default module set (which varies based on OS release and user choice) as follows:
GaussView gaussview  (lowercase)
<source lang="bash">
module purge
module load defaults/user
</source>


The Intel compilers and OpenMPI continue to be pre-loaded for you.
python python-env - Several suites of Python environments, each with many packages
In other words, without any module customization, you'd see from the <code>module list</code> command:
python.org - The interpreter only, from the main Python web site.
<source lang="bash">
module list
</source>
</source>
Currently Loaded Modulefiles:
<!--
  1) intel/16/16.0.1-2                  2) openmpi/1.10/intel-16/1.10.2-1  3) defaults/user/2/2.0
fftw3 fftw/3.3
vasp vasp/4
vasp5 vasp/5
: Note: Licensees of Vasp-4 only ''must'' specify vasp'''/4'''. The default module for "vasp" is under vasp/5.
-->
-->
: Note that the modules <code>fftw3</code> and <code>vasp5</code> did ''not change name'', given widespread entrenched use of these names in the packages themselves, as Unix group names, and even in Makefiles of other packages.


=== Determining default module versions ===
== Explicit module selections required ==
The <code>module avail</code> command under CentOS-6 no longer includes the marker <code>"(default)"</code> when one has been set in a <code>.version</code> file.
; Compiler and MPI modules are no longer pre-loaded.:
I am not sure if this is a bug or by design, but the change certainly makes the output more consistent.
: Previously, the Intel compilers, the Intel Math Kernel Library, and OpenMPI were loaded even when there were no <code>module load</code> commands in your dot-files.
: Under the new modules scheme, you must yourself load all desired modules in your shell setup files or in job files, in suitable order. Modules not depending on others must be loaded first.
: This is born of necessity because still useful older applications were compiled with older MPI flavors and versions (typically OpenMPI-1.4) which partially interfere with newer flavors (OpenMPI-1.8, 1.10, or Intel-MPI-5.x). In particular, each MPI flavor provides commands like <code>mpirun</code> and <code>mpifort</code>, and special care is needed to run the correct one if your chosen module set spans different MPI flavors.
: While loading all desired modules explicitly may be a minor burden for you at first, your selections should become easier to understand now and easier to adapt later.
 
; No recursive loading:
: A module under the new scheme does not implicitly load other modules that it might depend on, such as modules for compilers, an MPI flavor, or specialized libraries.
: Previously, some popular modules did load their dependencies implicitly, but with the system maturing and diversifying, this can too easily have unexpected consequences.


To determine which module will be loaded when an abbreviated name is used, I recommend inspecting the first relevant line in the output of one of these commands:
<!-- See [[HPC/Modules Best Practices#Load dependent modules first]]. -->
module show ''name''
module help ''name''

Latest revision as of 22:29, November 30, 2016

Properties

(*) Environment modules are the means by which software applications are made available to users. Using modules allows users and administrators to pick, by user or even by compute job, a desired or default version of an application from typically several current and historic versions persisting on the system. Older versions are kept to improve reproducibility of results, an important characteristic of scientific computing.

Naming scheme (Nomenclature)

Overview

The full name of a module has two or more components separated by slashes /, in one of the following forms:

name/api/version-build			# binary packages, compilers
name/api/compiler/version-build		# compiled applications
name/api/mpi/compiler/version-build	# compiled applications that use MPI
  • The first component is the application's main name, usually as chosen by its author.
  • The last component is the version number, also usually chosen by the author, followed by a build identifier chosen on Carbon.
  • Other name components may be present and indicate which set of major tools were used to produce the application locally, which usually implies which modules are required to be loaded to run the application.

Details

  • name is the package's name as chosen on Carbon. It is the name given by the software's author, lowercased for consistency. It may contain numbers if they are customarily part of the name, fftw3 being a prime example.
  • api is the leading part or parts of the package's version number which indicates to suitable precision the API level across which different package versions are expected to be compatible (interchangeable in terms of features). The api component typically has one of these forms:
    major
    major.minor
The specificity is subject to an administrator's intuition and understanding of the intentions of the package's author, and may well turn out to be incorrect in the future after unexpected turns in a package's development process (caveat emptor).
  • compiler is a name component that is present when an application was compiled here and thus usually needs runtime libraries associated with the compiler used. The compiler name component is not strictly needed for applications that are statically linked, or come with all their own libraries, but can be present even then for informative purposes. The name component is typically absent for applications installed as binaries, notably commercial applications and, naturally, compilers themselves. The name component typically has sub-components of the form:
    compilerNAME-compilerAPI
  • mpi, present when needed, denotes the MPI flavor in use for parallel computations. This name component typically also has sub-components of the form:
    mpiNAME-mpiAPI
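
As a rough illustration of the three name forms, the following sketch splits a full module name at its slashes and labels the components. The function name describe_module is hypothetical and not part of the module command; the sketch assumes the name strictly follows one of the forms listed in the overview.

```shell
# describe_module FULLNAME
# Hypothetical helper: label the components of a full module name.
# Assumes one of the three documented forms (3, 4, or 5 components).
describe_module() {
    old_ifs=$IFS
    IFS=/
    set -- $1          # split the name at each slash
    IFS=$old_ifs
    case $# in
        3) echo "name=$1 api=$2 version=$3" ;;
        4) echo "name=$1 api=$2 compiler=$3 version=$4" ;;
        5) echo "name=$1 api=$2 mpi=$3 compiler=$4 version=$5" ;;
        *) echo "unrecognized name form" >&2; return 1 ;;
    esac
}

describe_module fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
# prints: name=fftw3 api=3.3 mpi=openmpi-1.10 compiler=intel-16 version=3.3.4-11
```

Abbreviated names with fewer than three components are reported as unrecognized by this sketch, since they do not identify a single module.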

Module loading

No pre-loading

No compiler or MPI modules are pre-loaded by the system. The system only loads the "meta" module profile/user for you, which merely looks for and loads your own .modules-* files.

No recursive loading

A module in general does not load other modules that it might depend on, such as modules for compilers, an MPI flavor, or specialized libraries.

System-specific command files

For a time, nodes with different operating systems and therefore more or less different module catalogs will coexist in the cluster. Since you will always have the same home directory on each node, most of your script files would have to be written so they run on either operating system. This would mean having to code if statements in your scripts, which can be difficult and fragile to keep up. To simplify conditional module selection, each node on Carbon now looks for specific file names in your home directory. Depending on which files exist, a node:

  1. determines whether to use the legacy or new naming scheme, and then
  2. loads the file that is appropriate for the operating system running on that node, in the scheme determined.

Specifically:

  • The mere existence of files of the form ~/.modules-elx, with x = 5,6,..., activates the new naming scheme. Subsequently, the file will be loaded and interpreted (in the Tcl language) on the corresponding OS.
  • On CentOS-5 nodes, a file ~/.modules-el5-legacy will trigger the legacy scheme, but only if ~/.modules-el5 is not present. In other words, a ~/.modules-el5 file has priority and causes ~/.modules-el5-legacy to be ignored.
  • Without any ~/.modules-* files, CentOS-5 nodes will use the legacy scheme; CentOS-6 nodes always use the new scheme.
  • Your .bashrc file will always be read, expecting the naming scheme that was determined by the presence or absence of .modules-* files.
  • When running a PBS job, module commands in the job file will be read, in the same naming scheme as .bashrc.
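
The selection rules above can be summarized in a short sketch. This is only my reading of the rules, not the code that Carbon actually runs: the function name scheme_for_node and the tags el5 and el6 passed as arguments are hypothetical, and the sketch assumes that on a CentOS-5 node only the presence of .modules-el5 activates the new scheme.

```shell
# scheme_for_node OSTAG [HOMEDIR]
# Hypothetical helper: print "new" or "legacy" for a node of the given OS,
# based on which .modules-* files exist in HOMEDIR (default: $HOME).
scheme_for_node() {
    sfn_os=$1
    sfn_home=${2:-$HOME}
    case $sfn_os in
        el5)
            if [ -e "$sfn_home/.modules-el5" ]; then
                echo new       # .modules-el5 has priority over the -legacy file
            else
                echo legacy    # with or without .modules-el5-legacy
            fi
            ;;
        *)
            echo new           # CentOS-6 nodes always use the new scheme
            ;;
    esac
}

scheme_for_node el5
scheme_for_node el6
```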
Tips
  • To load the same modules on any node, in the new scheme:
    • Place your configuration in ~/.modules-el6, then create a symbolic link:
    cd; ln -s .modules-el6 .modules-el5
    • It is possible but not recommended (because less future-proof) to keep your module selection in ~/.bashrc, and activate the new scheme on CentOS-5 nodes by simply creating an empty configuration: touch ~/.modules-el5

Workflow to determine module names to load

(***) Caution: Module names are typically sorted as text strings the same way as Unix ls(1) does it. The resulting order may be counter-intuitive when the same part of several version numbers has a different number of digits. In the following example, version 8.16.x is the most recent release and needed to be administrator-designated as default because character for character, the string 8.7.x would be sorted highest.
$ module avail lumerical
...
lumerical/7.5.7-1
lumerical/8.0.5-1
lumerical/8.15.736-1
lumerical/8.16.931-1(default)
lumerical/8.5.4-1
lumerical/8.7.1-1
...
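
To see the sorting effect for yourself, the following sketch compares plain text ordering with version-aware ordering for the names listed above. The function names are hypothetical, and the -V option assumes GNU sort; neither is part of the module command, which applies its own ordering.

```shell
# Hypothetical helpers: read module names on stdin, print the highest-sorting one.
highest_by_text()    { LC_ALL=C sort | tail -n 1; }   # character-by-character order
highest_by_version() { sort -V | tail -n 1; }          # GNU sort, version-aware order

names='lumerical/7.5.7-1
lumerical/8.0.5-1
lumerical/8.15.736-1
lumerical/8.16.931-1
lumerical/8.5.4-1
lumerical/8.7.1-1'

printf '%s\n' "$names" | highest_by_text      # prints: lumerical/8.7.1-1
printf '%s\n' "$names" | highest_by_version   # prints: lumerical/8.16.931-1
```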

To determine a suitable module name for a desired package:

  1. On a Carbon command line, list the available flavors and versions, keeping in mind that some older modules were not migrated:
    module avail
    module avail packagename
    (When upgrading from the previous naming scheme, remove version numbers from names of the form packagename/version, leaving only packagename.)
  2. Use the module show command to inspect details of a module, particularly its full name:
    module show name/api
    The command will use the name you gave to determine a suitable subset of all available modules, pick either a designated default or the highest-sorting version(***) from that subset, and finally show the details for that single version only.
  3. Complete the desired package's name by appending API, MPI, and/or compiler specifications as needed, and repeat the previous step.
  4. Determine a package's dependencies.
    Inspect the output of module show …, look for any prereq statements, and load those on a previous line.

Give a full version only if you require a build with a specific feature or behavior (such as to reproduce prior results with numeric consistency). To do so:

5. Append version and build specifications, as shown by the module avail packagename command.
Example
  • Let's load a vasp5 module that uses the Intel-MPI flavor ("impi"):
$ module avail vasp5
------------------------------------------------------------ /opt/apps/M/x86_64/EL ------------------------------------------------------------
vasp5/5.3/openmpi-1.4/intel/5.3.2-mkl-beef-1    vasp5/5.4/impi-5/intel-16/5.4.1.3-6
vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-3       vasp5/5.4/openmpi-1.10/intel-16/5.4.1.3-6
vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-cellz-1
  • Let's see which version would be loaded using an abbreviated name:
$ module show vasp5/5.4
-------------------------------------------------------------------
/opt/apps/M/x86_64/EL/vasp5/5.4/openmpi-1.10/intel-16/5.4.1.3-6: 
…
  • Careful: The first output line shows the full file name of the module that would get loaded by the short name. In this case, the abbreviated module name, having no MPI name component, yields a module that uses a different MPI flavor than you want.
  • You will need to be more explicit:
$ module show vasp5/5.4/impi-5
-------------------------------------------------------------------
/opt/apps/M/x86_64/EL/vasp5/5.4/impi-5/intel-16/5.4.1.3-6:

module-whatis	 VASP - Vienna Ab-initio Simulation Package 
conflict	 vasp 
conflict	 vasp-vtst 
prereq	 intel/16
prereq	 impi/5
setenv		 VASP5_HOME /opt/apps/vasp5/5.4.1.3-6-impi-5-intel-16 
prepend-path	 PATH /opt/apps/vasp5/5.4.1.3-6-impi-5-intel-16/bin 
setenv		 VASP_COMMAND vasp-ase 
setenv		 VASP_PP_PATH /opt/soft/vasp-pot/ase 
-------------------------------------------------------------------
  • Therefore, you'd need to add the following lines to your .modules-el6 file:
module load intel/16
module load impi/5
module load vasp5/5.4/impi-5
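
The step from prereq statements to load commands can be sketched as a small filter. The function name prereqs_to_loads is hypothetical. Note the assumption: when a prereq line names several modules, they are alternatives, and this sketch simply picks the first one.

```shell
# prereqs_to_loads
# Hypothetical helper: read "module show" output on stdin and print one
# "module load" line per prereq statement (first alternative only).
prereqs_to_loads() {
    awk '$1 == "prereq" { print "module load " $2 }'
}

# Sample input copied from the example output above:
printf 'conflict\t vasp\nprereq\t intel/16\nprereq\t impi/5\n' | prereqs_to_loads
# prints:
#   module load intel/16
#   module load impi/5
```

On Carbon, the filter would be fed from the real command, as in module show vasp5/5.4/impi-5 2>&1 | prereqs_to_loads. Review the result before adding it to your configuration file.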

Best Practices

Use migration utility

Use a helper utility to get started on splitting off and diversifying your existing module selection from .bashrc into .modules-* files:

modules-migrate
The utility will manage the following files:
.bashrc
.modules-el5-legacy
.modules-el6
See an example output of running the utility.
  • Please note that the utility is fairly basic and cannot transform or choose versions and their dependencies as described here.
  • The utility will give you the opportunity to inspect and edit the resulting files. Use a text editor of your choice, such as nano or vi to re-examine or edit these files further.
  • To switch to the new scheme on all nodes, create or copy from .modules-el6:
.modules-el5

Omit detailed versions from module names

When constructing a module load command, try to omit detailed version and build numbers from the end, i.e., load a module that has the full name foo/m.n/compiler/version-build by an abbreviated name foo/m.n/compiler.

Module names that are abbreviated in this manner will be completed at the time of loading to select a default, which is either a version designated as such by an administrator or simply the version with the highest version number. In any case, with abbreviated module names you will benefit from newer modules that have been installed since you last looked. Version numbers are generally chosen by package authors so that packages with the same major version number are binary-compatible.

For instance, instead of:

module load intel/16/16.0.0-3
module load openmpi/1.10/intel-16/1.10.0-4

Write:

module load intel/16
module load openmpi/1.10/intel-16

It is preferable to supply the compiler name part of MPI modules (here …/intel-16) because they usually both (a) need compiler libraries and (b) implicitly use their native compiler for further compilations.
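
If you have a list of full module names, for instance from an older configuration, the abbreviation described here amounts to dropping the last name component. A minimal sketch, with a hypothetical function name:

```shell
# abbreviate_module FULLNAME
# Hypothetical helper: strip the trailing version-build component.
abbreviate_module() {
    printf '%s\n' "${1%/*}"
}

abbreviate_module intel/16/16.0.0-3                # prints: intel/16
abbreviate_module openmpi/1.10/intel-16/1.10.0-4   # prints: openmpi/1.10/intel-16
```

The sketch does no checking; use module show with the abbreviated name to confirm which module it actually selects.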

Modules load order

To meet module dependencies, edit your configuration or job files to load required modules first, in this order:

  1. compilers
  2. MPI flavor
  3. other libraries that are dynamically loaded.
  4. your desired application(s).

Understanding dependency errors

Learn to recognize error messages from module load when a required module has not been loaded:

Example: A typical error message will look like:

$ module load openmpi/1.10
openmpi/1.10/intel-16/1.10.2-1(27):ERROR:151: Module 'openmpi/1.10/intel-16/1.10.2-1' depends on one of the module(s) 'intel/16/16.0.2 intel/16/16.0.1-2 intel/16/16.0.0-3 intel/16/16.0.0-1 intel/16/16.0.0-0'
openmpi/1.10/intel-16/1.10.2-1(27):ERROR:102: Tcl command execution failed: prereq intel/16
Colors do not appear in the original terminal output but were added here for clarity:
  • The missing prerequisite is the red item on the last line.
  • The modules that would currently satisfy the requirement are shown on the preceding line, indicated here in blue.
  • The full name of the "offending module", deduced from a possibly abbreviated name on the command line, appears in brown.
  • You can inspect the prerequisites of a module in a more succinct manner:
$ module show openmpi/1.10 2>&1 | grep req
prereq	 intel/16
The sequence 2>&1 is necessary so the pipe | captures the entire output of the module show command, i.e., combining its stdout and stderr streams.
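
When the error has already occurred, the missing prerequisite can also be pulled out of the message itself. The following sketch assumes the message format shown above; the function name missing_prereq is hypothetical.

```shell
# missing_prereq
# Hypothetical helper: read module error output on stdin and print the
# prerequisite named in a "Tcl command execution failed: prereq ..." line.
missing_prereq() {
    sed -n 's/.*Tcl command execution failed: prereq[[:space:]]*//p'
}

msg="openmpi/1.10/intel-16/1.10.2-1(27):ERROR:102: Tcl command execution failed: prereq intel/16"
printf '%s\n' "$msg" | missing_prereq   # prints: intel/16
```

On Carbon this would be used as module load openmpi/1.10 2>&1 | missing_prereq, again combining stderr into the pipe.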

"module purge" command

Previously on Carbon it was difficult to reset the module selection during an interactive terminal session, because the commands for the job queueing system, like qsub, were provided via a module. You may now safely use the module "purge" command for its intended purpose, as

module purge

followed by module load … to choose compilers, MPI flavors, and applications.

Expert Tip – Purge and reload

To re-load the customizations from your .modules-* files using the module profile:

module purge
module load profile
Note: This does not reload any modules seen in the .bashrc file.

Test your module choices

Automated test

Use the test built into the migration utility:

modules-migrate -t
modules-migrate --test

This will simulate loading your existing .modules-* files under the available operating systems. Review the output. To correct any errors, edit the respective files manually or use the migration utility again:

modules-migrate -e
modules-migrate --edit

Manual test

Same node EL5 node EL6 node
bash -l ssh clogin5 ssh clogin8
module list
exit

To test your new module configuration in your actual environment:

  1. Open another login shell on the current or another node.
  2. Review error messages that might appear before your prompt.
  3. Inspect which modules are loaded.
  4. Edit your .module-* files and address any errors.
  5. Close the test shell and repeat until your desired modules are loaded without errors.
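Step 3 above can be partly automated. The following is a minimal sketch, not a site-provided tool: the function name missing_modules is hypothetical, and it assumes that loaded module names appear as whitespace-separated words in the output of module list.

```shell
# Hypothetical helper: read the output of "module list" on stdin and print
# every expected module (given as arguments) that is NOT among the loaded ones.
missing_modules() {
    # Put every whitespace-separated word on its own line.
    loaded=$(tr -s ' \t' '\n\n')
    for want in "$@"; do
        # A loaded module matches if it equals the wanted name or
        # continues it with "/" (e.g. intel/16 matches intel/16/16.0.2).
        printf '%s\n' "$loaded" | awk -v w="$want" '
            $0 == w || index($0, w "/") == 1 { found = 1 }
            END { exit !found }' || printf '%s\n' "$want"
    done
}

# Usage in the test shell (module list writes to stderr, hence 2>&1):
#   module list 2>&1 | missing_modules intel/16 openmpi/1.10 vasp5
```

No output means that all expected modules are loaded. The comparison is literal, so the dots in version numbers are not treated as wildcards.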

Test in a job file

Your module selection is likely most important in a PBS job file. To avoid the hassle of extended wait times for production jobs, use test jobs with a short walltime limit and place just diagnostic commands in the job script.

Use the shell commands "module list" and "type" to verify that all your modules are loaded and that an application can be called without its full path.

Example

Consider the following job file modtest.sh:

#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=0:1:00
#PBS -N modtest

module list

type vasp_gam

Submit the job:

qsub modtest.sh

Alternatively, do the whole thing on the command line, without the need for a separate file:

echo "module list; type vasp_gam" | qsub -l nodes=1:ppn=1,walltime=0:1:00 -N modtest

In either case, wait for the job to finish, then inspect the output files:

qstat jobnumber
…
head modtest.[eo0-9]*

You should see something like:

vasp_gam is /opt/apps/vasp5/5.4.1.3-6-openmpi-1.10-intel-16/bin/vasp_gam

An error looks like:

-bash: type: vasp_gam: not found
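When several executables need checking, the diagnostic part of the job script can be turned into a loop. The sketch below is an illustration only: the function name check_commands and the OK/MISSING labels are invented here, and the command names in the usage line are placeholders for your own applications.

```shell
# Hypothetical helper for a diagnostic job script: report for each command
# whether the shell can find it, and return non-zero if any is missing.
check_commands() {
    status=0
    for cmd in "$@"; do
        if path=$(command -v "$cmd"); then
            echo "OK       $cmd -> $path"
        else
            echo "MISSING  $cmd"
            status=1
        fi
    done
    return $status
}

# Usage inside the job file, after "module list":
#   check_commands vasp_gam mpirun
```

The non-zero return value makes it possible to abort the script early, before a production run would start with an incomplete environment.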

Effect on PBS job submissions

Loading modules in job files

  • You may now safely load modules in PBS job files when using recent MPI modules, both in the legacy and new schemes. Previously, this was not recommended.
Recent builds of OpenMPI (1.4 and 1.10) and Intel MPI now have support compiled in to properly start processes on remote nodes.
  • However, best practice is still to load all modules in dotfiles under your home directory.
This will always give you the same applications on both login and compute nodes. Place module commands in job files only when conflicts arise, such as when two of your regularly-used applications require different MPI flavors.

Job routing by operating system

  • TORQUE/PBS jobs that are submitted from a node running CentOS-5 or CentOS-6 will normally be routed to run only on nodes that run the same OS release.
  • Find the eligible OS in the qstat -f output:
$ qstat -f jobnumber | grep opsys
   Resource_List.opsys = el5
  • You may override the automatic selection prior to submission by adding an opsys job resource:
#PBS -l opsys=el5

or:

#PBS -l opsys=el6
  • In a pinch, you may even change the OS request of a queued job by using the qalter command, e.g.:
qalter -l opsys=el6 jobnumber
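For scripted use, the OS request can be extracted from the qstat -f output instead of being read by eye. This is a sketch under the assumption that the line has the form shown above; the function name job_opsys is hypothetical.

```shell
# Hypothetical helper: read "qstat -f" output on stdin and print the value
# of the job's opsys resource request (e.g. el5 or el6), if present.
job_opsys() {
    awk -F' = ' '$1 ~ /Resource_List\.opsys$/ { print $2 }'
}

# Usage: move a queued job to EL6 only if it does not already request it.
#   [ "$(qstat -f jobnumber | job_opsys)" = el6 ] || qalter -l opsys=el6 jobnumber
```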

Using multiple MPI flavors

  • Different MPI flavors can, with caution, be loaded at the same time. This may be necessary because the system is less homogeneous than in the past and no longer uses a single "one true" MPI implementation.
  • When modules of multiple MPI flavors are loaded, call the appropriate MPI commands by a full path specified via the MODULENAME_HOME environment variable that is set (by Carbon convention) in each module.

Example: In a job file that is to run 2 applications that were compiled with different MPI flavors, write:

$OPENMPI_HOME/bin/mpirun app1_name
$IMPI_HOME/bin/mpirun app2_name
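If one of the MPI modules is not loaded, its MODULENAME_HOME variable is empty and the command above would silently resolve to /bin/mpirun. A guard along the following lines can catch that; it is a sketch only, and the function name mpirun_path is made up for illustration.

```shell
# Hypothetical helper: print the full path of mpirun below the directory
# named by the given environment variable, or fail if that variable is unset.
mpirun_path() {
    var=$1
    # Look up the variable whose name is stored in $var (POSIX-compatible).
    eval "prefix=\${$var:-}"
    if [ -z "$prefix" ]; then
        echo "error: \$$var is not set; is the MPI module loaded?" >&2
        return 1
    fi
    printf '%s\n' "$prefix/bin/mpirun"
}

# Usage in a job file:
#   "$(mpirun_path OPENMPI_HOME)" app1_name
#   "$(mpirun_path IMPI_HOME)"    app2_name
```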

Minor changes for the module command

Determining default module versions

To determine which module will be loaded when an abbreviated name is used, inspect the first relevant line in the output of one of these commands:

module show name
module help name

The reason is twofold:

  • The module avail command under CentOS-6 no longer issues the marker "(default)" when set for a particular module (which is done administratively using a .version file). I am not sure if this is a bug or by design, but the change makes the output more consistent.
  • On the older CentOS-5 system the module command honors .version files only for the last component of the module. This may lead to different module versions being selected on different systems even when the list of available modules is identical. (Side note: This is a possibly fortuitous bug since openmpi-1.4, used on CentOS-5, sorts after openmpi-1.10.)
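The lookup can be scripted as a rough sketch. It assumes that the first relevant line of the module show output is the full path of the selected modulefile followed by a colon; verify this against the output on your node before relying on it. The function name resolved_modulefile is hypothetical.

```shell
# Hypothetical helper: read "module show" output on stdin and print the
# first line that looks like an absolute path ending in ":", minus the colon.
# ASSUMPTION: that line names the modulefile that would be loaded.
resolved_modulefile() {
    awk '/^\/.*:$/ { sub(/:$/, ""); print; exit }'
}

# Usage (module show writes to stderr, hence 2>&1):
#   module show fftw3 2>&1 | resolved_modulefile
```

Running the same command on a CentOS-5 and a CentOS-6 node shows whether an abbreviated name resolves to different versions on the two systems.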

Name completion on command line

When working interactively in a terminal, you can use the "Tab completion" feature of the Bash shell to complete a partially typed module name and show all names available for the name typed so far.

The feature works as follows. At a shell prompt (shown as "$"), type:

$ module load fft

Press the <TAB> key and the name will be expanded to fftw3/3.3/, and you'll see all possible completing names, with the cursor waiting at the end of the longest common substring:

$ module load fftw3/3.3/_
fftw3/3.3/impi-5/intel-16/3.3.4-10        fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
fftw3/3.3/intel/3.3.2-1                   fftw3/3.3/openmpi-1.4/intel/3.3.2-4

Type the letter o, hit the <TAB> key again. The choices will be narrowed down to OpenMPI.

$ module load fftw3/3.3/openmpi-1.<TAB>
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11  fftw3/3.3/openmpi-1.4/intel/3.3.2-4

Typing the digit 1 and pressing <TAB> once more picks the 1.10 version. The single remaining module name is then completed in full, with the cursor waiting after an additional space character:

$ module load fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11 _


Changes from previous scheme (2008)

Introduction

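On Carbon, the environment modules system has changed in the following aspects, explained further in this document:

  • the naming scheme,
  • default and dependent modules no longer being loaded,
  • minor associated command behaviors.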

Motivation

The changes were necessary because of increasing diversity and dependencies of applications, libraries, and the underlying operating system. The goal was to accommodate different compilers, MPI flavors, and (in the future) different aspects of the machine architecture like CPU generation, capabilities, and coprocessor facilities.

Across different releases of the operating system, the new scheme allows existing application versions to remain available where possible, and new application versions to be offered where suitable, either on both the old and the newer OS release or on only one of them.

Nomenclature

Where the previous scheme used a relatively simple name form:

name/version-build

the new scheme includes additional name components like api, mpi, and compiler.

Extent of module catalog

  • The legacy naming scheme is being retired, along with some of its attendant conventions.
  • Newer applications will primarily be compiled and installed on the newer OS release and in the new naming scheme. Applications that turn out to be backwards-compatible with a previous OS release will be made available there as well, also in the new scheme. The scheme can thus offer applications that run on both releases of the operating system or on only a specific one.
  • Only a subset of modules from the legacy scheme has been carried over into the new scheme, typically the modules representing the most recent version of an application.

Name changes for most modules

For most modules the leading name component (the part before any /) will be the same in the previous and new schemes. What will always differ are the name parts after the first slash, which is relevant if you deliberately (and hopefully with good reason) chose a specific version.

Example

Here are the names for the FFTW3 library module in the legacy and new naming schemes, as queried by the module avail shell command:

Current scheme:

$ module -t avail fftw3
/opt/apps/M/x86_64/EL6:
/opt/apps/M/x86_64/EL:
fftw3/3.3/impi-5/intel-16/3.3.4-10	 # uses Intel-MPI
fftw3/3.3/intel/3.3.2-1			 # older serial version
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11 # uses OpenMPI
fftw3/3.3/openmpi-1.4/intel/3.3.2-4	 # older MPI version
/usr/share/Modules/modulefiles:
/etc/modulefiles:

Legacy scheme:

$ module -t avail fftw3
/opt/soft/modulefiles:
fftw3/3.2.1-1
fftw3/3.2.2-1
fftw3/3.3-1
fftw3/3.3.2-1
fftw3/3.3.2-4
/opt/intel/modulefiles:
/usr/share/Modules/modulefiles:
/etc/modulefiles:

  • Note the MPI flavor and compiler name components, which the legacy naming scheme lacks (bold is used here for illustration only; your output will appear entirely as regular text).
  • The -t option of module avail shows the output in "terse" form, one entry per line.
  • Lines ending in : indicate file system directories in which modules are being located on the current node.
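In scripts it can be handy to split a module name into its leading component, which is mostly unchanged between the schemes, and its trailing component, which in the examples above carries the version and build number. The following sketch uses plain shell parameter expansion; the function names are made up for illustration.

```shell
# Hypothetical helpers: split a module name at its slashes.
# module_base prints the part before the first "/",
# module_version prints the part after the last "/".
module_base()    { printf '%s\n' "${1%%/*}"; }
module_version() { printf '%s\n' "${1##*/}"; }

# Usage:
#   module_base    fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
#   module_version fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
```

Both helpers work on legacy names as well, since those consist of just the two components name/version-build.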

Name change exceptions

The names of the following modules changed to make them more consistent:

legacy scheme	new scheme
-------------------------------------
asap3		asap/3.x		

ase2		ase/2		- deprecated
ase3		ase/3		- not needed as separate module, instead, is installed within each of the new "python-env" modules

g09		gaussian/09
GaussView	gaussview  (lowercase)

python		python-env	- Several suites of Python environments, each with many packages
		python.org	- The interpreter only, from the main Python web site.
Note that the modules fftw3 and vasp5 did not change name, given widespread entrenched use of these names in the packages themselves, as Unix group names, and even in Makefiles of other packages.

Explicit module selections required

Compiler and MPI modules are no longer pre-loaded.
Previously, the Intel compilers, the Intel Math Kernel Library, and OpenMPI were loaded even when there were no module load commands in your dot-files.
Under the new modules scheme, you must yourself load all desired modules in your shell setup files or in job files, in suitable order. Modules not depending on others must be loaded first.
This is born of necessity because still useful older applications were compiled with older MPI flavors and versions (typically OpenMPI-1.4) which partially interfere with newer flavors (OpenMPI-1.8, 1.10, or Intel-MPI-5.x). In particular, each MPI flavor provides commands like mpirun and mpifort, and special care is needed to run the correct one if your chosen module set spans different MPI flavors.
While loading all desired modules explicitly may be a minor burden for you at first, your selections should become easier to understand now and easier to adapt later.

No recursive loading

A module under the new scheme does not implicitly load other modules that it might depend on, such as modules for compilers, an MPI flavor, or specialized libraries.
Previously, this was the case for some popular modules but with the system maturing and diversifying, unexpected consequences can occur too easily.