HPC/Module naming scheme 2016: Difference between revisions

From CNM Wiki
| (*) ''Environment modules'' are the means by which software applications are made available to users. Using modules allows users and administrators to pick, by user or even by compute job, a desired or default version of an application from typically several current and historic versions persisting on the system. Older versions are kept to improve reproducibility of results, an important characteristic of scientific computing.
|}
On Carbon, the environment modules* system has changed in the following aspects, all explained further in this document:
* The naming scheme is more developed and more versatile.
* Default and dependent modules are no longer being loaded.


== Motivation ==
The changes were necessary because of the increasing diversity of, and dependencies among, applications, libraries, and the underlying operating system.
The goal was to accommodate different compilers, MPI flavors, and (in the future) different aspects of the machine architecture like CPU generation, capabilities, and coprocessor facilities.


For different OS releases, the new scheme enables ''existing'' application versions to continue being offered where possible, and ''new'' application versions to be made available where suitable, either on both old and newer OS releases, or only on one.
<!-- installation directories and (b) module names to enable addressing and chosing versions by differing …


== Naming scheme (Nomenclature) ==
=== Overview ===
The full name of a module has two or more components separated by slashes <code>/</code>, for instance:
  ''name/api/version-build'' # binary packages, compilers
  ''name/api/compiler/version-build'' # compiled applications
  ''name/api/mpi/compiler/version-build'' # compiled applications that use MPI
* The first component is the application's ''main name'', usually as chosen by its author.
* The last component is the ''version number'', also usually provided by the author, followed by a ''build identifier'' chosen on ''Carbon''.
* Other name components may be present and indicate which set of major tools was used to build the application locally, which usually implies which modules must be loaded to run the application.
 
=== Details ===
* <code>''name''</code> is the package's name as chosen on ''Carbon''. It is the name given by the software's author, but lowercased for consistency. It may contain numbers if customarily used in that way, e.g. <code>fftw3</code>.
* <code>''api''</code> is the leading part or parts of the package's version number, which typically signifies to suitable precision the [https://en.m.wikipedia.org/wiki/Application_programming_interface API] level across which different package versions are expected to be compatible (interchangeable in terms of features). <code>''api''</code> is typically a sole ''major'' version number, or has the form ''major.minor''. You may load a module that has the full name <code>foo/m.n/compiler/version-build</code> by the abbreviated name <code>foo/m.n/compiler</code>, which enables you to select the features and binary compatibility level that you need without having to give a complete name all the way down to a build number.
* <code>''compiler''</code> is a name component that is present when an application was ''compiled'' here and thus usually needs runtime libraries associated with the compiler used. The <code>''compiler''</code> name component is not strictly needed for applications that are ''statically linked'', but is usually present even then for informative purposes. Conversely, the name component is typically ''not present'' for applications installed from binary distribution packages, notably commercial applications and, naturally, compilers themselves.
  ''name/version-build''
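
The three scheme-2 name forms can be told apart simply by counting the slash-separated components. The following is a minimal sketch of that idea; the function <code>describe_module_name</code> is a hypothetical helper written for illustration and is not part of the modules system on ''Carbon''.

```shell
# Hypothetical helper (illustration only): label the components of a
# scheme-2 module name according to the three forms listed above.
# The function body runs in a subshell so that changing IFS does not
# leak into the calling shell.
describe_module_name() (
    full=$1
    IFS=/
    set -f              # disable globbing while splitting
    set -- $full        # split the name at each slash
    set +f
    case $# in
        3) echo "name=$1 api=$2 version-build=$3" ;;
        4) echo "name=$1 api=$2 compiler=$3 version-build=$4" ;;
        5) echo "name=$1 api=$2 mpi=$3 compiler=$4 version-build=$5" ;;
        *) echo "unrecognized form: $full" >&2; exit 1 ;;
    esac
)

describe_module_name "intel/16/16.0.0-3"
describe_module_name "fftw3/3.3/intel/3.3.2-1"
describe_module_name "fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11"
```

The sketch only inspects the shape of a name; it does not check whether such a module is actually installed.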


=== Name changes ===
==== Name changes for most modules ====
For most modules the leading name component (the part before any "/") will be the same in schemes 1 and 2.
What will always differ are the name parts after the first slash, which is relevant if you deliberately (and hopefully with good reason) chose a specific version.
<!-- for <code>module load ''name''</code> commands  -->
* To select the latest or administratively designated default version of a package:
*# On a Carbon command line, list the available flavors and versions, keeping in mind that some older modules were not migrated:
*#: <code>module avail ''packagename''</code>
*# In your configuration file, remove version numbers from old names of the form <code>''packagename/version''</code>, leaving only <code>''packagename''</code>.
*# Append API, MPI, and/or compiler specifications as needed.
*: This is the recommended approach, as you will automatically benefit from future updates and maintenance builds.
*: For instance, instead of <code>vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-3</code> write <code>vasp5/5.3/openmpi-1.4</code> <!-- or <code>vasp5/5.3</code>, letting the system pick the versions for MPI and compiler that are set as defaults at that time. -->
* To insist on a specific version and build for a package in new-scheme names, which you should do only if you require a build with a specific feature or behavior:
::4. Append version and build specifications, as shown by the <code>module avail ''packagename''</code> command above.
<!--
** Choose the new-scheme name up to the desired specificity. You may leave out trailing name or directory parts.
-->
 
==== Name change exceptions ====
The names of the following modules changed from scheme 1 to 2, making them more consistent:
<source lang="bash">
scheme 1     scheme 2
-------------------------------------
asap3        asap/3.x
 
ase2         ase/2       - deprecated
ase3         ase/3       - not needed as separate module, instead, is installed within each of the new "python-env" modules
 
g09          gaussian/09
GaussView    gaussview   (lowercase)
 
python       python-env  - Several suites of Python environments, each with many packages
             python.org  - The interpreter only, from the main Python web site.
</source>
<!--
fftw3 fftw/3.3
vasp vasp/4
vasp5 vasp/5
: Note: Licensees of Vasp-4 only ''must'' specify vasp'''/4'''. The default module for "vasp" is under vasp/5.
-->
: Note that the modules <code>fftw3</code> and <code>vasp5</code> did ''not change name'', given widespread entrenched use of these names in the packages themselves, as Unix group names, and even in Makefiles of other packages.
 
=== Example ===
Here are the names for the FFTW3 library module in the old and new naming schemes,
as queried by the <code>module avail</code> shell command:
{| class="wikitable" <!-- style="width: 50%;" -->
! style="width: 50%;" | Scheme 2 (current)
|-
|}
:* Note the MPI flavor and the compiler name components compared to the older naming scheme ('''bold''' is used here for illustration only; your output will appear entirely as regular text).
:* The <code>-t</code> option of <code>module avail</code> shows the output in "terse" form, one entry per line.
:* Lines ending in <code>:</code> indicate file system directories in which modules are being located on the current node. <!-- The means by which applications on different OS releases are accommodated is by tailoring the set of module search directories offered to users on a given node. (This is done at the system level through <code>module use ''dirname''</code> statements.) -->


<!-- == Changes requiring your attention ==
The nitty-gritty [[HPC/Module changes 2016 – Details|'''details of the changes''']] are listed separately.
-->
== Module selection by operating system ==
<!-- How to customize your module selection by OS release)
Previously, modules were loaded from the <code>~/.bashrc</code> file and, with some caveats, from PBS job files.
With different OS releases active in the cluster, it is now recommended to place module commands into OS-specific files.
-->
For a time, nodes with different operating systems and therefore more or less
''different module catalogs'' will coexist in the cluster.
Since you will always have the same home directory on each node, most of your
script files would have to be written so they run on either operating system.
This usually means having to code <code>if</code> statements in your scripts,
which can be difficult.
To simplify conditional module selection, each node on ''Carbon'' now looks for
''specific file names'' in your home directory and selectively loads only the
file that is appropriate for the operating system running on that node.
* Files of the form <code>~/.modules-el''x''</code>, with ''x'' = 5,6,..., will be loaded on the corresponding OS, under naming scheme 2.
* On CentOS-5 nodes, a file <code>~/.modules-el5-legacy</code> will be used under scheme 1, but only if <code>~/.modules-el5</code> is not present. In other words, a <code>~/.modules-el5</code> file has priority and causes <code>~/.modules-el5-legacy</code> to be ignored.
* Without any <code>~/.modules-*</code> files, CentOS-5 nodes will use scheme 1; CentOS-6 nodes always use scheme 2.
* Your <code>.bashrc</code> will always be read, with the modules naming scheme that was determined by the presence or absence of <code>.modules-*</code> files.
* When running a PBS job, module commands in the job file will be read, in the same naming scheme as .bashrc.
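
The selection rules above can be summarized as a small decision function. This is a sketch under stated assumptions, not the code the system actually runs: <code>pick_modules_file</code> is a hypothetical name, the OS tag (<code>el5</code>, <code>el6</code>) is passed in by hand, and the case of a CentOS-5 node that finds only <code>.modules-el6</code> is assumed to behave as if no file were present.

```shell
# pick_modules_file OS_TAG HOME_DIR   (hypothetical helper, illustration only)
# Prints "<scheme> <file>", with "-" when no OS-specific file applies and
# only .bashrc would be read.
pick_modules_file() {
    if [ -e "$2/.modules-$1" ]; then
        # A file matching the OS always selects scheme 2.
        echo "2 .modules-$1"
    elif [ "$1" = "el5" ] && [ -e "$2/.modules-el5-legacy" ]; then
        # The legacy file is used only when .modules-el5 is absent.
        echo "1 .modules-el5-legacy"
    elif [ "$1" = "el5" ]; then
        echo "1 -"
    else
        echo "2 -"
    fi
}

demo_home=$(mktemp -d)
touch "$demo_home/.modules-el5-legacy" "$demo_home/.modules-el6"
pick_modules_file el5 "$demo_home"   # legacy file, scheme 1
pick_modules_file el6 "$demo_home"   # .modules-el6, scheme 2
touch "$demo_home/.modules-el5"
pick_modules_file el5 "$demo_home"   # .modules-el5 now takes priority
rm -rf "$demo_home"
```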


== Migration guide – How to customize your module selection by OS release ==
<!--
When adapting your existing module commands, you could continue using only the <code>~/.bashrc</code> file which customarily held these commands.
Place your desired  <code>module load</code> commands in ''specific files'', as follows:
However, it is recommended to break out your module selection into files that are specific to OS release and module naming scheme.
Later on, having separate files will make the scope of changes more obvious.


* To migrate your existing configuration, use a helper utility to get you started:
In summary, the files will be detected and read as follows:
: {| class="wikitable"
! rowspan=2 colspan=2 | You have files:<br/>.bashrc and …
! style="color:#888;" rowspan=2 | Remark
! colspan=2 | CentOS-5 reads:
! colspan=2 | CentOS-6 reads:
! colspan=2 | CentOS-7 reads:
|-
! files
! module names
! files
! module names
! files
! module names
|-
| – || – || style="color:#888; text-align: left;" | Starting situation || style="background:#ffd;" | only .bashrc || style="background:#ffd;" | scheme 1 || style="background:#ddf;" | only .bashrc || style="background:#ddf;" rowspan=3 | scheme 2 || style="color:#888; background:#ddd; text-align:center;" rowspan=3 colspan=2 | To be determined.
|-
| '''.modules-el5-legacy''' || '''.modules-el6''' || style="color:#888; text-align: left;" | Compatibility scheme || style="background:#ffd;" | .modules-el5-legacy and .bashrc || style="background:#ffd;" | scheme '''1''' || style="background:#ddf;" | .modules-el6 and .bashrc
|-
| '''.modules-el5''' || '''.modules-el6''' || style="color:#888; text-align: left;" | Full switch-over, ''recommended'' || style="background:#ddf;" | .modules-'''el5''' and .bashrc || style="background:#ddf;" | scheme '''2''' || style="background:#ddf;" | .modules-el6 and .bashrc
|-
|}
-->
; Tips:
* To load the ''same'' modules on any node, in scheme 2:
** Place your configuration in <code>~/.modules-el6</code>, then create a symbolic link:
*: <code>cd; ln -s .modules-el6  .modules-el5</code>
** It is possible but not recommended (because less future-proof) to keep your module selection in <code>~/.bashrc</code>, and activate scheme 2 on CentOS-5 nodes by simply creating an empty configuration: <code>touch ~/.modules-el5</code>
<!--
* '''Caution:''' Avoid the following file constellations because they can easily become confusing:
:* Only one of these files present: <code>.modules-el5-legacy</code>, or <code>.modules-el5</code>, or <code>.modules-el6</code>.
:* Both <code>.modules-el5-legacy</code> and <code>.modules-el5</code> present. While helpful for transitioning, remember that the former file will be ignored as soon as the latter exists.
: You may get errors on nodes that do not read these files, or you might find that you need conditional logic in your .bashrc file.  Recall that your home directory is the same across nodes, and therefore your configuration scripts are sensitive to differences between OS releases.
-->
 
== Best Practices ==
=== Migration utility ===
To migrate your existing module selection from .bashrc to .modules-* files, use a helper utility to get you started:
  '''modules-migrate'''
: The utility will manage the following files but ''does not'' change module names:
<pre>
.bashrc
.modules-el5-legacy
.modules-1
.modules-el6
</pre>
* The utility will give you the opportunity to inspect and edit the resulting files. Use a text editor of your choice, such as <code>nano</code> or <code>vi</code>, to re-examine or edit these files further.
: See an [[Sandbox/Migration example|'''example output''']] of running the helper utility.
* To use scheme 2 on CentOS-5 machines, manually create the following file, for instance by copying <code>.modules-el6</code>:
<pre>
.modules-el5
</pre>
* Test your module choices as shown in the next section.


=== Use suitably abbreviated names ===
Omit detailed version numbers and build numbers from the end of module names.
: This will select the most up-to-date module version at the time of loading. You will benefit from newer modules that have been installed since you last looked. Version numbers are generally chosen so that versions with the same major version number are binary-compatible.
 
For instance, instead of:
 module load intel/16/16.0.0-3
 module load openmpi/1.10/intel-16/1.10.0-4
Write:
 module load intel'''/16'''
 module load openmpi'''/1.10/intel-16'''

It is preferable to supply the compiler name part of MPI modules (here <code>…/intel-16</code>) because MPI modules usually both (a) need compiler libraries and (b) implicitly use their native compiler for further compilations.
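
Shortening a full module name as recommended amounts to dropping its last slash-separated component. A minimal sketch follows; <code>abbreviate_module_name</code> is a hypothetical helper used only to illustrate the idea.

```shell
# abbreviate_module_name FULL_NAME   (hypothetical helper, illustration only)
# Strips the trailing version-build component from a module name.
# A name without any slash is printed unchanged.
abbreviate_module_name() {
    printf '%s\n' "${1%/*}"
}

abbreviate_module_name "intel/16/16.0.0-3"
abbreviate_module_name "openmpi/1.10/intel-16/1.10.0-4"
```

For the two names used above, this prints <code>intel/16</code> and <code>openmpi/1.10/intel-16</code>. Always confirm with <code>module show</code> which full module an abbreviated name resolves to.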
 
=== Test your interactive shell (login environment) ===
{| class="wikitable" style="float:right; margin-left: 10px;"
! Same node
| <code>bash -l</code>
| <code>ssh clogin5</code>
| <code>ssh clogin8</code>
|-
| colspan=3 | <code>module list</code>
# Review error messages that might appear before your prompt.
# Inspect which modules are loaded.
# Edit your <code>.modules-*</code> files and address any errors.
# Close the test shell and repeat until your desired modules are loaded without errors.


=== Test in a job file ===
Your module selection is likely most important in a PBS job file.
To avoid the hassle of extended wait times for production jobs,
use test jobs with a ''short walltime'' limit and place just diagnostic commands in the job script.
 
Use the <code>module list</code> and <code>type</code> shell commands to verify that all your modules are loaded
and that an application is properly callable without full paths.
; Example:
Consider the following job file <code>modtest.sh</code>:
</source>
Submit the job:
  qsub modtest.sh
Alternatively, do the whole thing on the command line, without the need for a separate file:
<source lang="bash">
An error looks like:
  -bash: type: vasp_gam: '''not found'''
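
A diagnostic job body along these lines could be sketched as follows. This is an illustration under assumptions, not the job file referred to above: the helper <code>check_commands</code> is hypothetical, the one-minute walltime is an arbitrary short value, and the command names passed to the helper stand in for whatever your applications provide (for example <code>mpirun</code> or <code>vasp_gam</code>).

```shell
#!/bin/bash
#PBS -l walltime=0:01:00
# Hypothetical diagnostic job body (illustration only).

# check_commands CMD...
# Reports each command found via `type`; returns non-zero if any is missing.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if type "$cmd" >/dev/null 2>&1; then
            echo "ok: $cmd"
        else
            echo "MISSING: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Show the loaded modules when the module command is available.
if type module >/dev/null 2>&1; then
    module list
fi

# Replace these names with the commands your job relies on.
check_commands sh ls
```

A non-zero exit status from the helper shows up in the job's exit status, so a failed check is visible without reading the output files.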
<!-- == Module changes requiring your attention ==
You may need to adapt the modules that you load in your shell startup and job files to the new naming scheme.
Follow the [[Sandbox/Module changes#Migration guide – How to customize your module selection by OS release|'''Migration guide''']] to implement the changes for your files.
-->
=== Modules load order ===
* To resolve module dependencies, edit your configuration or job files to load required modules first, in this order:
*# compilers
*# MPI flavor
*# other libraries that are dynamically loaded.
*# your desired application(s).
=== Understanding dependency errors ===
Learn to recognize error messages from <code>module load</code> when a required module has not been loaded:
'''''Example:''''' A typical error message will look like:
 $ '''module load openmpi/1.10'''
 openmpi/1.10/intel-16/1.10.2-1(27):ERROR:151: Module '<span style="color:brown;">openmpi/1.10/intel-16/1.10.2-1</span>' '''''depends on one of the module(s)''''' '<span style="color:blue;">intel/16/16.0.2 intel/16/16.0.1-2 intel/16/16.0.0-3 intel/16/16.0.0-1 intel/16/16.0.0-0</span>'
 openmpi/1.10/intel-16/1.10.2-1(27):ERROR:102: '''''Tcl command execution failed: <span style="color:red;">prereq intel/16</span>'''''
: Colors do not appear in the original terminal output but were added here for clarity:
:* The missing prerequisite is the <span style="color:red;">red item</span> on the last line. <!-- as stated in the programming code of the module that you attempted to load -->
:* The modules that would currently satisfy the requirement are shown on the preceding line, indicated here in <span style="color:blue;">blue</span>.
:* The full name of the "offending module", deduced from a possibly abbreviated name on the command line, appears in <span style="color:brown;">brown</span>.
* You can inspect the prerequisites of a module in a more succinct manner:
 $ '''module show openmpi/1.10''' <span style="color:blue;">2>&1</span> '''| grep req'''
 <span style="color:red;">prereq intel/16</span>
: The sequence <code><span style="color:blue;">2>&1</span></code> is necessary so the pipe <code>|</code> captures the ''entire'' output of the <code>module show</code> command.
=== Effect on PBS job submissions ===
==== Loading modules in job files ====
* You may now safely load modules in PBS job files when using recent MPI modules, both in scheme 1 and scheme 2. Previously, this was not recommended.
: Recent builds of OpenMPI (1.4 and 1.10) and Intel MPI now have support compiled in to properly start processes on remote nodes.
* However, best practice is still to load all modules in dotfiles under your home directory.
: This will always give you the same applications on both login and compute nodes. Place module commands in job files only when conflicts arise, such as when two of your regularly-used applications require different MPI flavors.
==== Job routing by operating system ====
* TORQUE/PBS jobs that are ''submitted'' from a node running CentOS-5 or CentOS-6 will normally be routed to run only on nodes that run the ''same'' OS release.
* Find the eligible OS in the <code>qstat -f</code> output:
 $ '''qstat -f ''jobnumber'' | grep opsys'''
     Resource_List.opsys = el5
* You may override the automatic selection prior to submission by adding an <code>opsys</code> job resource:
 #PBS -l '''opsys=el5'''
or:
 #PBS -l '''opsys=el6'''
* In a pinch, you may even change the OS request of a queued job by using the <code>qalter</code> command, e.g.:
 qalter -l '''opsys=el6''' ''jobnumber''
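
If you need the OS tag of a job in a script rather than on screen, the <code>opsys</code> line can be reduced to its bare value. The following is a minimal sketch; <code>job_opsys</code> is a hypothetical helper that reads <code>qstat -f</code> output on standard input, and the sample line is the one shown above.

```shell
# job_opsys   (hypothetical helper, illustration only)
# Reads `qstat -f` output on stdin and prints the value of
# Resource_List.opsys, if that line is present.
job_opsys() {
    sed -n 's/^[[:space:]]*Resource_List\.opsys[[:space:]]*=[[:space:]]*//p'
}

# Sample input copied from the qstat output shown above:
printf '    Resource_List.opsys = el5\n' | job_opsys
```

On the cluster, the input would come from <code>qstat -f ''jobnumber''</code> instead of <code>printf</code>.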
=== Using multiple MPI flavors ===
<!-- Flavors can coexist if commands are called by full path name, but this is bad practice. -->
* Different MPI flavors can, with caution, be loaded at the same time. This may be necessary because the system is less homogeneous than in the past and no longer uses a single "one true" MPI implementation.
* When modules of multiple MPI flavors are loaded, call the appropriate MPI commands by a ''full path'' specified via the <code>''MODULENAME''_HOME</code> environment variable that is set by each module.
'''''Example:''''' In a job file that is to run 2 applications that were compiled with different MPI flavors, write:
<source lang="bash">
$OPENMPI_HOME/bin/mpirun app1_name
$IMPI_HOME/bin/mpirun app2_name
</source>
== Changes requiring attention ==
=== Available modules ''differ between naming schemes'' ===
* The previous module naming scheme 1 is being retired, along with some of its attendant conventions.
* Newer applications will primarily be compiled and installed on the newer OS release and in naming scheme 2. Some applications may turn out to be backwards-compatible with a previous OS release, and will be made available there as well, in scheme 2, to appropriately offer applications that run on ''both'' or only a ''specific'' release of the operating system.
* Only a subset of modules from scheme 1 has been ported to scheme 2, typically the modules representing the most recent version of an application.
=== Load all modules yourself ===
Under scheme 2, you must load all desired modules yourself, particularly '''compiler''' and '''MPI modules''', in your shell setup files or in job files (see section below), in suitable order.
Modules not depending on others must be loaded first.
: This is born of necessity because still useful older applications were compiled with older MPI flavors and versions (typically OpenMPI-1.4) which partially interfere with newer flavors (OpenMPI-1.8, 1.10, or Intel-MPI-5.x). In particular, each MPI flavor provides commands like <code>mpirun</code> and <code>mpifort</code>, and special care is needed to run the correct one if your chosen module set spans different MPI flavors.
==== No modules are pre-loaded ====
{| class="wikitable" style="width: 20%; float: right"
| (**) Technically, the system ''does load'' the module <code>profile/user</code> for you. This module only contains the instructions to select and read the appropriate <code>.modules-*</code> file.
|}
* No modules are pre-loaded by the system**.
: Previously, the Intel compilers, the Intel Math Kernel Library, and OpenMPI were pre-loaded for you.
==== No recursive loading ====
A module under scheme 2 does not implicitly load other modules that it might
depend on, such as modules for compilers, an MPI flavor, or specialized
libraries.
Previously, this was the case for some popular modules but with the system
maturing and diversifying, unexpected consequences can occur easily.
You must load dependent modules yourself. While this may be a minor burden for you at first, your selections should become easier to understand now and easier to adapt later.
See [[HPC/Modules Best Practices#Load dependent modules first]].
=== Determining prerequisites and load order ===
To see if a particular application module (such as VASP or Quantum-ESPRESSO) has any prerequisites,
inspect the output of <code>module show ''name''</code>, and look for any <code>prereq</code> statements.
Then edit your <code>.modules-*</code> file, load the <code>prereq</code> modules first, followed by the desired application module.
'''''Example:'''''
:* Let's load a vasp5 module that uses the Intel-MPI flavor, named "impi":
 $ '''module avail vasp5'''
 ------------------------------------------------------------ /opt/apps/M/x86_64/EL ------------------------------------------------------------
 vasp5/5.3/openmpi-1.4/intel/5.3.2-mkl-beef-1    '''''vasp5/5.4/impi-5/intel-16/5.4.1.3-6'''''
 vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-3      vasp5/5.4/openmpi-1.10/intel-16/5.4.1.3-6
 vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-cellz-1
:* Let's try inspecting what it needs:
 $ '''module show vasp5/5.4'''
 -------------------------------------------------------------------
 /opt/apps/M/x86_64/EL/vasp5/5.4/'''''openmpi-1.10'''''/intel-16/5.4.1.3-6:
:* Careful: The first output line shows the full file name of the module that would get loaded by the short name. In this case, the abbreviated module name, having no MPI name component, yields a module that uses a different MPI flavor than you want.
:* You will need to be more explicit:
 $ '''module show vasp5/5.4/impi-5'''
 -------------------------------------------------------------------
 /opt/apps/M/x86_64/EL/'''''vasp5/5.4/impi-5/intel-16'''''/5.4.1.3-6:
 module-whatis VASP - Vienna Ab-initio Simulation Package
 conflict vasp
 conflict vasp-vtst
 '''''prereq intel/16'''''
 '''''prereq impi/5'''''
 setenv VASP5_HOME /opt/apps/vasp5/5.4.1.3-6-impi-5-intel-16
 prepend-path PATH /opt/apps/vasp5/5.4.1.3-6-impi-5-intel-16/bin
 setenv VASP_COMMAND vasp-ase
 setenv VASP_PP_PATH /opt/soft/vasp-pot/ase
 -------------------------------------------------------------------
:* Therefore, you'd need to add the following lines to your <code>.modules-el6</code> file:
 module load '''''intel/16'''''
 module load '''''impi/5'''''
 module load '''''vasp5/5.4/impi-5'''''
; Expert Tip: The <code>grep</code> command does not work as usual on <code>module show …</code> because of the way <code>module</code> needs to operate. To make grep work, combine the stdout and stderr streams using, in bash, the <code>|&</code> characters to form the pipe:
 $ '''module show''' vasp5/5.4/impi-5 <font color="red">|&</font> '''grep''' prereq
 prereq intel/16
 prereq impi/5
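
The step from <code>prereq</code> statements to <code>module load</code> lines can be sketched as a small filter. This is an illustration only: <code>prereq_load_lines</code> is a hypothetical helper, it reads <code>module show</code> output on standard input, and when a <code>prereq</code> statement lists several alternatives it simply takes the first one, which is an assumption you may want to override by hand.

```shell
# prereq_load_lines   (hypothetical helper, illustration only)
# Reads `module show` output on stdin and prints one `module load` line
# per prereq statement, in order of appearance. Only the first listed
# alternative of each prereq statement is used.
prereq_load_lines() {
    awk '$1 == "prereq" { print "module load " $2 }'
}

# Sample input copied from the module show output above:
prereq_load_lines <<'EOF'
module-whatis VASP - Vienna Ab-initio Simulation Package
conflict vasp
conflict vasp-vtst
prereq intel/16
prereq impi/5
setenv VASP5_HOME /opt/apps/vasp5/5.4.1.3-6-impi-5-intel-16
EOF
```

On the cluster you would feed it real output, for example <code>module show vasp5/5.4/impi-5 2>&1 | prereq_load_lines</code>, and then append the line for the application module itself.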
== Minor changes for the ''module'' command ==
=== Determining default module versions ===
To determine which module will be loaded when an abbreviated name is used, inspect the first relevant line in the output of one of these commands:
 module show ''name''
 module help ''name''
The reason is twofold:
* The <code>module avail</code> command under CentOS-6 no longer issues the marker <code>"(default)"</code> when set for a particular module (which is done administratively using a <code>.version</code> file). It is unclear whether this is a bug or by design, but the change makes the output more consistent.
* On the older CentOS-5 system the <code>module</code> command honors <code>.version</code> files ''only for the last component'' of the module. This may lead to different module versions being selected on different systems even when the list of available modules is identical. (Side note: This is a possibly fortuitous bug since openmpi-1.4, used on CentOS-5, sorts after openmpi-1.10.)
=== Name completion on command line ===
When working interactively in a terminal, you can use the "Tab completion" feature of the Bash shell to complete a partially typed module name and show all names available for the name typed so far.
The feature works as follows. At a shell prompt (shown as "$"), type:
$ '''module load fft'''
Press the <code><TAB></code> key and the name will be expanded to <code>fftw3/3.3/</code>, and you'll see all possible completing names, with the cursor waiting at the end of the longest common substring:
$ '''module load fftw3/3.3/'''_
fftw3/3.3/impi-5/intel-16/3.3.4-10        fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
fftw3/3.3/intel/3.3.2-1                  fftw3/3.3/openmpi-1.4/intel/3.3.2-4
Type the letter <code>o</code>, hit  the <code><TAB></code> key again. The choices will be narrowed down to OpenMPI.
$ '''module load fftw3/3.3/openmpi-1.'''<TAB>
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11  fftw3/3.3/openmpi-1.4/intel/3.3.2-4
Typing the digit <code>1</code> will pick the <code>1.'''1'''0</code> version, at which point the then remaining single module name choice will be completed all the way, with the cursor waiting after an additional space character:
$ '''module load fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11''' _
=== "module purge" command ===
Previously on ''Carbon'' it was difficult to reset the module selection during an interactive terminal session, <!--, or to make and adjust a custom set of module choices. -->
because the commands for the job queueing system, like <code>qsub</code>, were provided via a module.
You may now safely use the module "purge" command for its intended purpose, as
<source lang="bash">
module purge
</source>
followed by <code>module load …</code> to choose compilers, MPI flavors, and applications.
==== Expert Tip: Purge and reload. ====
You can re-load the customizations from your <code>.modules-*</code> files using the module <code>profile</code>:
<source lang="bash">
module purge
module load profile
</source>

Revision as of 22:26, November 2, 2016

Introduction

(*) Environment modules are the means by which software applications are made available to users. Using modules allows users and administrators to pick, by user or even by compute job, a desired or default version of an application from typically several current and historic versions persisting on the system. Older versions are kept to improve reproducibility of results, an important characteristic of scientific computing.

On Carbon, the environment modules* system has changed in the following aspects, explained further in this document:

  • The naming scheme is more developed and more versatile.
  • Default and dependent modules are no longer being loaded.
  • The module command behaves in slightly different ways.

Motivation

The changes were necessary because of increasing diversity and dependencies of applications, libraries, and the underlying operating system. The goal was to accommodate different compilers, MPI flavors, and (in the future) different aspects of the machine architecture like CPU generation, capabilities, and coprocessor facilities.

For different OS releases the new scheme enables existing application versions to continue being offered where possible, and to make new application versions available where suitable, either on both old and newer OS releases, or only on one.

Naming scheme (Nomenclature)

Overview

The full name of a module has two or more components separated by slashes /, for instance:

name/api/version-build			# binary packages, compilers
name/api/compiler/version-build		# compiled applications
name/api/mpi/compiler/version-build	# compiled applications that use MPI
  • The first component is the application's main name, usually as chosen by its author.
  • The last component is the version number, usually also provided by the author, followed by a build identifier chosen on Carbon.
  • Other name components may be present and indicate which set of major tools were used to produce the application locally, which usually implies which modules are required to be loaded to run the application.

Details

  • name is the package's name as chosen on Carbon. It is the name given by the software's author, but lowercased for consistency. It may include numbers if customarily used in that way, e.g. fftw3.
  • api is the leading part or parts of the package's version number, which typically signify, to suitable precision, the API level across which different package versions are expected to be compatible (interchangeable in terms of features). api is typically a sole major version number, or has the form major.minor. You may load a module that has the full name foo/m.n/compiler/version-build by the abbreviated name foo/m.n/compiler, which enables you to select the features and binary compatibility level that you need without having to give a complete name all the way down to a build number.
  • compiler is a name component that is present when an application was compiled here and thus usually needs runtime libraries associated with the compiler used. The compiler name component is not strictly needed for applications that are statically linked, but is usually present even then for informative purposes. Conversely, the name component is typically not present for applications installed from binary distribution packages, notably commercial applications and, naturally, compilers themselves.
  • mpi, present when needed, denotes the MPI flavor in use for parallel computations.

For reference and contrast, the previous scheme has been the simpler but more opaque:

name/version-build
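
As a rough illustration of the three scheme-2 name patterns listed in the overview, the following Bash sketch splits a full module name into its components. The function name split_module_name is hypothetical and not part of the module system. It assumes a complete name (three to five components), since an abbreviated name cannot be classified by component count alone.

```shell
#!/bin/bash
# Hypothetical helper (not part of the module system): split a FULL scheme-2
# module name into the components described above.
split_module_name() {
    local IFS=/            # split on slashes, only inside this function
    local -a parts
    read -r -a parts <<< "$1"
    case ${#parts[@]} in
        3) echo "name=${parts[0]} api=${parts[1]} version_build=${parts[2]}" ;;
        4) echo "name=${parts[0]} api=${parts[1]} compiler=${parts[2]} version_build=${parts[3]}" ;;
        5) echo "name=${parts[0]} api=${parts[1]} mpi=${parts[2]} compiler=${parts[3]} version_build=${parts[4]}" ;;
        *) echo "unrecognized module name: $1" >&2; return 1 ;;
    esac
}

# Names taken from the examples in this document
split_module_name intel/16/16.0.0-3
split_module_name fftw3/3.3/intel/3.3.2-1
split_module_name fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
```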

Name changes

Name changes for most modules

For most modules the leading name component (the part before any "/") will be the same in schemes 1 and 2. What will always differ are the name parts after the first slash, which is relevant if you deliberately (and hopefully with good reason) chose a specific version.

  • To select the latest or administratively designated default version of a package:
    1. On a Carbon command line, list the available flavors and versions, keeping in mind that some older modules were not migrated:
      module avail packagename
    2. In your configuration file, remove version numbers from old names of the form packagename/version, leaving only packagename.
    3. Append API, MPI, and/or compiler specifications as needed.
    This is the recommended approach, as you will automatically benefit from future updates and maintenance builds.
    For instance, instead of vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-3 write vasp5/5.3/openmpi-1.4
  • To insist on a specific version and build for a package in new-scheme names, which you should do only if you require a build with a specific feature or behavior:
4. Append version and build specifications, as shown by the module avail packagename command above.
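
Step 2 above (removing version numbers from scheme-1 names) could be previewed with a small filter like the one sketched here. The function name strip_versions is made up for illustration. It assumes one module per "module load" line, and it only prints the converted lines; it does not edit any file or append the API, MPI, or compiler parts required by step 3.

```shell
#!/bin/bash
# Hypothetical filter: reduce "module load packagename/version" lines
# (scheme 1) to "module load packagename". Other lines pass through unchanged.
strip_versions() {
    sed -E 's#^([[:space:]]*module[[:space:]]+load[[:space:]]+[^/[:space:]]+)/[^[:space:]]+#\1#'
}

# Demonstration on scheme-1 names that appear in this document
printf '%s\n' \
    'module load fftw3/3.3.2-4' \
    'module load g09' | strip_versions
```

To preview the effect on a real file, feed it in with, for example, strip_versions < ~/.modules-el5-legacy and review the output before changing anything.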

Name change exceptions

The names of the following modules changed from scheme 1 to 2, making their names more consistent:

scheme 1	scheme 2
-------------------------------------
asap3		asap/3.x		

ase2		ase/2		- deprecated
ase3		ase/3		- not needed as separate module, instead, is installed within each of the new "python-env" modules

g09		gaussian/09
GaussView	gaussview  (lowercase)

python		python-env	- Several suites of Python environments, each with many packages
		python.org	- The interpreter only, from the main Python web site.
Note that the modules fftw3 and vasp5 did not change name, given widespread entrenched use of these names in the packages themselves, as Unix group names, and even in Makefiles of other packages.

Example

Here are the names for the FFTW3 library module in the old and new naming schemes, as queried by the module avail shell command:

Scheme 2 (current):

$ module -t avail fftw3
/opt/apps/M/x86_64/EL6:
/opt/apps/M/x86_64/EL:
fftw3/3.3/impi-5/intel-16/3.3.4-10	 # uses Intel-MPI
fftw3/3.3/intel/3.3.2-1			 # legacy, serial only
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11 # uses OpenMPI
fftw3/3.3/openmpi-1.4/intel/3.3.2-4	 # legacy, OpenMPI
/usr/share/Modules/modulefiles:
/etc/modulefiles:

Scheme 1 (being retired):

$ module -t avail fftw3
/opt/soft/modulefiles:
fftw3/3.2.1-1
fftw3/3.2.2-1
fftw3/3.3-1
fftw3/3.3.2-1
fftw3/3.3.2-4
/opt/intel/modulefiles:
/usr/share/Modules/modulefiles:
/etc/modulefiles:
  • Note the MPI flavor and compiler name components in scheme 2, which the older naming scheme lacks.
  • The -t option of module avail shows the output in "terse" form, one entry per line.
  • Lines ending in : indicate file system directories in which modules are being located on the current node.
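
Since directory lines in the terse listing end in a colon, they are easy to filter out when only module names are wanted. The sketch below uses a hypothetical function modules_only and demonstrates it on sample lines from the listing above. On Carbon, the intended use would be "module -t avail fftw3 2>&1 | modules_only", under the assumption that module avail, like module show, writes its listing to stderr.

```shell
#!/bin/bash
# Hypothetical filter: keep module names, drop directory lines (ending in ":").
modules_only() {
    grep -v ':$'
}

# Demonstration on sample lines from the scheme-1 listing
printf '%s\n' \
    '/opt/soft/modulefiles:' \
    'fftw3/3.3.2-1' \
    'fftw3/3.3.2-4' \
    '/etc/modulefiles:' | modules_only
```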


Module selection by operating system

For a time, nodes with different operating systems and therefore more or less different module catalogs will coexist in the cluster. Since you will always have the same home directory on each node, most of your script files would have to be written so they run on either operating system. This usually means having to code if statements in your scripts, which can be difficult. To simplify conditional module selection, each node on Carbon now looks for specific file names in your home directory and selectively loads only the file that is appropriate for the operating system running on that node.

  • Files of the form ~/.modules-elx, with x = 5,6,..., will be loaded on the corresponding OS, under naming scheme 2.
  • On CentOS-5 nodes, a file ~/.modules-el5-legacy will be used under scheme 1, but only if ~/.modules-el5 is not present. In other words, a ~/.modules-el5 file has priority and causes ~/.modules-el5-legacy to be ignored.
  • Without any ~/.modules-* files, CentOS-5 nodes will use scheme 1; CentOS-6 nodes always use scheme 2.
  • Your .bashrc will always be read, with the modules naming scheme that was determined by the presence or absence of .modules-* files.
  • When running a PBS job, module commands in the job file will be read, in the same naming scheme as .bashrc.
Tips
  • To load the same modules on any node, in scheme 2:
    • Place your configuration in ~/.modules-el6, then create a symbolic link:
    cd; ln -s .modules-el6 .modules-el5
    • It is possible but not recommended (because less future-proof) to keep your module selection in ~/.bashrc, and activate scheme 2 on CentOS-5 nodes by simply creating an empty configuration: touch ~/.modules-el5
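
The selection rules above can be summarized in a short Bash sketch. This is an illustration only, not the code that Carbon actually runs; the function name pick_modules_file and its output format are invented here. One assumption goes beyond the text: on a CentOS-5 node, a lone ~/.modules-el6 file is treated as if no file were present.

```shell
#!/bin/bash
# Illustration only: mimic the file-selection rules described above.
# Arguments: OS major release (5 or 6) and a directory standing in for $HOME.
pick_modules_file() {
    local el="$1" home="$2"
    if [ -e "$home/.modules-el$el" ]; then
        echo "scheme2 $home/.modules-el$el"          # matching file wins
    elif [ "$el" = 5 ] && [ -e "$home/.modules-el5-legacy" ]; then
        echo "scheme1 $home/.modules-el5-legacy"     # legacy file, EL5 only
    elif [ "$el" = 5 ]; then
        echo "scheme1 (no file)"                     # EL5 default
    else
        echo "scheme2 (no file)"                     # EL6 always uses scheme 2
    fi
}

# Demonstration in a throwaway directory instead of a real home directory
demo_home=$(mktemp -d)
touch "$demo_home/.modules-el5-legacy"
pick_modules_file 5 "$demo_home"
touch "$demo_home/.modules-el5"
pick_modules_file 5 "$demo_home"
pick_modules_file 6 "$demo_home"
rm -rf "$demo_home"
```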

Best Practices

Migration utility

To migrate your existing module selection from .bashrc to .modules-* files, use a helper utility to get you started:

modules-migrate
The utility will manage the following files but does not change module names:
.bashrc
.modules-el5-legacy
.modules-el6
See an example output of running the utility.
  • The utility will give you the opportunity to inspect and edit the resulting files. Use a text editor of your choice, such as nano or vi to re-examine or edit these files further.
  • To switch to scheme 2 on all nodes, create or copy from .modules-el6:
.modules-el5
  • Test your module choices as shown in the next section.

Use suitably abbreviated names

Omit detailed version numbers and build numbers from the end of module names.

This will select the most up-to-date module version at the time of loading. You will benefit from newer modules that have been installed since you last looked. Version numbers are generally chosen so that versions with the same major version number are binary-compatible.

For instance, instead of:

module load intel/16/16.0.0-3
module load openmpi/1.10/intel-16/1.10.0-4

Write:

module load intel/16
module load openmpi/1.10/intel-16

It is preferable to supply the compiler name part of MPI modules (here …/intel-16) because they usually both (a) need compiler libraries and (b) implicitly use their native compiler for further compilations.
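
The abbreviation shown above amounts to dropping the last name component (the version-build part). A minimal sketch follows, using a hypothetical function abbreviate_module. It assumes the argument is a full name; applied to an already abbreviated name it would drop one component too many.

```shell
#!/bin/bash
# Hypothetical helper: drop the trailing version-build component
# from a FULL scheme-2 module name.
abbreviate_module() {
    case "$1" in
        */*) echo "${1%/*}" ;;   # remove the shortest "/..." suffix
        *)   echo "$1" ;;        # nothing to remove
    esac
}

abbreviate_module intel/16/16.0.0-3
abbreviate_module openmpi/1.10/intel-16/1.10.0-4
```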

Test your interactive shell (login environment)

Same node	EL5 node	EL6 node
-------------------------------------------------
bash -l		ssh clogin5	ssh clogin8
module list	module list	module list
exit		exit		exit

To test your new module configuration:

  1. Open another login shell on the current or another node.
  2. Review error messages that might appear before your prompt.
  3. Inspect which modules are loaded.
  4. Edit your .modules-* files and address any errors.
  5. Close the test shell and repeat until your desired modules are loaded without errors.

Test in a job file

Your module selection is likely most important in a PBS job file. To avoid the hassle of extended wait times for production jobs, use test jobs with a short walltime limit and place just diagnostic commands in the job script.

Use the module list and type shell commands to verify that all your modules are loaded and that an application is properly callable without full paths.

Example

Consider the following job file modtest.sh:

#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=0:1:00
#PBS -N modtest

module list

type vasp_gam

Submit the job:

qsub modtest.sh

Alternatively, do the whole thing on the command line, without the need for a separate file:

echo "module list; type vasp_gam" | qsub -l nodes=1:ppn=1,walltime=0:1:00 -N modtest

In either case, wait for the job to finish, then inspect the output files:

qstat jobnumber
…
head modtest.[eo]1234*

You should see something like:

vasp_gam is /opt/apps/vasp5/5.4.1.3-6-openmpi-1.10-intel-16/bin/vasp_gam

An error looks like:

-bash: type: vasp_gam: not found
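
When several commands need checking, the test job can report each one on its own line instead of stopping at the first error. The following variant of modtest.sh is a sketch; the function check_commands is hypothetical, and the guard around module list only exists so that the script also runs on a machine without the module command.

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=0:1:00
#PBS -N modtest

# Hypothetical helper: report whether each named command is callable
# without a full path. Returns non-zero if any command is missing.
check_commands() {
    local cmd missing=0
    for cmd in "$@"; do
        if type "$cmd" >/dev/null 2>&1; then
            echo "OK $cmd"
        else
            echo "MISSING $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Show loaded modules where the module command is available
if type module >/dev/null 2>&1; then
    module list 2>&1
fi

check_commands vasp_gam || echo "Review your .modules-* files or the job file."
```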


Modules load order

  • To resolve module dependencies, edit your configuration or job files to load required modules first, in this order:
    1. compilers
    2. MPI flavor
    3. other libraries that are dynamically loaded.
    4. your desired application(s).

Understanding dependency errors

Learn to recognize error messages from module load when a required module has not been loaded:

Example: A typical error message will look like:

$ module load openmpi/1.10
openmpi/1.10/intel-16/1.10.2-1(27):ERROR:151: Module 'openmpi/1.10/intel-16/1.10.2-1' depends on one of the module(s) 'intel/16/16.0.2 intel/16/16.0.1-2 intel/16/16.0.0-3 intel/16/16.0.0-1 intel/16/16.0.0-0'
openmpi/1.10/intel-16/1.10.2-1(27):ERROR:102: Tcl command execution failed: prereq intel/16
The message can be read as follows:
  • The missing prerequisite is named at the end of the last line, after "prereq" (here intel/16).
  • The modules that would currently satisfy the requirement are listed in quotes at the end of the preceding line, after "depends on one of the module(s)".
  • The full name of the "offending module", deduced from a possibly abbreviated name on the command line, appears at the start of each line (here openmpi/1.10/intel-16/1.10.2-1).
  • You can inspect the prerequisites of a module in a more succinct manner:
$ module show openmpi/1.10 2>&1 | grep req
prereq	 intel/16
The sequence 2>&1 is necessary so the pipe | captures the entire output of the module show command.
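
To pull just the missing prerequisite out of an error message like the one above, a one-line filter can help. This is a sketch with a hypothetical function name, missing_prereq, and it assumes the "ERROR:102" line has exactly the wording shown in the example. On Carbon it would be fed with "module load openmpi/1.10 2>&1 | missing_prereq"; here it is demonstrated on the sample line itself.

```shell
#!/bin/bash
# Hypothetical filter: print the prerequisite named on the
# "Tcl command execution failed: prereq ..." line of a module error.
missing_prereq() {
    sed -n 's/^.*ERROR:102: Tcl command execution failed: prereq[[:space:]]*//p'
}

# Demonstration on the sample message from this section
printf '%s\n' \
    "openmpi/1.10/intel-16/1.10.2-1(27):ERROR:102: Tcl command execution failed: prereq intel/16" \
    | missing_prereq
```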

Effect on PBS job submissions

Loading modules in job files

  • You may now safely load modules in PBS job files when using recent MPI modules, both in scheme 1 and scheme 2. Previously, this was not recommended.
Recent builds of OpenMPI (1.4 and 1.10) and Intel MPI now have support compiled in to properly start processes on remote nodes.
  • However, best practice is still to load all modules in dotfiles under your home directory.
This will always give you the same applications on both login and compute nodes. Place module commands in job files only when conflicts arise, such as when two of your regularly-used applications require different MPI flavors.

Job routing by operating system

  • TORQUE/PBS jobs that are submitted from a node running CentOS-5 or CentOS-6 will normally be routed to run only on nodes that run the same OS release.
  • Find the eligible OS in the qstat -f output:
$ qstat -f jobnumber | grep opsys
   Resource_List.opsys = el5
  • You may override the automatic selection prior to submission by adding an opsys job resource:
#PBS -l opsys=el5

or:

#PBS -l opsys=el6
  • In a pinch, you may even change the OS request of a queued job by using the qalter command, e.g.:
qalter -l opsys=el6 jobnumber

Using multiple MPI flavors

  • Different MPI flavors can, with caution, be loaded at the same time. This may be necessary because the system is less homogeneous than in the past and no longer uses a single "one true" MPI implementation.
  • When modules of multiple MPI flavors are loaded, call the appropriate MPI commands by a full path specified via the MODULENAME_HOME environment variable that is set by each module.

Example: In a job file that is to run 2 applications that were compiled with different MPI flavors, write:

$OPENMPI_HOME/bin/mpirun app1_name
$IMPI_HOME/bin/mpirun app2_name
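
If one of the MPI modules is not loaded, its MODULENAME_HOME variable is empty and the path above silently becomes /bin/mpirun. A job file can guard against that, as in this sketch. The function mpi_launcher is hypothetical, and the demonstration only prints the command it would run rather than starting an application.

```shell
#!/bin/bash
# Hypothetical helper: build the full path to mpirun from the value of a
# MODULENAME_HOME variable, or fail with a hint if the value is empty.
# Arguments: variable name (for the message) and the variable's value.
mpi_launcher() {
    local label="$1" home="$2"
    if [ -z "$home" ]; then
        echo "error: $label is not set; is the MPI module loaded?" >&2
        return 1
    fi
    echo "$home/bin/mpirun"
}

# Demonstration: print instead of run; app1_name is the placeholder used above
if launcher=$(mpi_launcher OPENMPI_HOME "${OPENMPI_HOME:-}"); then
    echo "would run: $launcher app1_name"
fi
```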

Changes requiring attention

Available modules differ between naming schemes

  • The previous module naming scheme 1 is being retired, along with some of its attendant conventions.
  • Newer applications will primarily be compiled and installed on the newer OS release and in naming scheme 2. Some applications may turn out to be backwards-compatible with a previous OS release, and will be made available there as well, in scheme 2, to appropriately offer applications that run on both or only a specific release of the operating system.
  • Only a subset of modules from scheme 1 has been ported to scheme 2, typically the modules representing the most recent version of an application.

Load all modules yourself

Under scheme 2, you must yourself load all desired modules, particularly compiler and MPI modules, in your shell setup files or in job files (see section below), in suitable order. Modules not depending on others must be loaded first.

This is born of necessity because still useful older applications were compiled with older MPI flavors and versions (typically OpenMPI-1.4) which partially interfere with newer flavors (OpenMPI-1.8, 1.10, or Intel-MPI-5.x). In particular, each MPI flavor provides commands like mpirun and mpifort, and special care is needed to run the correct one if your chosen module set spans different MPI flavors.

No modules are pre-loaded

(**) Technically, the system does load the module profile/user for you. This module only contains the instructions to select and read the appropriate .modules-* file.
  • No modules are pre-loaded by the system**.
Previously, the Intel compilers, the Intel Math Kernel Library, and OpenMPI were pre-loaded for you.

No recursive loading

A module under scheme 2 does not implicitly load other modules that it might depend on, such as modules for compilers, an MPI flavor, or specialized libraries. Previously, this was the case for some popular modules but with the system maturing and diversifying, unexpected consequences can occur easily.

You must load dependent modules yourself. While this may be a minor burden for you at first, your selections should become easier to understand now and easier to adapt later.

See HPC/Modules Best Practices#Load dependent modules first.

Determining prerequisites and load order

To see if a particular application module (such as VASP or Quantum-ESPRESSO) has any prerequisites, inspect the output of module show name, and look for any prereq statements.

Then edit your .modules-* file, load the prereq modules first, followed by the desired application module. Example:

  • Let's load a vasp5 module that uses the Intel-MPI flavor, named "impi":
$ module avail vasp5
------------------------------------------------------------ /opt/apps/M/x86_64/EL ------------------------------------------------------------
vasp5/5.3/openmpi-1.4/intel/5.3.2-mkl-beef-1    vasp5/5.4/impi-5/intel-16/5.4.1.3-6
vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-3       vasp5/5.4/openmpi-1.10/intel-16/5.4.1.3-6
vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-cellz-1
  • Let's try inspecting what it needs:
$ module show vasp5/5.4
-------------------------------------------------------------------
/opt/apps/M/x86_64/EL/vasp5/5.4/openmpi-1.10/intel-16/5.4.1.3-6: 
…
  • Careful: The first output line shows the full file name of the module that would get loaded by the short name. In this case, the abbreviated module name, having no MPI name component, yields a module that uses a different MPI flavor than you want.
  • You will need to be more explicit:
$ module show vasp5/5.4/impi-5
-------------------------------------------------------------------
/opt/apps/M/x86_64/EL/vasp5/5.4/impi-5/intel-16/5.4.1.3-6:

module-whatis	 VASP - Vienna Ab-initio Simulation Package 
conflict	 vasp 
conflict	 vasp-vtst 
prereq	 intel/16
prereq	 impi/5
setenv		 VASP5_HOME /opt/apps/vasp5/5.4.1.3-6-impi-5-intel-16 
prepend-path	 PATH /opt/apps/vasp5/5.4.1.3-6-impi-5-intel-16/bin 
setenv		 VASP_COMMAND vasp-ase 
setenv		 VASP_PP_PATH /opt/soft/vasp-pot/ase 
-------------------------------------------------------------------
  • Therefore, you'd need to add the following lines to your .modules-el6 file:
module load intel/16
module load impi/5
module load vasp5/5.4/impi-5
Expert Tip
The grep command does not work as usual on module show … because of the way module needs to operate. To make grep work, combine the stdout and stderr streams using, in bash, the |& characters to form the pipe:
$ module show vasp5/5.4/impi-5 |& grep prereq
prereq	 intel/16
prereq	 impi/5
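
Going one step further, the prereq lines can be turned directly into the module load lines for a .modules-* file. The sketch below uses a hypothetical function load_lines and makes two assumptions: each prereq line names a single module, and the order of prereq lines in the module file is a suitable load order. It is demonstrated on sample output; on Carbon the input would come from "module show vasp5/5.4/impi-5 |& load_lines vasp5/5.4/impi-5".

```shell
#!/bin/bash
# Hypothetical helper: read "module show" output on stdin and print
# "module load" lines for each prereq, followed by the module itself.
# Argument: the module name to load last.
load_lines() {
    local module_name="$1"
    awk '$1 == "prereq" { print "module load " $2 }'
    echo "module load $module_name"
}

# Demonstration on lines from the module show output above
printf '%s\n' \
    'conflict	 vasp-vtst' \
    'prereq	 intel/16' \
    'prereq	 impi/5' \
    'setenv		 VASP_COMMAND vasp-ase' | load_lines vasp5/5.4/impi-5
```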

Minor changes for the module command

Determining default module versions

To determine which module will be loaded when an abbreviated name is used, inspect the first relevant line in the output of one of these commands:

module show name
module help name

The reason is twofold:

  • The module avail command under CentOS-6 no longer issues the marker "(default)" when set for a particular module (which is done administratively using a .version file). I am not sure if this is a bug or by design, but the change makes the output more consistent.
  • On the older CentOS-5 system the module command honors .version files only for the last component of the module. This may lead to different module versions being selected on different systems even when the list of available modules is identical. (Side note: This is a possibly fortuitous bug since openmpi-1.4, used on CentOS-5, sorts after openmpi-1.10.)
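
The "first relevant line" of module show is the one that holds the module file's path and ends in a colon. A small filter can extract it, as sketched here with a hypothetical function resolved_modulefile. It assumes the output layout shown in this document, and is demonstrated on sample lines; on Carbon the input would come from "module show vasp5/5.4 2>&1 | resolved_modulefile".

```shell
#!/bin/bash
# Hypothetical filter: print the first line that looks like a module file
# path (starts with "/", ends with ":"), without the trailing colon.
resolved_modulefile() {
    awk '/^\/.*:[ \t]*$/ { sub(/:[ \t]*$/, ""); print; exit }'
}

# Demonstration on the sample output for "module show vasp5/5.4"
printf '%s\n' \
    '-------------------------------------------------------------------' \
    '/opt/apps/M/x86_64/EL/vasp5/5.4/openmpi-1.10/intel-16/5.4.1.3-6: ' \
    'prereq	 intel/16' | resolved_modulefile
```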

Name completion on command line

When working interactively in a terminal, you can use the "Tab completion" feature of the Bash shell to complete a partially typed module name and show all names available for the name typed so far.

The feature works as follows. At a shell prompt (shown as "$"), type:

$ module load fft

Press the <TAB> key and the name will be expanded to fftw3/3.3/, and you'll see all possible completing names, with the cursor waiting at the end of the longest common substring:

$ module load fftw3/3.3/_
fftw3/3.3/impi-5/intel-16/3.3.4-10        fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
fftw3/3.3/intel/3.3.2-1                   fftw3/3.3/openmpi-1.4/intel/3.3.2-4

Type the letter o, hit the <TAB> key again. The choices will be narrowed down to OpenMPI.

$ module load fftw3/3.3/openmpi-1.<TAB>
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11  fftw3/3.3/openmpi-1.4/intel/3.3.2-4

Typing the digit 1 will pick the 1.10 version, at which point the then remaining single module name choice will be completed all the way, with the cursor waiting after an additional space character:

$ module load fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11 _

"module purge" command

Previously on Carbon it was difficult to reset the module selection during an interactive terminal session, because the commands for the job queueing system, like qsub, were provided via a module. You may now safely use the module "purge" command for its intended purpose, as

module purge

followed by module load … to choose compilers, MPI flavors, and applications.

Expert Tip: Purge and reload.

You can re-load the customizations from your .modules-* files using the module profile:

module purge
module load profile