HPC/Module naming scheme 2016
Introduction
(*) Environment modules are the means by which software applications are made available to users. Using modules allows users and administrators to pick, by user or even by compute job, a desired or default version of an application from typically several current and historic versions persisting on the system. Older versions are kept to improve reproducibility of results, an important characteristic of scientific computing.
On Carbon, the environment modules* system has changed in the following aspects, all explained further in this document:
- The naming scheme is more developed and more versatile.
- Default and dependent modules are no longer being loaded.
- The module command behaves in slightly different ways.
Motivation
The changes were necessary because of increasing diversity and interdependence of applications, their modules, and the underlying operating system. The goal was to accommodate different compilers, MPI flavors, and (in the future) different aspects of the machine architecture like CPU generation, capabilities, and coprocessor facilities.
Further, across different OS releases, the new scheme allows existing application versions to remain available where possible, and new application versions to be offered where suitable, either on both old and newer OS releases or on only one.
Naming scheme (Nomenclature)
The general module naming scheme is as follows:
- The full name of a module has several components that are separated by a slash (/).
- The first and last components of the module name are formed, respectively, by the application's main name, typically as provided by its author, and by its version along with a build number or identifier that is local to Carbon.
- Other name components may be present in-between and make apparent to the user which set of major tools was used to produce the application locally, which usually translates to which modules must be loaded to run the application.
In detail, module names have one of the following forms and typical use cases:
name/api/version-build                 # binary packages, compilers
name/api/compiler/version-build        # compiled applications
name/api/mpi/compiler/version-build    # compiled applications that use MPI
- name is the package's name as chosen on Carbon. It is usually the name given by the software's author, but lowercased for consistency, and it may contain a number if conventionally so named by the author, e.g. fftw3.
- api is the leading part or parts of the package's version number, which typically signifies to suitable precision the API level across which different package versions are expected to be compatible (interchangeable in terms of features). api is typically a sole major version number, or has the form major.minor. You may load a module that has the full name foo/m.n/compiler/version-build by the abbreviated name foo/m.n/compiler, which enables you to select the features and binary compatibility level that you need without having to give a complete name all the way down to a build number.
- compiler is a name component that is present when an application was compiled here and thus usually needs runtime libraries associated with the compiler used. The compiler name component is not strictly needed for applications that are statically linked, but is usually present even then for informative purposes. Conversely, the name component is typically not present for applications installed from binary distribution packages, notably commercial applications and, naturally, compilers themselves.
- mpi, present when needed, denotes the MPI flavor in use for parallel computations.
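The component rules above can be illustrated with a small shell sketch. The function name describe_module is hypothetical and not part of the module command; it only splits a scheme-2 name on slashes and labels the parts according to the three forms listed earlier.

```shell
# describe_module: hypothetical helper that labels the components of a
# scheme-2 module name. It relies only on the three forms described above.
describe_module() {
    full=$1
    old_ifs=$IFS
    IFS=/
    set -- $full          # split the name on "/" into positional parameters
    IFS=$old_ifs
    case $# in
        3) echo "name=$1 api=$2 version-build=$3" ;;
        4) echo "name=$1 api=$2 compiler=$3 version-build=$4" ;;
        5) echo "name=$1 api=$2 mpi=$3 compiler=$4 version-build=$5" ;;
        *) echo "not a scheme-2 module name: $full" >&2
           return 1 ;;
    esac
}

describe_module fftw3/3.3/impi-5/intel-16/3.3.4-10
describe_module fftw3/3.3/intel/3.3.2-1
```

A name in the older name/version-build form has only two components, so the sketch rejects it with a non-zero return status.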
For reference and contrast, the previous scheme has been the simpler but more opaque:
name/version-build
Example: Names for the FFTW3 library module in the old and new naming schemes, queried by the module avail shell command:
Scheme 2 (current):
$ module -t avail fftw3
/opt/apps/M/x86_64/EL6:
/opt/apps/M/x86_64/EL:
fftw3/3.3/impi-5/intel-16/3.3.4-10          # uses Intel-MPI
fftw3/3.3/intel/3.3.2-1                     # legacy, serial only
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11    # uses OpenMPI
fftw3/3.3/openmpi-1.4/intel/3.3.2-4         # legacy, OpenMPI
/usr/share/Modules/modulefiles:
/etc/modulefiles:

Scheme 1 (being retired):
$ module -t avail fftw3
/opt/soft/modulefiles:
fftw3/3.2.1-1
fftw3/3.2.2-1
fftw3/3.3-1
fftw3/3.3.2-1
fftw3/3.3.2-4
/opt/intel/modulefiles:
/usr/share/Modules/modulefiles:
/etc/modulefiles:
- Note the MPI flavor and the compiler name components compared to the older naming scheme. (Boldface shown here for illustration only, your output will appear all as regular text.)
- The -t option of module avail shows the output in "terse" form, one entry per line.
- Lines ending in : indicate file system directories in which modules are located on the current node.
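Because directory lines in the terse listing end in a colon, they are easy to filter out when only module names are wanted. The following sketch works on a sample listing copied from the scheme-2 example; the function names sample_listing and module_names_only are hypothetical. On a live system the input would come from module -t avail instead, and since many Environment Modules versions print that listing on standard error, a redirection such as 2>&1 may be needed there.

```shell
# sample_listing: stand-in for the output of "module -t avail fftw3",
# copied from the scheme-2 example (annotations omitted).
sample_listing() {
cat <<'EOF'
/opt/apps/M/x86_64/EL6:
/opt/apps/M/x86_64/EL:
fftw3/3.3/impi-5/intel-16/3.3.4-10
fftw3/3.3/intel/3.3.2-1
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
fftw3/3.3/openmpi-1.4/intel/3.3.2-4
/usr/share/Modules/modulefiles:
/etc/modulefiles:
EOF
}

# module_names_only: drop the directory lines (those ending in ":").
module_names_only() {
    grep -v ':$'
}

sample_listing | module_names_only
```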
Watch out: Changes requiring your attention
The nitty-gritty details of the changes are listed separately.
Migration guide – How to customize your module selection by OS release
When adapting your existing module commands, you could continue using only the ~/.bashrc file, which customarily held these commands.
However, it is recommended to break out your module selection into files that are specific to OS release and module naming scheme.
Later on, having separate files will make the scope of changes more obvious.
- To migrate your existing configuration, use a helper utility to get you started:
modules-migrate
- The utility will manage the following files but does not change any module names – you must afterwards inspect and edit these files:
.bashrc .modules-1 .modules-el6
- See an example output of running the helper utility.
- To use scheme 2 on CentOS-5 machines, manually copy or create:
.modules-el5
- Use a text editor of your choice, such as nano or vi, to re-examine and edit these files.
- Test your module choices as shown in the next section.
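Before editing, it can help to see which of the files named above are actually present. The sketch below is only a convenience check; list_module_files is a hypothetical helper, not something the migration utility provides, and it assumes the files live directly in your home directory.

```shell
# list_module_files: hypothetical helper that reports which of the module
# selection files named in this guide exist in a directory (default: $HOME).
list_module_files() {
    dir=${1:-$HOME}
    for f in .bashrc .modules-1 .modules-el5 .modules-el6; do
        if [ -f "$dir/$f" ]; then
            echo "found:   $f"
        else
            echo "missing: $f"
        fi
    done
}

list_module_files
```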
Test your module choices
Interactive test
Same node | EL5 node | EL6 node |
---|---|---|
bash -l | ssh clogin5 | ssh clogin7 |
module list | module list | module list |
exit | exit | exit |
To test your new module configuration:
- Open another login shell on the current or another node.
- Review error messages that might appear before your prompt.
- Inspect which modules are loaded.
- Edit your .modules-* files to mitigate any errors.
- Close the testing shell and repeat until your desired modules are loaded without errors.
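As a complement to reading the module list output by eye, the check can be scripted. This sketch assumes that your Environment Modules installation maintains the colon-separated LOADEDMODULES variable, which is common but worth confirming on your system; missing_modules is a hypothetical helper. It accepts abbreviated names, so foo/m.n/compiler matches a loaded foo/m.n/compiler/version-build.

```shell
# missing_modules: hypothetical helper that prints each wanted module that
# does not appear in LOADEDMODULES (assumed to be a colon-separated list).
missing_modules() {
    for want in "$@"; do
        case ":${LOADEDMODULES:-}:" in
            *":$want:"* | *":$want/"*) ;;   # loaded by full or abbreviated name
            *) echo "$want" ;;              # not loaded
        esac
    done
}

# Run inside the testing shell; prints nothing if the module is loaded.
missing_modules fftw3/3.3/openmpi-1.10/intel-16
```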
Test in a job file
Your module selection is likely most important in a PBS job file. Avoid the hassle of extended wait times for production jobs by using test jobs with a short walltime limit and job bodies with merely diagnostic commands. Use the module list and type shell commands to verify that all your modules are loaded and to determine if an application is properly callable without full paths.
- Example
Consider the following job file modtest.sh:
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=0:1:00
#PBS -N modtest
module list
type vasp_gam
Submit the job:
qsub modtest.sh
Alternatively, do the whole thing on the command line, without the need for a separate file:
echo "module list; type vasp_gam" | qsub -l nodes=1:ppn=1,walltime=0:1:00 -N modtest
In either case, wait for the job to finish, then inspect the output files:
qstat jobnumber
…
head modtest.[eo]1234*
You should see something like:
vasp_gam is /opt/apps/vasp5/5.4.1.3-6-openmpi-1.10-intel-16/bin/vasp_gam
An error looks like:
-bash: type: vasp_gam: not found
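If you would rather have the test job fail visibly than scan its output files, the type check can be wrapped so that it sets an exit status. This is a minimal sketch; check_callable is a hypothetical helper, and vasp_gam is used only because it is the command from the example above.

```shell
# check_callable: hypothetical helper that runs "type" on each command and
# returns non-zero if any of them cannot be found on the current PATH.
check_callable() {
    status=0
    for cmd in "$@"; do
        if ! type "$cmd"; then
            status=1
        fi
    done
    return $status
}

# In a job body this would typically follow "module list".
check_callable vasp_gam || echo "vasp_gam is not callable; check your module selection" >&2
```

In a real job file, replacing the echo with exit 1 would make the job end with a non-zero exit status when the command is missing.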