HPC/Module naming scheme 2016

From CNM Wiki

Introduction

(*) Environment modules are the means by which software applications are made available to users. Using modules allows users and administrators to pick, by user or even by compute job, a desired or default version of an application from typically several current and historic versions persisting on the system. Older versions are kept to improve reproducibility of results, an important characteristic of scientific computing.

On Carbon, the environment modules* system has changed in the following aspects, all explained further in this document:

  • The naming scheme is more developed and more versatile.
  • Default and dependent modules are no longer being loaded.
  • The module command behaves in slightly different ways.

Motivation

The changes were necessary because of increasing diversity and interdependence of applications, their modules, and the underlying operating system. The goal was to accommodate different compilers, MPI flavors, and (in the future) different aspects of the machine architecture like CPU generation, capabilities, and coprocessor facilities.

Further, the new scheme allows existing application versions to remain available across OS releases where possible, and new application versions to be offered where suitable, either on both old and newer OS releases or on only one.

Naming scheme (Nomenclature)

The general module naming scheme is as follows:

  • The full name of a module has several components that are separated by a slash /.
  • The first and last components of the module name are, respectively, the application's main name (typically as provided by its author) and its version, the latter followed by a build number or identifier that is local to Carbon.
  • Other name components may be present in-between and make apparent to the user which set of major tools was used to produce the application locally, which usually translates to which modules must be loaded to run the application.

In detail, module names have one of the following forms and typical use cases:

name/api/version-build			# binary packages, compilers
name/api/compiler/version-build		# compiled applications
name/api/mpi/compiler/version-build	# compiled applications that use MPI
  • name is the package's name as chosen on Carbon. It is usually the name given by the software's author, but lowercased for consistency, and it may contain a number if conventionally so named by the author, e.g. fftw3.
  • api is the leading part or parts of the package's version number, which typically signify, to suitable precision, the API level across which different package versions are expected to be compatible (interchangeable in terms of features). api is typically a sole major version number, or has the form major.minor. You may load a module that has the full name foo/m.n/compiler/version-build by the abbreviated name foo/m.n/compiler, which enables you to select the features and binary compatibility level that you need without having to give a complete name all the way down to a build number.
  • compiler is a name component that is present when an application was compiled here and thus usually needs runtime libraries associated with the compiler used. The compiler name component is not strictly needed for applications that are statically linked, but is usually present even then for informative purposes. Conversely, the name component is typically not present for applications installed from binary distribution packages, notably commercial applications and, naturally, compilers themselves.
  • mpi, present when needed, denotes the MPI flavor in use for parallel computations.
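To make the three forms concrete, here is a minimal sketch that splits a full module name into the components described above. The function name describe_module is hypothetical and not part of the module system; it only counts slash-separated components, so it expects full names rather than abbreviated ones.

```shell
# Hypothetical helper: label the components of a full scheme-2 module name.
# It distinguishes the three documented forms by the number of components.
describe_module() {
    (
        # Split the argument on "/" into positional parameters.
        # The subshell keeps the IFS and noglob changes local.
        IFS=/
        set -f
        set -- $1
        case $# in
            3) printf 'name=%s api=%s version-build=%s\n' "$1" "$2" "$3" ;;
            4) printf 'name=%s api=%s compiler=%s version-build=%s\n' "$1" "$2" "$3" "$4" ;;
            5) printf 'name=%s api=%s mpi=%s compiler=%s version-build=%s\n' "$1" "$2" "$3" "$4" "$5" ;;
            *) printf 'unrecognized module name\n' >&2; exit 1 ;;
        esac
    )
}
```

For example, describe_module fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11 prints name=fftw3 api=3.3 mpi=openmpi-1.10 compiler=intel-16 version-build=3.3.4-11. An old-style name such as fftw3/3.3.2-1 has only two components and is rejected.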

For reference and contrast, the previous scheme has been the simpler but more opaque:

name/version-build

Example: Names for the FFTW3 library module in the old and new naming schemes, queried by the module avail shell command:

Scheme 2 (current):

$ module -t avail fftw3
/opt/apps/M/x86_64/EL6:
/opt/apps/M/x86_64/EL:
fftw3/3.3/impi-5/intel-16/3.3.4-10         # uses Intel-MPI
fftw3/3.3/intel/3.3.2-1                    # legacy, serial only
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11   # uses OpenMPI
fftw3/3.3/openmpi-1.4/intel/3.3.2-4        # legacy, OpenMPI
/usr/share/Modules/modulefiles:
/etc/modulefiles:

Scheme 1 (being retired):

$ module -t avail fftw3
/opt/soft/modulefiles:
fftw3/3.2.1-1
fftw3/3.2.2-1
fftw3/3.3-1
fftw3/3.3.2-1
fftw3/3.3.2-4
/opt/intel/modulefiles:
/usr/share/Modules/modulefiles:
/etc/modulefiles:

  • Note the MPI flavor and compiler name components, which are absent from the older naming scheme. (Boldface is shown here for illustration only; your output will appear entirely as regular text.)
  • The -t option of module avail shows the output in "terse" form, one entry per line.
  • Lines ending in : indicate file system directories in which modules are being located on the current node.
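If only the module names are of interest, the directory lines can be filtered out of the terse listing. The sketch below is one way to do so; the function name module_names_only is hypothetical.

```shell
# Hypothetical filter: read a terse module listing on standard input and
# drop the directory header lines, i.e. those ending in ":".
module_names_only() {
    sed '/:$/d'
}
```

A possible use is module -t avail fftw3 2>&1 | module_names_only. The 2>&1 is there on the assumption that the module command writes its listing to standard error, as it does on many installations; it is harmless if that is not the case.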

Watch out: Changes requiring your attention

See separate page.

Migration guide – How to customize your module selection by OS release

In adapting your existing module selection, you have two choices:

Configure in dedicated files

It is cleanest to perform the module selection in separate files, one for each OS release. This allows you to change files later on with minimal interference.

  • To migrate your existing configuration, use a helper utility to get you started:
modules-migrate

This utility will manage the following files, which you should afterwards inspect and edit:

.bashrc
.modules-1
.modules-2

See example output of running the helper.

Configure in .bashrc

To switch over to new-style module names entirely, on both CentOS releases, you could continue using only ~/.bashrc. To do this:

  • Tell CentOS-5 nodes to offer you the new-style module catalog instead of the old one. To do so, simply create the new-scheme customization file but leave it empty:
touch ~/.modules-2
  • Then, edit ~/.bashrc with a text editor of your choice, such as vi or nano, and apply the naming changes described in the nomenclature section above:
vi ~/.bashrc
# or:
nano ~/.bashrc
  • Test – see next section.
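As a hedged illustration of the second step, the excerpt below shows how one old-style line might be rewritten. The mapping from fftw3/3.3.2-4 to its new-style name is an assumption based on the matching version-build in the listings above. The guard around module load is added here only so that the fragment is harmless where the module command is unavailable. Because dependent modules are no longer loaded automatically, the compiler and MPI modules would need their own module load lines; their names are not given in this document, so they are omitted.

```shell
# ~/.bashrc (excerpt), illustrative only.
#
# Old-style line, to be removed or commented out:
#   module load fftw3/3.3.2-4
#
# New-style replacement (assumed mapping), abbreviated to the compiler
# level so that the default version-build is selected:
if command -v module >/dev/null 2>&1; then
    module load fftw3/3.3/openmpi-1.4/intel
fi
```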

Test your module choices

Same node       EL5 node        EL6 node
bash -l         ssh clogin5     ssh clogin7
module list     module list     module list
exit            exit            exit

To test your new module configuration:

  1. Open another login shell on the current or another node.
  2. Review error messages that might appear before your prompt.
  3. Inspect which modules are loaded.
  4. Edit your .modules-* files to mitigate any errors.
  5. Close the testing shell and repeat.

Minor changes for the module command

Determining default module versions

To determine which module will be loaded when an abbreviated name is used, inspect the first relevant line in the output of one of these commands:

module show name
module help name

The reason is twofold:

  • The module avail command under CentOS-6 no longer issues the marker "(default)" when set for a particular module (which is done administratively using a .version file). I am not sure if this is a bug or by design, but the change makes the output more consistent.
  • On the older CentOS-5 system the module command honors .version files only for the last component of the module. This may lead to different module versions being selected on different systems even when the list of available modules is identical. (Side note: This is a possibly fortuitous bug since openmpi-1.4, used on CentOS-5, sorts after openmpi-1.10.)
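The side note about sort order can be checked with standard tools. The sketch below only illustrates the two orderings for the two name components; it makes no claim about how the module command itself compares names. The -V option is specific to GNU sort.

```shell
# The two MPI name components from the fftw3 listing.
names='openmpi-1.4
openmpi-1.10'

# Plain byte-wise order: "1.10" sorts before "1.4" because "1" < "4",
# so openmpi-1.4 ends up last.
lexical_last=$(printf '%s\n' "$names" | LC_ALL=C sort | tail -n 1)

# Version-aware order (GNU sort -V): 1.4 sorts before 1.10,
# so openmpi-1.10 ends up last.
version_last=$(printf '%s\n' "$names" | sort -V | tail -n 1)

printf 'lexical order picks: %s\n' "$lexical_last"
printf 'version order picks: %s\n' "$version_last"
```

This prints openmpi-1.4 for the lexical order and openmpi-1.10 for the version order, which is why a tool that picks the lexically last name would select the older OpenMPI.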

Name completion on command line

When working interactively in a terminal, you can use the "Tab completion" feature of the Bash shell to complete a partially typed module name and show all names available for the name typed so far.

The feature works as follows. At a shell prompt (shown as "$"), type:

$ module load fft

Press the <TAB> key. The name will be expanded to fftw3/3.3/, and you'll see all possible completions, with the cursor waiting at the end of the longest common prefix:

$ module load fftw3/3.3/_
fftw3/3.3/impi-5/intel-16/3.3.4-10        fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
fftw3/3.3/intel/3.3.2-1                   fftw3/3.3/openmpi-1.4/intel/3.3.2-4

Type the letter o, hit the <TAB> key again. The choices will be narrowed down to OpenMPI.

$ module load fftw3/3.3/openmpi-1.<TAB>
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11  fftw3/3.3/openmpi-1.4/intel/3.3.2-4

Type the digit 1 to pick the 1.10 version and press <TAB> once more. The single remaining module name will then be completed all the way, with the cursor waiting after an additional space character:

$ module load fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11 _
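The narrowing shown above amounts to prefix matching against the available names. The following sketch imitates that behavior on the four fftw3 names from the listing; it is not how the shell's completion is implemented, and the function name complete_module is hypothetical.

```shell
# Candidate names, copied from the fftw3 listing.
candidates='fftw3/3.3/impi-5/intel-16/3.3.4-10
fftw3/3.3/intel/3.3.2-1
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11
fftw3/3.3/openmpi-1.4/intel/3.3.2-4'

# Hypothetical helper: print every candidate that starts with the text
# typed so far (the first argument), one per line.
complete_module() {
    for c in $candidates; do
        case $c in
            "$1"*) printf '%s\n' "$c" ;;
        esac
    done
}
```

Calling complete_module fftw3/3.3/openmpi-1. prints the two OpenMPI names, and complete_module fftw3/3.3/openmpi-1.1 prints only fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11.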

"module purge" command

Previously on Carbon it was difficult to reset the module selection during an interactive terminal session, because the commands for the job queueing system, like qsub, were provided via a module. You may now safely use the module "purge" command for its intended purpose, as

module purge

followed by module load … to choose compilers, MPI flavors, and applications.
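For repeated use in an interactive session, the two steps can be wrapped in a small shell function. This is a sketch only; the function name reset_modules is hypothetical, and stopping at the first failure assumes that the module command reports errors through its exit status, which older versions may not do reliably.

```shell
# Hypothetical helper: clear all loaded modules, then load the modules
# named as arguments, in the order given.
reset_modules() {
    module purge || return
    for m in "$@"; do
        module load "$m" || return
    done
}
```

For example, reset_modules fftw3/3.3/openmpi-1.10/intel-16 purges the current selection and then loads that one module. Since dependent modules are no longer loaded automatically, any compiler and MPI modules that are needed would be passed as additional arguments.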

Expert Tip: Purge and reload.

You can reload the customizations from your .modules-1 or .modules-2 file by loading the module named profile:

module purge
module load profile