HPC/Module naming scheme 2016

From CNM Wiki

Revision as of 21:23, March 22, 2016

Introduction

(*) Environment modules are the means by which software applications are made available to users. Using modules allows users and administrators to pick, by user or even by compute job, a desired or default version of an application from typically several current and historic versions persisting on the system. Older versions are kept to improve reproducibility of results, an important characteristic of scientific computing.

On Carbon, the environment modules* system has changed in the following aspects:

  • the naming scheme is more developed,
  • default and dependent modules are no longer being loaded,
  • commands behave in slightly different and better ways.

The changes were necessary because of increasing diversity and interdependence of applications, their modules, and the underlying operating system. The goal was to accommodate different compilers, MPI flavors, and (for future evolution) different aspects of the machine architecture like CPU generation, capabilities, and coprocessor facilities. Further, across different OS releases the new scheme allows existing application versions to continue functioning where possible, and makes new application versions available where suitable, either on both old and newer OS releases or on only one of them.

The naming scheme and usage conventions for environment modules on Carbon are as follows.

Naming scheme (Nomenclature)

The full name of a module has several name components that are separated by a slash /. The application's main name and the author-provided version, along with a local build number, form the first and last component of the module name, respectively. Other name components, if present, make apparent to the user which set of major tools was used to produce the application locally, which usually translates to which modules are needed to run the application.

In detail, module names have one of the following forms:

name/api/version-build			# binary packages, compilers
name/api/compiler/version-build		# compiled applications
name/api/mpi/compiler/version-build	# compiled applications that use MPI
  • name is the package's name as chosen on Carbon. It is usually all lowercase and may contain a number if so named by the software author, e.g. fftw3.
  • api is the leading part of the package version, which typically signifies the API level across which different versions are expected to be compatible. It is typically a sole major version number, or has the form major.minor. Loading a module that has the full name foo/m.n/compiler/version-build by the abbreviated name foo/m.n lets you select the features and binary compatibility level that you need without having to give a complete module name all the way down to a build number.
  • compiler is present only as needed, when an application was compiled here and needs runtime libraries associated with the compiler used. The compiler name component is typically not present for applications installed from binary packages, notably commercial applications and, naturally, compilers themselves.
  • mpi, also present only as needed, denotes the MPI flavor in use for parallel computations.
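
The three forms can be told apart by counting the slash-separated components. The following is a minimal Bash sketch, not part of the Carbon tooling: the function name describe_module is made up for illustration, and it assumes that a complete module name is given rather than an abbreviation.

```shell
#!/bin/bash
# Hypothetical helper: split a full scheme-2 module name into its components.
# Assumes a complete name, i.e. one that ends in version-build.
describe_module() {
    local IFS=/
    local parts
    read -r -a parts <<< "$1"
    case ${#parts[@]} in
        3) echo "name=${parts[0]} api=${parts[1]} version-build=${parts[2]}" ;;
        4) echo "name=${parts[0]} api=${parts[1]} compiler=${parts[2]} version-build=${parts[3]}" ;;
        5) echo "name=${parts[0]} api=${parts[1]} mpi=${parts[2]} compiler=${parts[3]} version-build=${parts[4]}" ;;
        *) echo "not a full scheme-2 module name: $1" >&2; return 1 ;;
    esac
}

describe_module fftw3/3.3/openmpi-1.10/intel/3.3.4-11
describe_module fftw3/3.3/intel/3.3.2-1
```

An abbreviated name such as fftw3/3.3 has fewer than three components and is rejected by this sketch, because its parts cannot be labeled unambiguously without consulting the module catalog.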

For reference, the previous scheme has been as follows:

name/version-build

Example: Names for the FFTW3 library module in the old and new naming schemes, queried by the module avail shell command:

Scheme 2 (current):

$ module -t avail fftw3
/opt/apps/M/x86_64/EL6:
/opt/apps/M/x86_64/EL:
fftw3/3.3/impi-5/intel/3.3.4-10		# uses Intel-MPI
fftw3/3.3/intel/3.3.2-1			# legacy, serial only
fftw3/3.3/openmpi-1.10/intel/3.3.4-11	# uses OpenMPI
fftw3/3.3/openmpi-1.4/intel/3.3.2-4	# legacy, OpenMPI
/usr/share/Modules/modulefiles:
/etc/modulefiles:

Scheme 1 (being retired):

$ module -t avail fftw3
/opt/soft/modulefiles:
fftw3/3.2.1-1
fftw3/3.2.2-1
fftw3/3.3-1
fftw3/3.3.2-1
fftw3/3.3.2-4
/opt/intel/modulefiles:
/usr/share/Modules/modulefiles:
/etc/modulefiles:

The -t option of module avail shows the output in "terse" form, one entry per line. Lines ending in ":" are the search directories traversed, as defined by module use dirname statements.


You may need to adapt the module names that you placed in your shell startup and job files to the new more hierarchical scheme. For most modules, with exceptions shown below, the leading name component (the part before any "/") is the same in the old and new naming schemes. What always differs are the name parts after the first slash.

Name change rules for most modules

  • To use the latest or automatically selected version of a package, remove version numbers from old-style module names of the form packagename/version, leaving only packagename. This is the recommended approach, as you will automatically benefit from future updates and maintenance builds.
  • To insist on a specific version for a package in new style names:
    • Inspect the available flavors and versions (some older modules were not migrated):
      module avail packagename.
    • Choose the new-style name up to the desired specificity. You may leave out trailing name or directory parts.
      For instance, instead of vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-3 you may write vasp5/5.3/openmpi-1.4 or vasp5/5.3, letting the system pick whichever MPI flavor and compiler are set as defaults at that time.
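
Since trailing parts may be left out, each leading prefix of a full name is a candidate abbreviation. The Bash sketch below lists them for the example name; the function name module_abbreviations is hypothetical, and which concrete module a given prefix resolves to depends on the defaults in effect on the system.

```shell
#!/bin/bash
# Hypothetical helper: print every abbreviation of a full module name,
# from the most specific form down to the bare package name.
module_abbreviations() {
    local name=$1
    echo "$name"
    while [[ $name == */* ]]; do
        name=${name%/*}     # drop the last slash-separated component
        echo "$name"
    done
}

module_abbreviations vasp5/5.3/openmpi-1.4/intel/5.3.3p3-mkl-3
```

The shorter the form you choose, the more of the selection is left to the system defaults.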

Name change exceptions

For the following modules the newer naming convention allowed for and thus uses more consistent names:

OLD		NEW
-------------------------------------
asap3		asap/3

ase2		ase/2
ase3		ase/3

g09		gaussian/09
GaussView	gaussview  (lowercase)
The modules fftw3 and vasp5 did not change name due to more entrenched usage in the package itself, Unix group names, and compilation dependencies.
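
The general rule and the exceptions can be combined into one small lookup. This is only a sketch in Bash under the assumption that an old-style name has the form name/version-build; the function name new_style_name is made up here and is not a Carbon utility.

```shell
#!/bin/bash
# Hypothetical helper: map an old-style module name to a new-style name
# without a specific version, applying the renaming exceptions listed above.
new_style_name() {
    local base=${1%%/*}      # keep only the part before the first slash
    case $base in
        asap3)     echo "asap/3" ;;
        ase2)      echo "ase/2" ;;
        ase3)      echo "ase/3" ;;
        g09)       echo "gaussian/09" ;;
        GaussView) echo "gaussview" ;;
        *)         echo "$base" ;;   # unchanged, e.g. fftw3 and vasp5
    esac
}

new_style_name fftw3/3.3.2-4
new_style_name g09
```

Check the result against module avail before relying on it, since some older modules were not migrated.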

Active naming scheme varies by operating system

During a transition period, nodes with different operating systems will coexist in the cluster. Each user, nonetheless, will see the same networked home directory on each node regardless of the OS that the node runs. Therefore, the user's shell and module configuration files (like .bashrc and others) will be interpreted by system utilities, end user applications, and in runtime environments that all differ by operating system, to a varying degree.

Solution

The module scheme that will be used on a node is primarily determined by its operating system, secondarily by the user:

  • On nodes running CentOS-6, both login and compute, scheme 2 is used always.
  • On CentOS-5 nodes, scheme 1 is normally used. A user can customize or upgrade to scheme 2 by creating configuration files which will be read within the scheme they activate.

Caveats

  • The home directory is the same across nodes, and therefore a user's configuration scripts are sensitive to differences between OS releases.
  • Applications will typically be compiled and installed on a node that runs the more recent OS release. Some compiled applications are backwards-compatible with a previous OS release, and will be made available there. If we find in practice that this fails, the respective module will be moved such that it is visible only on the suitable OS release. Scheme 2 will appropriately offer applications that run on both or only a specific release of the operating system.

Scheme selection rules

You have files         Remark                               CentOS-6 uses:                        CentOS-5 uses:
(.bashrc and …)                                             files                   module names  files                   module names
---------------------------------------------------------------------------------------------------------------------------------------
(none)                 Starting situation.                  .bashrc only            scheme 2      .bashrc only            scheme 1
.modules-2             Switch over, recommended.            .modules-2 and .bashrc  scheme 2      .modules-2 and .bashrc  scheme 2
.modules-1             Only recommended during transition.  .bashrc only            scheme 2      .modules-1 and .bashrc  scheme 1
.modules-1 .modules-2  For advanced users.                  .modules-2 and .bashrc  scheme 2      .modules-1 and .bashrc  scheme 1
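
The selection rules can also be read as a short decision procedure. The Bash sketch below is an illustration only, not the code that Carbon runs at login: the function name select_scheme and its arguments (the CentOS major release and the directory holding the dot files) are assumptions made for this example.

```shell
#!/bin/bash
# Hypothetical helper: report which naming scheme and files apply, given the
# CentOS major release (5 or 6) and the directory holding the dot files.
select_scheme() {
    local release=$1 home=$2
    if [[ $release == 6 ]]; then
        # CentOS-6 always uses scheme 2; .modules-1 is not read there.
        if [[ -e $home/.modules-2 ]]; then
            echo "scheme 2: .modules-2 and .bashrc"
        else
            echo "scheme 2: .bashrc only"
        fi
    elif [[ -e $home/.modules-1 ]]; then
        # On CentOS-5, .modules-1 takes precedence over .modules-2.
        echo "scheme 1: .modules-1 and .bashrc"
    elif [[ -e $home/.modules-2 ]]; then
        echo "scheme 2: .modules-2 and .bashrc"
    else
        echo "scheme 1: .bashrc only"
    fi
}

select_scheme 5 "$HOME"
select_scheme 6 "$HOME"
```

The output depends on which of the two files exist in your home directory.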

Available modules differ between naming schemes

  • Newer modules will only be made available in scheme 2.
  • A subset of modules from scheme 1 have been ported to scheme 2, typically only modules representing the most recent version of an application.
  • The previous module naming scheme 1 is being retired, along with some of its attendant conventions.

No default modules loaded by the system

  • Users must load all applicable modules in their shell setup files or job files, in suitable order.
This is necessary because older applications that are still useful were compiled with older MPI versions (typically OpenMPI-1.4), which partially interfere with newer versions (OpenMPI-1.8, 1.10, or Intel-MPI-5.x). In particular, commands like mpirun and mpifort are typically provided by each MPI flavor.

Modules do not load dependencies

  • A module does not implicitly load other modules that it might depend on, such as modules for compilers, an MPI flavor, or specialized libraries. While a minor burden for the user to specify, this will make operations more explicit and predictable.
  • Learn to recognize error messages issued by module load when a dependent module is found missing. For instance:
$ module load openmpi/1.10
openmpi/1.10/intel-16/1.10.2-1(27):ERROR:151: Module 'openmpi/1.10/intel-16/1.10.2-1' depends on one of the module(s) 'intel/16/16.0.2 intel/16/16.0.1-2 intel/16/16.0.0-3 intel/16/16.0.0-1 intel/16/16.0.0-0'
openmpi/1.10/intel-16/1.10.2-1(27):ERROR:102: Tcl command execution failed: prereq intel/16
In this message, the last line shows the prerequisite (intel/16) as stated in the text of the module you attempted to load. The first line lists, after "depends on one of the module(s)", the currently available modules that would satisfy the requirement. Both lines begin with the full name of the offending module, as located from the abbreviated name given in the command.
  • You must load prerequisite modules yourself before modules that depend on them, by full or (recommended) abbreviated name. The typical order is: compiler, MPI, other (dynamic) libraries, and finally your intended module. In the example above, a suitable command would be:
module load  intel/16  openmpi/1.10/intel-16
Note how in this case it is preferable to explicitly supply the compiler name component …/intel-16 of the openmpi module to ensure it is consonant with the compiler.
  • Different MPI flavors can (and may have to) be loaded at the same time. In this situation, MPI commands will have to be called by absolute path, e.g. $OPENMPI_HOME/bin/mpirun …
Again, this is because the system is less homogeneous than in the past: it is impractical or even impracticable to upgrade and maintain applications at a single "one true" MPI implementation.
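
To pick the missing prerequisite out of such an error message, a small filter can help. The following Bash sketch assumes the message has the form shown in the example above (a line containing "Tcl command execution failed: prereq"); the function name missing_prereq is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read "module load" error text on standard input and
# print the prerequisite named in the "prereq" line, if there is one.
missing_prereq() {
    sed -n 's/.*Tcl command execution failed: prereq \(.*\)$/\1/p'
}

# Sample line copied from the error message shown above.
echo "openmpi/1.10/intel-16/1.10.2-1(27):ERROR:102: Tcl command execution failed: prereq intel/16" |
    missing_prereq
```

The printed name is what you would load first, for example with module load intel/16, before repeating the original command.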

Migration guide

Or: How to customize your module selection by OS release.

Configure in .bashrc

To switch over to new-style module names entirely, on both CentOS releases, you could continue using only ~/.bashrc. To do this:

  • Tell CentOS-5 nodes to offer you the new-style module catalog instead of the old one. To do so, simply create an empty customization file:
touch ~/.modules-2
  • Edit your .bashrc and replace old-style module names with new-style names. Use a text editor of your choice, such as nano or vi:
vi .bashrc
# or:
nano .bashrc

Configure in dedicated files

It is perhaps cleaner to perform the module selection in files dedicated for each CentOS release, so that you can make adjustments independently.

To migrate your existing configuration, manage the following files:

.bashrc
.modules-1
.modules-2

Use a helper application to get you started:

modules-migrate

See also: HPC/Migration example


Minor changes for the module command

Name completion in command line

When working interactively in a terminal, you can use a "completion" feature of the Bash shell to complete a partially typed module name and show all names available for the name typed so far. For example:

At a shell prompt (shown as "$"), type:

$ module load fft

Press the <TAB> key. The name will be expanded to fftw3/ and you will see the possible completions, with the cursor waiting at the end of the longest common substring:

$ module load fftw3/_
fftw3/3.3/impi-5/intel-16/3.3.4-10        fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11  
fftw3/3.3/intel/3.3.2-1                   fftw3/3.3/openmpi-1.4/intel/3.3.2-4       

Type the letter o and press the <TAB> key again. The choices will be narrowed down to OpenMPI.

$ module load fftw3/3.3/openmpi-1.<TAB>
fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11  fftw3/3.3/openmpi-1.4/intel/3.3.2-4

Typing the digit 1 picks the 1.10 MPI version. The single remaining module name is then completed, with the cursor waiting after an additional space character:

$ module load fftw3/3.3/openmpi-1.10/intel-16/3.3.4-11 _

"module purge" command

Previously, the Carbon queueing system was made available to users by modules, which meant that it was difficult for a user to start with a clean slate, or to make and adjust a custom set of module choices. You may now safely use the module "purge" command for its intended purpose:

module purge

followed by module load … to choose compilers, MPI flavors, and applications.
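
In a job file or shell startup file, the purge and the subsequent loads can be wrapped together. This is a minimal Bash sketch: the function name reset_modules is made up for illustration, the module names are taken from the examples on this page, and the guard skips the call where the module command is not defined.

```shell
#!/bin/bash
# Hypothetical helper: drop all loaded modules, then load the given ones
# in the order listed (compiler first, then MPI, then everything else).
reset_modules() {
    module purge
    module load "$@"
}

# Only run where the module command is defined, e.g. on a cluster node.
if type module >/dev/null 2>&1; then
    reset_modules intel/16 openmpi/1.10/intel-16
fi
```

Because nothing is loaded by default or as a dependency, the argument list states the complete environment of the job.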


Determining default module versions

The module avail command under CentOS-6 no longer includes the marker "(default)" when one has been set in a .version file. I am not sure if this is a bug or by design, but the change certainly makes the output more consistent.

To determine which module will be loaded when an abbreviated name is used, I recommend inspecting the first relevant line in the output of one of these commands:

module show name
module help name