HPC/Module Setup

From CNM Wiki
Jump to navigation Jump to search

 Introduction

Carbon uses the Environment Modules package to dynamically provision software. The package primarily modifies your $PATH and other environment variables.

To select packages for your use, place module load name commands near the end of your ~/.bashrc file.

The Shell

The user shell on Carbon is bash.


Note for long-time Unix/Linux users
[t]csh used to be the only vendor-agnostic and widely available shell with decent command line features. Bash is now much better positioned in this area, and offers consistent programming on the command line and in scripts. Therefore, tcsh is now available only on request (to me, stern), if you absolutely, positively, insist, and know what you're doing. (There will be a quiz.) Here are top ten reasons not to use the C shell, and the ever classic “csh programming considered harmful”. Even though using [t]csh merely for interactive purposes may appear tolerable, it is too tempting to set out from tweaking .cshrc into serious programming. Don't do it. It is not supported. It's dead, Jim.

Shell Customizations

Place customizations in your ~/.bashrc file.

# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

export PATH=$HOME/mypackage/bin:$PATH
module load name1 name2 …
  • Among the various bash startup files .bashrc is the one relevant when invoked remotely, such as on MPI child nodes reached by sshd.
  • In general, do not place a "module load …" command in a PBS job script.
If you must, note that this command will only work for single-node jobs. It will generally fail for multi-node jobs (nodes > 1). The reason is that the job script is only executed (by the pbs_mom daemon on your behalf) on the first core on the first node of your request. In general, this environment will be cloned for the other cores on the first node, but not for cores on other nodes. There are flags for mpirun or mpiexec to pass some or all environment variables to other MPI processes, but these flags are implementation-specific and may not work reliably.
  • The recommended way is as above, to place module commands in ~/.bashrc. This might preclude job-specific module sets for conflicting modules or tasks. I'm thinking about a proper solution for this. --stern

Modules – General documentation

  • A general introduction to modules can be found at many other sites, such as:
$ module help
…
  Usage: module [ switches ] [ subcommand ] [subcommand-args ]
…

  Available SubCommands and Args:

+ load		modulefile [modulefile ...]
+ unload	modulefile [modulefile ...]
+ switch	[modulefile1] modulefile2.]
+ list

+ avail		[modulefile [modulefile ...]]
+ whatis	[modulefile [modulefile ...]]
+ help		[modulefile [modulefile ...]]
+ show		modulefile [modulefile ..]

For full documentation, consult the manual page:

$ man module

Module Conventions on Carbon

  • Most application software is installed under /opt/soft
  • Package directories are named name-version-build , e.g. /opt/soft/jmol-12.1.37-1.
  • Module names are organized by a mostly version-less name, with the version following after a slash: name/version-build . Using the name component alone is possible and will select a default version for a SubCommand to act upon. Some packages carry a major version number in their name, notably fftw3 and vasp5.
  • module help briefly describes a package, gives its version number (in the module path) and will usually contain a link to its home page.
$ module help jmol

----------- Module Specific Help for 'jmol/12.0.34-1' -------------

	Jmol is a molecule viewer platform for researchers in chemistry and
	biochemistry, implemented in Java for multi-platform use.  This is the
	standalone application.  It offers high-performance 3D rendering with
	no hardware requirements and supports many popular file formats.

	http://jmol.sourceforge.net/
	http://wiki.jmol.org/
  • Default modules are loaded by /etc/profile.d/zz-moduleuse.sh. (The strange name form ensures this profile segment is loaded last.) They currently are:
$ module list
Currently Loaded Modulefiles:
  1) moab/6.0.3-1              4) icc/11/11.1.073
  2) gold/2.1.11.0-4           5) ifort/11/11.1.073
  3) openmpi/1.4.3-intel11-1   6) mkl/10.2.6.038

Package home directories

A package directory usually contains Unix-style subdirectories for the various files, which the modulefile usually automatically integrates into your user environment by means of standard environment variables.

bin/
the main package executable and associated tools and utility scripts; added to $PATH.
lib/
static and shared libraries; added to $LD_LIBRARY_PATH if it contains shared libs.
man/, share/man/, doc/, share/doc/
man pages (added to $MANPATH) and human-readable documentation.
include/
C header files and other script integration files for using library packages; added to $INCLUDE.
… and others.

Further, most modules will set a convenience variable $NAME_HOME which points to the toplevel directory. Dashes in the package name are converted to underscores. This is mostly useful to inspect documentation and auxiliary files:

$ module load quantum-espresso
 $ ls -F $QUANTUM_ESPRESSO_HOME/doc
 Doc/  atomic_doc/  examples/
… and to specifcy link paths in makefiles:
$ module load fftw3
 LDFLAGS += -L$(FFTW3_HOME)/lib

Package-specific sample jobs

Packages that require more than the standard Carbon job template contain a sample job in the toplevel directory:

module load quantum-espresso
cat $QUANTUM_ESPRESSO_HOME/*.job

which gives:

#!/bin/bash
# Job template for Quantum ESPRESSO 4.x on Carbon
#
#PBS -l nodes=1:ppn=8
#PBS -l walltime=1:00:00
#PBS -N qe_jobname
#PBS -A cnm12345
#
# Output and error log:
#PBS -o job.out
#PBS -e job.err
#
## send mail at begin, end, abort, or never (b, e, a, n):
#PBS -m ea

cd $PBS_O_WORKDIR

# job-specific tmp directories -- each is node-local, wiped on exit
export ESPRESSO_TMPDIR=$TMPDIR

mpirun -x ESPRESSO_TMPDIR \
	-machinefile  $PBS_NODEFILE \
	-np $(wc -l < $PBS_NODEFILE) \
	pw.x \
	< input.txt > output.txt