HPC/Submitting and Managing Jobs/Example Job Script

From CNM Wiki

Revision as of 19:14, May 18, 2012

Processing flow of a typical job.

Introduction

A Torque job script is usually a shell script that begins with PBS directives. Directives are comment lines to the shell but are interpreted by Torque, and take the form

#PBS qsub_options

The job script is read by the qsub job submission program, which interprets the directives and places the job in the queue accordingly. Review the qsub man page to learn about the accepted options and the environment variables provided to the job script at execution time.

Note
Place directives only at the beginning of the job script. Torque ignores directives after the first executable statement in the script. Empty lines are allowed, but not recommended. Best practice is to have a single block of directives at the beginning of the file.
Advanced usage
  • The job script may be written in any scripting language, such as Perl or Python. The interpreter is specified in the first script line in Unix hash-bang syntax #!/usr/bin/perl, or using the qsub -S path_list option.
  • The default directive token #PBS can be changed or unset entirely with the qsub -C option; see qsub, sec. Extended Description.
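The placement rule in the note above can be checked with a short shell sketch. It is only a rough approximation of how the leading directive block is delimited, not Torque's actual parser; the function name leading_directives is hypothetical, and the default #PBS token is assumed.

```shell
#!/bin/bash
# Print the #PBS lines that precede the first executable statement.
# Rough approximation for illustration; the real parser may differ in details.
leading_directives() {
    awk '
        /^#PBS[ \t]/       { print; next }  # directive line: report it
        /^#/ || /^[ \t]*$/ { next }         # other comment or blank line: keep scanning
                           { exit }         # first executable statement: stop
    ' "$1"
}

# Demo on a throwaway script (job name and walltime are placeholders).
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/bash
#PBS -N job_name
#PBS -l walltime=0:10:00
cd $PBS_O_WORKDIR
#PBS -m ea
EOF

found=$(leading_directives "$script")
echo "$found"
rm -f "$script"
```

In this demo the trailing "#PBS -m ea" line is not reported, because it follows the cd statement.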

Application-specific job scripts

For most scalar and MPI-based parallel jobs on Carbon, the scripts in the next section will be appropriate. Some applications, however, require customizations, typically copying $TMPDIR to an app-specific variable or calling MPI through app-specific wrapper scripts. Such application-specific custom scripts are located in the root directory of the application under /opt/soft, and can typically be reached as:

$APPNAME_HOME/APPNAME.job

or

$APPNAME_HOME/sample.job

where APPNAME is the module name all UPPERCASED and with "-" (minus) characters replaced by "_" (underscore).

To find variables of this form:

env | grep _HOME
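The naming rule above can be sketched as a small shell function. This is an illustration only: the function name module_home_var and the module name my-app are hypothetical and not part of Carbon's environment.

```shell
#!/bin/bash
# Derive the name of the *_HOME variable from a module name:
# uppercase everything, then replace "-" with "_".
module_home_var() {
    printf '%s_HOME\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '-' '_')"
}

# "my-app" is a made-up module name used only for demonstration.
var=$(module_home_var my-app)
echo "$var"
```

The resulting name can then be looked up in the environment, for example with env | grep "^${var}=".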

Generic job scripts

Here are a few example jobs for the most common tasks. Note how the PBS directives are mostly independent from the type of job, except for the node specification.

OpenMPI, InfiniBand

This is the default user environment. The openmpi, icc, ifort, and mkl modules are preloaded in the system's shell startup files. The openmpi module selects the fast InfiniBand interconnect by means of the environment variable $OMPI_MCA_btl.

#!/bin/bash
#
#  Basics: Number of nodes, processors per node (ppn), and walltime (hhh:mm:ss)
#PBS -l nodes=5:ppn=8
#PBS -l walltime=0:10:00
#PBS -N job_name
#PBS -A account
#
#  File names for stdout and stderr.  If not set here, the defaults
# are <JOBNAME>.o<JOBNUM> and <JOBNAME>.e<JOBNUM>
#PBS -o job.out
#PBS -e job.err
#
#  Send mail at begin, end, abort, or never (b, e, a, n). Default is "a".
#PBS -m ea

# change into the directory from which qsub was executed
cd $PBS_O_WORKDIR

# start MPI job over default interconnect; count allocated cores on the fly.
mpirun -machinefile  $PBS_NODEFILE \
       -np $(wc -l < $PBS_NODEFILE) \
        programname
  • If your program reads from files or takes options and/or arguments, use and adjust one of the following forms:
mpirun -machinefile  $PBS_NODEFILE \
       -np $(wc -l < $PBS_NODEFILE) \
       programname  < run.in
mpirun -machinefile  $PBS_NODEFILE \
       -np $(wc -l < $PBS_NODEFILE) \
       programname  -options arguments < run.in
mpirun -machinefile  $PBS_NODEFILE \
       -np $(wc -l < $PBS_NODEFILE) \
       programname < run.in > run.out 2> run.err
In the last form, anything after programname is optional. If you use specific redirections for stdout or stderr as shown (>, 2>), the job-global files job.out and job.err declared earlier will remain empty or contain only output from your shell startup files (which should really be silent) and from the rest of your job script.
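The core count passed to -np in the forms above comes from the node file, which contains one line per allocated core. The following minimal sketch reproduces that counting on a stand-in file so that it runs outside a job; the node names and the helper names count_cores and count_nodes are made up for illustration.

```shell
#!/bin/bash
# Stand-in for $PBS_NODEFILE: each node appears once per allocated core.
# The node names below are invented for this demo.
nodefile=$(mktemp)
printf '%s\n' n001 n001 n002 n002 > "$nodefile"

# Total cores = number of lines; distinct nodes = number of unique lines.
count_cores() { wc -l < "$1" | tr -d ' '; }
count_nodes() { sort -u "$1" | wc -l | tr -d ' '; }

nprocs=$(count_cores "$nodefile")
nnodes=$(count_nodes "$nodefile")
echo "cores: $nprocs, nodes: $nnodes"
rm -f "$nodefile"
```

Inside a real job, the same functions can be pointed at "$PBS_NODEFILE" instead of the temporary file.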

OpenMPI, Ethernet

To select Ethernet transport, such as for embarrassingly parallel jobs, specify an -mca option:

mpirun -machinefile $PBS_NODEFILE \
       -np $(wc -l < $PBS_NODEFILE) \
       -mca btl self,tcp \
       programname

Intel-MPI

Under Intel-MPI the job script will be:

#!/bin/bash
#PBS ... (same as above)

cd $PBS_O_WORKDIR

mpiexec.hydra \
	-machinefile  $PBS_NODEFILE \
	-np $(wc -l < $PBS_NODEFILE) \
	programname


The account parameter

The parameter for option -A account can take the following forms:

cnm23456
for most jobs, containing your 5-digit proposal number.
user
(the actual string "user", not your user name) for a limited personal startup allocation.
staff
for discretionary access by CNM staff.
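As a quick sanity check before submitting, the three forms above can be matched with a shell pattern. This is a minimal sketch: the helper name valid_account is hypothetical, and it checks only the spelling of the string, not whether the allocation exists or has a balance.

```shell
#!/bin/bash
# Succeed if the argument has one of the three documented shapes of the
# -A value: "cnm" followed by exactly 5 digits, "user", or "staff".
valid_account() {
    case "$1" in
        cnm[0-9][0-9][0-9][0-9][0-9]|user|staff) return 0 ;;
        *) return 1 ;;
    esac
}

# "jdoe" stands for an arbitrary user name, which is not a valid account.
for acct in cnm23456 user staff cnm123 jdoe; do
    if valid_account "$acct"; then
        echo "$acct: ok"
    else
        echo "$acct: rejected"
    fi
done
```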

You can check your account balance in hours as follows:

mybalance -h
gbalance -u $USER -h
The relevant column is Available, accounting for amounts reserved by current jobs and credits.

Timed Allocations

The compute time that Carbon's processors can deliver is a perishable resource. Hence, your allocations are time-restricted in a use-it-or-lose-it manner. This encourages consistent use of the machine throughout allocation cycles.

The current scheme is as follows:

  • Your allocation is provided in three installments.
  • All installments are active from the beginning.
  • Installments expire in a staggered fashion, currently after 4, 8, and 12 months, respectively. The following diagram illustrates this:
Installment
number

       ^
       |============
       |========
       |====
       |------------|------> Time

    proposal     proposal
      start        end

       "-" indicates 1 month.

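The staggered expiry can also be expressed as a small calculation. The sketch below counts how many of the three installments are still active a given number of whole months after the proposal start. It assumes that an installment is already expired at exactly its expiry month; the function name active_installments is hypothetical.

```shell
#!/bin/bash
# Number of installments still active after a given number of whole months.
# Expiry months 4, 8, and 12 are taken from the scheme described above.
# Assumption: an installment counts as expired once its expiry month is reached.
active_installments() {
    local elapsed=$1 active=0 expiry
    for expiry in 4 8 12; do
        if [ "$elapsed" -lt "$expiry" ]; then
            active=$((active + 1))
        fi
    done
    echo "$active"
}

for month in 0 3 4 8 12; do
    echo "month $month: $(active_installments "$month") installment(s) active"
done
```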
Advanced node selection

You can refine the node selection (normally done via the PBS resource -l nodes=…) to finely control your node and core allocations. You may need to do so for the following reasons:

  • select specific node hardware generations,
  • permit shared vs. exclusive node access,
  • vary PPN across the nodes of a job,
  • accommodate multithreading (OpenMP).

See HPC/Submitting Jobs/Advanced node selection for these topics.