HPC/Submitting and Managing Jobs/Example Job Script: Difference between revisions
Revision as of 20:06, January 15, 2012
Introduction
A Torque job script is usually a shell script that begins with PBS directives. Directives are comment lines to the shell but are interpreted by Torque, and take the form
#PBS qsub_options
The job script is read by the qsub job submission program, which interprets the directives and places the job in the queue accordingly. Review the qsub man page to learn about the accepted options and the environment variables provided to the job script at execution time.
- Note
- Place directives only at the beginning of the job script. Torque ignores directives after the first executable statement in the script. Empty lines are allowed, but not recommended. Best practice is to have a single block of directives at the beginning of the file.
- Advanced usage
- The job script may be written in any scripting language, such as Perl or Python. The interpreter is specified either in the first line of the script, in Unix hash-bang syntax such as #!/usr/bin/perl, or with the qsub -S path_list option.
- The default directive token #PBS can be changed or unset entirely with the qsub -C option; see qsub, sec. Extended Description.
Example job file
Here are a few example jobs for the most common tasks. Note how the PBS directives are mostly independent of the type of job, except for the node specification.
OpenMPI, InfiniBand
This is the default user environment. The openmpi and icc, ifort, mkl modules are preloaded in the system's shell startup files.
The InfiniBand fast interconnect is selected in the openmpi module by means of the environment variable $OMPI_MCA_btl.
#!/bin/bash
#
# Basics: Number of nodes, processors per node (ppn), and walltime (hhh:mm:ss)
#PBS -l nodes=5:ppn=8
#PBS -l walltime=0:10:00
#PBS -N job_name
#PBS -A account
#
# File names for stdout and stderr. If not set here, the defaults
# are <JOBNAME>.o<JOBNUM> and <JOBNAME>.e<JOBNUM>
#PBS -o job.out
#PBS -e job.err
#
# Send mail at begin, end, abort, or never (b, e, a, n). Default is "a".
#PBS -m ea
# change into the directory from which qsub was run
cd $PBS_O_WORKDIR
# count allocated cores
nprocs=$(wc -l < $PBS_NODEFILE)
# start MPI job over default interconnect
mpirun -machinefile $PBS_NODEFILE -np $nprocs \
programname
- If your program reads from files or takes options and/or arguments, use and adjust one of the following forms:
mpirun -machinefile $PBS_NODEFILE -np $nprocs \
programname < run.in
mpirun -machinefile $PBS_NODEFILE -np $nprocs \
programname -options arguments < run.in
mpirun -machinefile $PBS_NODEFILE -np $nprocs \
programname < run.in > run.out 2> run.err
- In the last form, anything after programname is optional. If you use specific redirections for stdout or stderr as shown (>, 2>), the job-global files job.out and job.err declared earlier will remain empty or contain only output from your shell startup files (which should really be silent) and from the rest of your job script.
OpenMPI, Ethernet
To select Ethernet transport, such as for embarrassingly parallel jobs, specify an -mca option:
mpirun -machinefile $PBS_NODEFILE -np $nprocs \
-mca btl self,tcp \
programname
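For job scripts written in Python rather than bash, the mpirun invocations shown in the two OpenMPI sections above could be assembled roughly as follows. This is a minimal sketch, not part of the original page: the helper names count_slots, mpirun_command and run are hypothetical, and it assumes the node file contains one line per allocated core, as the wc -l idiom above implies.

```python
import contextlib
import os
import subprocess


def count_slots(nodefile_path):
    """Count the non-empty lines of the node file (assumed: one line per core)."""
    with open(nodefile_path) as nodefile:
        return sum(1 for line in nodefile if line.strip())


def mpirun_command(nodefile_path, program, program_args=(), mca_btl=None):
    """Build the mpirun argument list used in the bash examples above."""
    cmd = ["mpirun", "-machinefile", nodefile_path,
           "-np", str(count_slots(nodefile_path))]
    if mca_btl:
        # e.g. mca_btl="self,tcp" to select Ethernet transport
        cmd += ["-mca", "btl", mca_btl]
    cmd.append(program)
    cmd.extend(program_args)
    return cmd


def run(cmd, stdin_path=None, stdout_path=None, stderr_path=None):
    """Run cmd with optional redirections, like '< in > out 2> err' in bash."""
    with contextlib.ExitStack() as stack:
        stdin = stack.enter_context(open(stdin_path)) if stdin_path else None
        stdout = stack.enter_context(open(stdout_path, "w")) if stdout_path else None
        stderr = stack.enter_context(open(stderr_path, "w")) if stderr_path else None
        return subprocess.call(cmd, stdin=stdin, stdout=stdout, stderr=stderr)


if __name__ == "__main__" and "PBS_NODEFILE" in os.environ:
    # Inside a job, only print the command; call run(...) to actually start it.
    print(mpirun_command(os.environ["PBS_NODEFILE"], "programname"))
```

Calling run(cmd, "run.in", "run.out", "run.err") would correspond to the last bash form above; leaving the paths unset lets the output go to the job-global files.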
Intel-MPI
Under Intel-MPI the job script will be:
#!/bin/bash
#PBS ... (same as above)
cd $PBS_O_WORKDIR
mpiexec.hydra \
-machinefile $PBS_NODEFILE \
-np $(wc -l < $PBS_NODEFILE) \
programname
Other scripting languages
The job script is usually written in a Unix command shell language, typically bash. However, any scripting language can be used. The only pieces that Torque reads are the Torque directives, up to the first actual command, i.e., a line which is (a) not empty, (b) not a directive, and (c) not a "#"-style comment.
The script interpreter is chosen by the Linux kernel from the first line of the script. For instance, a job script for the Python language would typically begin like this:
#!/usr/bin/python
# Sample PBS job script, using python directly.
#
# Basics: Number of nodes, processors per node (ppn), and walltime [dd:]hh:mm:ss
#PBS -l nodes=3:ppn=8
#PBS -l walltime=0:10:00
#PBS -N job_name
#PBS -A account
#
# File names for stdout and stderr. If not set here, the defaults
# are <JOBNAME>.o<JOBNUM> and <JOBNAME>.e<JOBNUM>
#PBS -o job.out
#PBS -e job.err
#
# Send mail at begin, end, abort, or never (b, e, a, n). Default is "a".
#PBS -m ea
import os
os.chdir(os.environ['PBS_O_WORKDIR'])
…
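The rule for which lines Torque treats as directives can be illustrated with a small scanner. This is only a simplified model of the behavior described above, not Torque's actual implementation; the function name read_directives is hypothetical, and the sketch assumes a directive must start in the first column with the directive token.

```python
def read_directives(script_text, prefix="#PBS"):
    """Return the option strings of all directives before the first command."""
    directives = []
    for line in script_text.splitlines():
        stripped = line.strip()
        if line.startswith(prefix):
            # Directive: keep the option text that follows the token.
            directives.append(line[len(prefix):].strip())
        elif stripped == "" or stripped.startswith("#"):
            # Empty lines and ordinary comments (including the #! line)
            # do not end the directive block.
            continue
        else:
            # First actual command: everything after it is ignored.
            break
    return directives


example = """#!/usr/bin/python
# Sample PBS job script
#PBS -l nodes=3:ppn=8

#PBS -N job_name
import os
#PBS -m ea
"""

print(read_directives(example))
# prints ['-l nodes=3:ppn=8', '-N job_name']
```

The final "#PBS -m ea" line is dropped because it follows the first command, "import os". The prefix parameter mirrors the idea of changing the directive token with qsub -C.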
The account parameter
The parameter for option -A account can take the following forms:
cnm23456
- for most jobs, containing your 5-digit proposal number.
user
- (the actual string "user", not your user name) for a limited personal startup allocation.
staff
- for discretionary access by CNM staff.
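A submission wrapper could check the account string against these three forms before calling qsub. The sketch below is illustrative only: the function name classify_account is hypothetical, and the pattern assumes that proposal accounts always consist of the prefix "cnm" followed by exactly five digits, as in the example above.

```python
import re


def classify_account(account):
    """Return 'proposal', 'user' or 'staff', or None if the form is not recognized."""
    if account in ("user", "staff"):
        # The literal strings, not a login name.
        return account
    if re.fullmatch(r"cnm\d{5}", account):
        # Assumed pattern: "cnm" plus the 5-digit proposal number.
        return "proposal"
    return None


print(classify_account("cnm23456"))  # prints proposal
print(classify_account("jdoe"))      # prints None
```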
You can check your account balance in hours as follows:
mybalance -h
gbalance -u $USER -h
Advanced node selection
You can refine the node selection (normally done via the PBS resource -l nodes=…) to control your node and core allocations precisely. You may need to do so in order to:
- select specific node hardware generations,
- permit shared vs. exclusive node access,
- vary PPN across the nodes of a job,
- accommodate multithreading (OpenMP).
See HPC/Submitting Jobs/Advanced node selection for these topics.