HPC/Applications/lammps

From CNM Wiki
Revision as of 21:38, January 23, 2012 by Stern

Benchmark

Using a sample workload from Sanket ("run9"), I tested various OpenMPI options on both node types.

LAMMPS performs best on gen2 nodes over InfiniBand (IB) without extra options (relative speed 59). Surprisingly, on gen1 nodes it runs faster over Ethernet (44) than over IB (36 to 39).

Job         Type  Interconnect  Additional OpenMPI options                            Relative speed (1000 steps/3 hours)  Notes
gen1        gen1  IB            (none)                                                36
gen1srq     gen1  IB            (none)                                                39 (to be confirmed)                 not useful (IB and eth. options)
gen1srqpin  gen1  IB            -mca btl_openib_use_srq 1 -mca mpi_paffinity_alone 1  39
gen1srqeth  gen1  Ethernet      -mca btl_openib_use_srq 1 -mca btl self,tcp           42                                   not useful (IB and eth. options)
gen1eth     gen1  Ethernet      -mca btl self,tcp                                     44                                   fastest for gen1
gen2eth     gen2  Ethernet      -mca btl self,tcp                                     49
gen2srq     gen2  IB            -mca btl_openib_use_srq 1                             59
gen2        gen2  IB            (none)                                                59                                   fastest for gen2
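
To put the relative speeds in proportion, the differences can be expressed as percentages. The following is a small sketch rather than part of the original benchmark; `pct_gain` is a hypothetical helper name, and the only inputs are the numbers from the table above. It suggests that Ethernet is roughly 22% faster than default IB on gen1, and that default IB is roughly 20% faster than Ethernet on gen2.

```shell
#!/bin/sh
# pct_gain BASE NEW: percent change of NEW relative to BASE,
# rounded to the nearest integer. (Hypothetical helper name.)
pct_gain() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.0f\n", (new - base) / base * 100 }'
}

# Relative speeds taken from the benchmark table
echo "gen1: Ethernet vs. default IB: $(pct_gain 36 44)%"
echo "gen2: default IB vs. Ethernet: $(pct_gain 49 59)%"
echo "gen2 IB vs. fastest gen1 run:  $(pct_gain 44 59)%"
```

Since each figure comes from a single run of one workload, these percentages are best read as rough indications.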

Sample job file gen1

#!/bin/bash
#PBS -l nodes=20:ppn=8:gen1
#PBS -l walltime=3:00:00
#PBS -N <jobname>
#PBS -A <account>
#
#PBS -o job.out
#PBS -e job.err
#PBS -m ea

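# change into the directory from which qsub was run
cd "$PBS_O_WORKDIR" || exit 1

# start one MPI rank per line of the node file; "-mca btl self,tcp"
# selects TCP over Ethernet, the fastest gen1 setting in the benchmark
mpirun  -machinefile  "$PBS_NODEFILE" \
        -np $(wc -l < "$PBS_NODEFILE") \
        -mca btl self,tcp \
        lmp_openmpi < lammps.in > lammps.out 2> lammps.err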
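
The `-np $(wc -l < $PBS_NODEFILE)` expression in the job file derives the rank count from the node file. As a rough illustration, the sketch below builds a mock node file and counts its lines. It assumes the scheduler writes one line per requested core slot (so `nodes=20:ppn=8` yields 160 lines); the host names `node1` to `node20` are made up for the example.

```shell
#!/bin/sh
# Build a mock node file: 20 nodes x 8 slots, one line per slot.
# Assumption: this mirrors what the scheduler puts in $PBS_NODEFILE.
nodefile=$(mktemp)
n=1
while [ "$n" -le 20 ]; do
    slot=1
    while [ "$slot" -le 8 ]; do
        echo "node$n"            # hypothetical host name
        slot=$((slot + 1))
    done
    n=$((n + 1))
done > "$nodefile"

# Same counting idiom as in the job file; $(( )) strips any padding
nprocs=$(( $(wc -l < "$nodefile") ))
nnodes=$(( $(sort -u "$nodefile" | wc -l) ))
echo "$nprocs ranks on $nnodes nodes"
rm -f "$nodefile"
```

Deriving the count this way keeps `-np` in step with the `#PBS -l nodes=...` request, so only one line needs editing when the job size changes.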

Sample job file gen2

#!/bin/bash
#PBS -l nodes=20:ppn=8:gen2
#PBS -l walltime=3:00:00
#PBS -N <jobname>
#PBS -A <account>
#
#PBS -o job.out
#PBS -e job.err
#PBS -m ea

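# change into the directory from which qsub was run
cd "$PBS_O_WORKDIR" || exit 1

# start one MPI rank per line of the node file; no extra options,
# since the OpenMPI defaults over IB were fastest on gen2
mpirun  -machinefile  "$PBS_NODEFILE" \
        -np $(wc -l < "$PBS_NODEFILE") \
        lmp_openmpi < lammps.in > lammps.out 2> lammps.err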
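
The two job files differ only in the node type and the extra OpenMPI options, so the choice could be folded into one helper. This is a minimal sketch, not a tested production script: `mca_opts_for` and `node_type` are hypothetical names, the mapping simply encodes the fastest settings from the benchmark table, and the final line only prints the command instead of running it.

```shell
#!/bin/sh
# mca_opts_for TYPE: print the extra OpenMPI options that were fastest
# for that node type in the benchmark. (Hypothetical helper name.)
mca_opts_for() {
    case "$1" in
        gen1) printf '%s\n' "-mca btl self,tcp" ;;  # Ethernet was fastest
        gen2) printf '\n' ;;                        # defaults (IB) were fastest
        *)    echo "unknown node type: $1" >&2; return 1 ;;
    esac
}

# Assumption: set by hand to match ":gen1" or ":gen2" in "#PBS -l nodes=..."
node_type=gen1
opts=$(mca_opts_for "$node_type") || exit 1

# Dry run: $opts is left unquoted on purpose so that it splits into
# separate arguments; remove "echo" to launch for real.
echo mpirun -machinefile '$PBS_NODEFILE' $opts lmp_openmpi
```

Keeping the mapping in one place means that a later benchmark result only requires changing the `case` branches.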