HPC/Applications/lammps
Benchmark
Using a sample workload from Sanket ("run9"), I tested various OpenMPI options on both node types.
LAMMPS performs best on gen2 nodes without extra options, and pretty well on gen1 nodes over Ethernet (!), where it is faster than over IB.
Job | Type | Interconnect | Additional OpenMPI options | Relative speed (1000 steps/3 hours) | Notes |
---|---|---|---|---|---|
gen1 | gen1 | IB | (none) | 36 | |
gen1srq | gen1 | IB | (none) | 39 (to be confirmed) | not useful (IB and eth. options) |
gen1srqpin | gen1 | IB | -mca btl_openib_use_srq 1 -mca mpi_paffinity_alone 1 | 39 | |
gen1srqeth | gen1 | Ethernet | -mca btl_openib_use_srq 1 -mca btl self,tcp | 42 | not useful (IB and eth. options) |
gen1eth | gen1 | Ethernet | -mca btl self,tcp | 44 | fastest for gen1 |
gen2eth | gen2 | Ethernet | -mca btl self,tcp | 49 | |
gen2srq | gen2 | IB | -mca btl_openib_use_srq 1 | 59 | |
gen2 | gen2 | IB | (none) | 59 | fastest for gen2 |
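
To make the table easier to reuse, here is a minimal Python sketch that turns a job name into an mpirun command line and compares relative speeds. The option strings and speed values are copied from the table; the process count, the binary name `lmp_mpi`, and the input file name `in.run9` are placeholders I made up for illustration and are not taken from the actual benchmark setup.

```python
import shlex

# Additional OpenMPI options per job, copied from the benchmark table.
# An empty string stands for "(none)".
JOB_OPTIONS = {
    "gen1": "",
    "gen1srq": "",
    "gen1srqpin": "-mca btl_openib_use_srq 1 -mca mpi_paffinity_alone 1",
    "gen1srqeth": "-mca btl_openib_use_srq 1 -mca btl self,tcp",
    "gen1eth": "-mca btl self,tcp",
    "gen2eth": "-mca btl self,tcp",
    "gen2srq": "-mca btl_openib_use_srq 1",
    "gen2": "",
}

# Relative speed (1000 steps/3 hours), copied from the benchmark table.
# The gen1srq value is marked "to be confirmed" in the table.
RELATIVE_SPEED = {
    "gen1": 36,
    "gen1srq": 39,
    "gen1srqpin": 39,
    "gen1srqeth": 42,
    "gen1eth": 44,
    "gen2eth": 49,
    "gen2srq": 59,
    "gen2": 59,
}


def mpirun_command(job, nprocs=8, binary="lmp_mpi", input_file="in.run9"):
    """Build the argument list for one benchmark job.

    nprocs, binary and input_file are hypothetical defaults;
    substitute the values used on your cluster.
    """
    options = shlex.split(JOB_OPTIONS[job])
    return ["mpirun", "-np", str(nprocs), *options, binary, "-in", input_file]


def speedup(job, baseline):
    """Ratio of the relative speeds of two jobs from the table."""
    return RELATIVE_SPEED[job] / RELATIVE_SPEED[baseline]


if __name__ == "__main__":
    for job in ("gen1eth", "gen2"):
        print(job, "->", " ".join(mpirun_command(job)))
    print("gen1eth vs gen1: %.2fx" % speedup("gen1eth", "gen1"))
    print("gen2 vs gen2eth: %.2fx" % speedup("gen2", "gen2eth"))
```

With the table values, Ethernet on gen1 comes out about 1.22 times as fast as default IB, and default IB on gen2 about 1.20 times as fast as Ethernet.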