HPC/Benchmarks/Generation 1 vs 2

Introduction

Earlier this year, we received 200 additional nodes with E5540 processors. Each node has 8 cores and supports Hyper-Threading (HT), a feature that allows 2 threads per core. This benchmark investigates the benefit of HT and suggests optimal values for the nodes and processors-per-node (ppn) parameters in PBS.

Data

HPC/Generation-2 nodes/vasp/vasp.lst, grep-able
media:Vasp.txt, tab-separated CSV
media:Vasp.pdf, PDF

Observations

4-core runs give a high numerical throughput in each node type (run=01 to 04)
gen2 nodes are fine for VASP with nodes=1:ppn=8; gen1 nodes are not (run=22 vs. 21)
Adding more nodes allows for the fastest run (run=54), or for a run that is about 40% slower but has a better charge rate (run=52)
Running two apps in a single job is mostly not worth the effort of managing them (run=04 vs. 22)
HT allows for slightly better charge rates, but usually only with non-MPI jobs (or unsynced MPI jobs) (run=15, 25, 40, 55), and runtimes are nearly proportionately longer, making HT largely unattractive. This also holds for the only case tested for HT and napps=1 (run=50).
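
The trade-off between run=54 and run=52 can be checked with a little arithmetic. The following is a minimal sketch that uses only the tmax and charge values listed for these two runs in the Recommendations table; the helper name `relative_change` is illustrative and not part of the benchmark scripts.

```python
# tmax and charge as listed in the Recommendations table for gen2 nodes.
fastest = {"run": 54, "request": "nodes=4:ppn=4", "tmax": 237.59, "charge": 2.11}
balanced = {"run": 52, "request": "nodes=2:ppn=8", "tmax": 329.10, "charge": 1.46}


def relative_change(new, old):
    """Return the change from old to new as a fraction of old."""
    return (new - old) / old


# Positive values mean run=52 is larger than run=54, negative values smaller.
slowdown = relative_change(balanced["tmax"], fastest["tmax"])
charge_change = relative_change(balanced["charge"], fastest["charge"])

print(f"run=52 vs. run=54: runtime {slowdown:+.1%}, charge {charge_change:+.1%}")
```

With these numbers, run=52 takes roughly 39% longer than run=54 and costs roughly 31% less, which matches the "about 40% slower, better charge rate" observation.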

Recommendations

For the given workload, I recommend the following values for optimal performance with respect to the given objective.

Node type	Objective
	time → min	charge → min	time × charge → min
gen1	`nodes=4:ppn=3`	`nodes=1:ppn=4`	`nodes=3:ppn=4`
	run=35 tmax=503.05 charge=4.47	run=01 tmax=1138.83 charge=2.53	run=33 tmax=544.74 charge=3.63
gen2	`nodes=4:ppn=4`	`nodes=1:ppn=8`	`nodes=2:ppn=8`
	run=54 tmax=237.59 charge=2.11	run=22 tmax=472.16 charge=1.05	run=52 tmax=329.10 charge=1.46
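
As a cross-check on the table, the sketch below recomputes the three objectives for the six runs listed above and picks the minimum per node type. It covers only these six runs, not the full data set in vasp.lst, and the names `Run`, `OBJECTIVES`, and `best_runs` are illustrative assumptions rather than part of the original benchmark tooling.

```python
from collections import namedtuple

# One record per run listed in the recommendations table above.
Run = namedtuple("Run", "node_type request run tmax charge")

RUNS = [
    Run("gen1", "nodes=4:ppn=3", 35, 503.05, 4.47),
    Run("gen1", "nodes=1:ppn=4", 1, 1138.83, 2.53),
    Run("gen1", "nodes=3:ppn=4", 33, 544.74, 3.63),
    Run("gen2", "nodes=4:ppn=4", 54, 237.59, 2.11),
    Run("gen2", "nodes=1:ppn=8", 22, 472.16, 1.05),
    Run("gen2", "nodes=2:ppn=8", 52, 329.10, 1.46),
]

# Each objective maps a run to the quantity that should be minimized.
OBJECTIVES = {
    "time": lambda r: r.tmax,
    "charge": lambda r: r.charge,
    "time x charge": lambda r: r.tmax * r.charge,
}


def best_runs(runs):
    """Return {(node_type, objective): run with the smallest objective value}."""
    best = {}
    for node_type in sorted({r.node_type for r in runs}):
        candidates = [r for r in runs if r.node_type == node_type]
        for name, value in OBJECTIVES.items():
            best[(node_type, name)] = min(candidates, key=value)
    return best


best = best_runs(RUNS)
for (node_type, name), r in best.items():
    print(f"{node_type}  {name:<13}  {r.request}  run={r.run:02d}")
```

For these six runs, the selection reproduces the table: run=35, 01, and 33 for gen1, and run=54, 22, and 52 for gen2.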
