HPC/Benchmarks/Generation 1 vs 2

== Introduction ==
Earlier this year, we received 200 additional nodes with E5540 processors.  Each node has 8 cores (two quad-core processors) and supports Hyper-Threading, a feature that allows 2 threads per core.  This benchmark investigates the benefit of Hyper-Threading (HT) and suggests optimal values for the ''nodes'' and ''processors per node (ppn)'' parameters in PBS.
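
As a rough illustration of the two parameters, the sketch below formats a PBS resource request for a node with 8 cores, once with one process per core and once with one process per hardware thread. The helper names are hypothetical, and whether the scheduler on this cluster accepts ''ppn=16'' for HT runs is an assumption, not something stated on this page.

```python
# Hypothetical helpers, for illustration only.

CORES_PER_NODE = 8     # from the text above
THREADS_PER_CORE = 2   # with Hyper-Threading enabled


def hardware_threads(cores_per_node: int, threads_per_core: int) -> int:
    """Number of hardware threads that one node exposes."""
    return cores_per_node * threads_per_core


def pbs_resource(nodes: int, ppn: int) -> str:
    """Format a PBS resource request of the form nodes=N:ppn=P."""
    return f"nodes={nodes}:ppn={ppn}"


# One process per physical core.
print(pbs_resource(1, CORES_PER_NODE))

# One process per hardware thread (assumes the queue allows this ppn value).
print(pbs_resource(1, hardware_threads(CORES_PER_NODE, THREADS_PER_CORE)))
```

The resulting string is what would be passed to the ''-l'' option of ''qsub''.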
== Data ==
* [[HPC/Generation-2 nodes/vasp/vasp.lst]], grep-able
* [[media:HPC/Generation-2 nodes/vasp/vasp.txt]], tab-separated values
* [[media:HPC/Generation-2 nodes/vasp/vasp.pdf]], PDF


== Observations and conclusions ==

Revision as of 21:40, March 14, 2010


  • 4-core runs give a high numerical throughput in each node type (run=01 to 04)
  • gen2 nodes are fine for VASP with nodes=1:ppn=8; gen1 nodes are not (run=22 vs. 21)
  • Adding more nodes gives either the fastest run (run=54) or a run that is 40% slower but has a better charge rate (run=52)
  • Running two apps in a single job is mostly not worth the effort of managing them (run=04 vs. 22)
  • HT allows for slightly better charge rates, but usually only for non-MPI jobs or unsynchronized MPI jobs (run=15, 25, 40, 55). Runtimes are nearly proportionately longer, which makes HT largely unattractive. This also holds for the only HT case tested with napps=1 (run=50).
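
The trade-off between runtime and charge rate can be sketched with a little arithmetic. The example below assumes that a job is charged in node-hours (whole nodes multiplied by walltime); this page does not define the charge formula, and the node counts and runtimes used here are hypothetical, not taken from the benchmark data.

```python
# Illustrative only: the charge model and all numbers are assumptions.

def charge(nodes: int, walltime_hours: float) -> float:
    """Node-hours charged, assuming whole nodes are billed for the full walltime."""
    return nodes * walltime_hours


def slowdown(runtime: float, reference: float) -> float:
    """Fractional slowdown of `runtime` relative to `reference` (0.4 means 40% slower)."""
    return runtime / reference - 1.0


# Hypothetical pair of runs: more nodes finish sooner but cost more node-hours.
fast = {"nodes": 4, "hours": 1.0}
slow = {"nodes": 2, "hours": 1.4}

for label, run in (("fast", fast), ("slow", slow)):
    print(label, run["nodes"], "nodes,", charge(run["nodes"], run["hours"]), "node-hours")

print("slowdown:", round(slowdown(slow["hours"], fast["hours"]), 2))
```

Under these assumptions the slower run takes 40% longer but is charged fewer node-hours, which is the kind of trade-off described for run=52 versus run=54.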