HPC/Hardware Details
Revision as of 20:49, December 1, 2014
User nodes
- Carbon has several major hardware node types, named genX for short, with X = 1, 2, …
- Node characteristics
| Node names | Node generation | Node extra properties | Node count | Cores per node (max. `ppn`) | Cores total, by type | Account charge rate | CPU model | CPUs per node | CPU nominal clock (GHz) | Mem. per node (GB) | Mem. per core (GB) | GPU model | GPUs per node | VRAM per GPU (GB) | Disk per node (GB) | Year added | Note |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **Login** | | | | | | | | | | | | | | | | | |
| login5…6 | gen7a | gpus=2 | 2 | 16 | 32 | 1.0 | Xeon Silver 4125 | 2 | 2.50 | 192 | 12 | Tesla V100 | 2 | 32 | 250 | 2019 | |
| **Compute** | | | | | | | | | | | | | | | | | |
| n421…460 | gen5 | | 40 | 16 | 640 | 1.0 | Xeon E5-2650 v4 | 2 | 2.10 | 128 | 8 | | | | 250 | 2017 | |
| n461…476 | gen6 | | 16 | 16 | 256 | 1.0 | Xeon Silver 4110 | 2 | 2.10 | 96 | 6 | | | | 1000 | 2018 | |
| n477…512 | gen6 | | 36 | 16 | 576 | 1.0 | Xeon Silver 4110 | 2 | 2.10 | 192 | 12 | | | | 1000 | 2018 | |
| n513…534 | gen7 | gpus=2 | 22 | 32 | 704 | 1.5 | Xeon Gold 6226R | 2 | 2.90 | 192 | 6 | Tesla V100S | 2 | 32 | 250 | 2020 | |
| n541…580 | gen8 | | 20 | 64 | 2560 | 1.0 | Xeon Gold 6430 | 2 | 2.10 | 1024 | 16 | | | | 420 | 2024 | |
| **Total** | | | 134 | | 4736 | | | | | | | | 48 | | | | |
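The "max. `ppn`" and extra-property columns map onto PBS resource requests. The sketch below shows how a job might target the gen7 GPU nodes; the `nodes=…:ppn=…:gpus=…` syntax follows common Torque/PBS conventions, and the job name, walltime, and application command are illustrative assumptions, not Carbon specifics:

```shell
#!/bin/bash
# Sketch of a PBS job script requesting 2 nodes with all 32 cores each,
# restricted to nodes carrying the "gpus=2" property from the table above.
#PBS -l nodes=2:ppn=32:gpus=2
#PBS -l walltime=04:00:00
#PBS -N gpu-example              # hypothetical job name

cd "$PBS_O_WORKDIR"              # PBS sets this to the submission directory
mpirun -np 64 ./my_app           # hypothetical MPI application binary
```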
- Compute time on gen1 nodes is charged at 40% of actual walltime. Depending on cores used and memory throughput demanded, these nodes may actually be about on par with gen2 (low memory throughput) or up to about 2–3 times slower.
- Compute time on gen4 nodes is charged at 200% of actual walltime, given their performance relative to gen2 nodes.
- gen3 nodes have one Tesla C2075 GPU each; GPU usage is not charged separately.
- Virtual memory usage on nodes may reach up to about 2 × the physical memory size. Your processes running under PBS may allocate that much vmem but cannot practically use it all, for reasons of swap space size and bandwidth. If a node actively uses swap for more than a few minutes (which drastically slows down compute performance), the job will be killed.
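The charge rates above simply scale actual walltime into charged time. A minimal sketch of the arithmetic, with rates expressed in percent (gen1 = 40%, gen4 = 200%, gen7 = 150%, per the bullets and table) so that plain POSIX integer arithmetic suffices:

```shell
# Charged time = actual walltime × per-generation charge rate.
# Rates in percent to stay within integer shell arithmetic.
charged_minutes() {  # usage: charged_minutes WALLTIME_MINUTES RATE_PERCENT
  echo $(( $1 * $2 / 100 ))
}

charged_minutes 600 40    # gen1 (40%):  600 min walltime -> 240
charged_minutes 600 200   # gen4 (200%): 600 min walltime -> 1200
charged_minutes 600 150   # gen7 (150%): 600 min walltime -> 900
```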
Storage
- Lustre parallel file system for /home and /sandbox
- ≈100 TB total
- local disk per compute node, 160–250 GB
Interconnect
- InfiniBand DDR/QDR (QDR for gen3) – used for parallel communication and storage
- Ethernet 1 Gbit/s – used for general node access and management
Power
- Power consumption at typical load: ≈125 kW