HPC/Hardware Details
Revision as of 20:49, December 1, 2014
User nodes
- Carbon has several major hardware node types, named genX for short, with X = 1, 2, …
- Node characteristics
| Node names | Node generation | Node extra properties | Node count | Cores per node (max. `ppn`) | Cores total, by type | Account charge rate | CPU model | CPUs per node | CPU nominal clock (GHz) | Mem. per node (GB) | Mem. per core (GB) | GPU model | GPUs per node | VRAM per GPU (GB) | Disk per node (GB) | Year added | Note |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| **Login** | | | | | | | | | | | | | | | | | |
| login5…6 | gen7a | gpus=2 | 2 | 16 | 32 | 1.0 | Xeon Silver 4125 | 2 | 2.50 | 192 | 12 | Tesla V100 | 2 | 32 | 250 | 2019 | |
| **Compute** | | | | | | | | | | | | | | | | | |
| n421…460 | gen5 | | 40 | 16 | 640 | 1.0 | Xeon E5-2650 v4 | 2 | 2.10 | 128 | 8 | | | | 250 | 2017 | |
| n461…476 | gen6 | | 16 | 16 | 256 | 1.0 | Xeon Silver 4110 | 2 | 2.10 | 96 | 6 | | | | 1000 | 2018 | |
| n477…512 | gen6 | | 36 | 16 | 576 | 1.0 | Xeon Silver 4110 | 2 | 2.10 | 192 | 12 | | | | 1000 | 2018 | |
| n513…534 | gen7 | gpus=2 | 22 | 32 | 704 | 1.5 | Xeon Gold 6226R | 2 | 2.90 | 192 | 6 | Tesla V100S | 2 | 32 | 250 | 2020 | |
| n541…580 | gen8 | | 20 | 64 | 2560 | 1.0 | Xeon Gold 6430 | 2 | 2.10 | 1024 | 16 | | | | 420 | 2024 | |
| **Total** | | | 134 | | 4736 | | | | | | | | 48 | | | | |
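The "max. `ppn`" and extra-property columns map onto PBS resource requests. The sketch below shows how a job might target the gen7 GPU nodes; the `nodes=…:ppn=…:gpus=…` syntax follows common Torque/PBS conventions, and the job name, walltime, and application command are illustrative assumptions, not Carbon specifics:

```shell
#!/bin/bash
# Sketch of a PBS job script requesting 2 nodes with all 32 cores each,
# restricted to nodes carrying the "gpus=2" property from the table above.
#PBS -l nodes=2:ppn=32:gpus=2
#PBS -l walltime=04:00:00
#PBS -N gpu-example              # hypothetical job name

cd "$PBS_O_WORKDIR"              # PBS sets this to the submission directory
mpirun -np 64 ./my_app           # hypothetical MPI application binary
```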
- Compute time on gen1 nodes is charged at 40% of actual walltime. Depending on cores used and memory throughput demanded, these nodes may actually be about on par with gen2 (low memory throughput) or up to about 2–3 times slower.
- Compute time on gen4 nodes is charged at 200% of actual walltime, given their performance relative to gen2 nodes.
- gen3 nodes have one Tesla C2075 GPU each; GPU usage is not charged separately.
- Virtual memory usage on nodes may reach up to about 2 × the physical memory size. Your processes running under PBS may allocate that much vmem but cannot practically use it all, for reasons of swap space size and bandwidth. If a node actively uses swap for more than a few minutes (which drastically slows down compute performance), the job will be killed.
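The charge rates above simply scale actual walltime into charged time. A minimal sketch of the arithmetic, with rates expressed in percent (gen1 = 40%, gen4 = 200%, gen7 = 150%, per the bullets and table) so that plain POSIX integer arithmetic suffices:

```shell
# Charged time = actual walltime × per-generation charge rate.
# Rates in percent to stay within integer shell arithmetic.
charged_minutes() {  # usage: charged_minutes WALLTIME_MINUTES RATE_PERCENT
  echo $(( $1 * $2 / 100 ))
}

charged_minutes 600 40    # gen1 (40%):  600 min walltime -> 240
charged_minutes 600 200   # gen4 (200%): 600 min walltime -> 1200
charged_minutes 600 150   # gen7 (150%): 600 min walltime -> 900
```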
Storage
- Lustre parallel file system for /home and /sandbox
- ≈100 TB total
- local disk per compute node, 160–250 GB
Interconnect
- InfiniBand DDR/QDR (QDR for gen3) – used for parallel communication and storage
- Ethernet 1 Gbit/s – used for general node access and management
Power
- Power consumption at typical load: ≈125 kW