Template:Section node types: Difference between revisions
Revision as of 19:37, April 20, 2020
Carbon has several major hardware node types, named genX for short (X = 1, 2, …). Nodes are further sub-classified by the amount of installed memory.
Node Types
Node names, types | Node generation | Node extra properties | Node count | Cores per node (max. ppn) | Cores total, by type | Account charge rate | CPU model | CPUs per node | CPU nominal clock (GHz) | Mem. per node (GB) | Mem. per core (GB) | GPU model | GPUs per node | VRAM per GPU (GB) | Disk per node (GB) | Year added | Note |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Login | |||||||||||||||||
login5…6 | gen7a | gpus=2 | 2 | 16 | 32 | 1.0 | Xeon Silver 4125 | 2 | 2.50 | 192 | 12 | Tesla V100 | 2 | 32 | 250 | 2019 | |
Compute | |||||||||||||||||
n421…460 | gen5 | | 40 | 16 | 640 | 1.0 | Xeon E5-2650 v4 | 2 | 2.10 | 128 | 8 | | | | 250 | 2017 | |
n461…476 | gen6 | | 16 | 16 | 256 | 1.0 | Xeon Silver 4110 | 2 | 2.10 | 96 | 6 | | | | 1000 | 2018 | |
n477…512 | gen6 | | 36 | 16 | 576 | 1.0 | Xeon Silver 4110 | 2 | 2.10 | 192 | 12 | | | | 1000 | 2018 | |
n513…534 | gen7 | gpus=2 | 22 | 32 | 704 | 1.5 | Xeon Gold 6226R | 2 | 2.90 | 192 | 6 | Tesla V100S | 2 | 32 | 250 | 2020 | |
n541…580 | gen8 | | 20 | 64 | 2560 | 1.0 | Xeon Gold 6430 | 2 | 2.10 | 1024 | 16 | | | | 420 | 2024 | |
Total | | | 134 | | 4736 | | | | | | | | 48 | | | | |
- Compute time is charged as the product of cores reserved × wallclock time × charge rate. The charge rate accounts for nominal differences in CPU speed; the reference speed (100%) is that of a gen2 core.
- Compute time on gen4 nodes is charged at 200% of actual walltime, reflecting their performance relative to gen2 nodes.
- gen7 nodes have two GPUs each and gen2 nodes have one; GPU usage is currently not "charged" (accounted for) separately.
- Virtual memory usage on nodes may reach about 2 × the physical memory size. Processes running under PBS may allocate that much vmem but cannot practically use all of it, because swap space and swap bandwidth are limited. If a node actively uses swap for more than a few minutes (which drastically slows down compute performance), the job is killed automatically.
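As a rough illustration of the charging rule in the first note, the sketch below multiplies cores reserved, wallclock time, and charge rate. The rates are copied from the "Account charge rate" column of the table; the names CHARGE_RATE and charged_core_hours are hypothetical, and expressing the result in core-hours is an assumption, since the page does not state a unit.

```python
# Charge rates copied from the "Account charge rate" column of the node table.
# The dictionary and function names are illustrative, not part of any Carbon tool.
CHARGE_RATE = {"gen5": 1.0, "gen6": 1.0, "gen7": 1.5, "gen8": 1.0}


def charged_core_hours(cores_reserved, wallclock_hours, generation):
    """Return cores reserved x wallclock time x charge rate.

    Assumes wallclock time is given in hours, so the result is in core-hours.
    """
    return cores_reserved * wallclock_hours * CHARGE_RATE[generation]


# Two full gen7 nodes (32 cores each) reserved for 3 hours of wallclock time.
print(charged_core_hours(2 * 32, 3.0, "gen7"))  # 288.0
```

Note that the charge depends on the cores reserved, not on the cores a job actually keeps busy.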
CPU flags
The CPU capabilities grow with each node generation. Executables can be compiled to leverage specific CPU capabilities. Jobs using such executables must use the qsub option -l nodes=...:genX to be directed to nodes having that capability.
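A minimal sketch of how such a resource request might be assembled is shown here. It assumes the common PBS/Torque form nodes=N:ppn=P:property for the part the text leaves as "..."; that form, the helper name qsub_resource_option, and the MAX_PPN table (taken from the "Cores per node (max. ppn)" column) are illustrative assumptions rather than site-confirmed syntax.

```python
# Maximum ppn per generation, from the "Cores per node (max. ppn)" column.
MAX_PPN = {"gen5": 16, "gen6": 16, "gen7": 32, "gen8": 64}


def qsub_resource_option(nodes, ppn, generation):
    """Build a '-l nodes=...' string that pins a job to one node generation.

    Assumes the usual PBS/Torque syntax nodes=N:ppn=P:property; check the
    local qsub documentation before relying on it.
    """
    if generation not in MAX_PPN:
        raise ValueError(f"unknown node generation: {generation}")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if not 1 <= ppn <= MAX_PPN[generation]:
        raise ValueError(
            f"ppn must be between 1 and {MAX_PPN[generation]} on {generation}"
        )
    return f"-l nodes={nodes}:ppn={ppn}:{generation}"


print(qsub_resource_option(2, 16, "gen6"))  # -l nodes=2:ppn=16:gen6
```

The returned string would be passed to qsub on the command line or placed on a #PBS directive line in a job script.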
Here is a list of capability flags by node generation, obtained from /proc/cpuinfo.
For the meaning of the flags, see articles at StackExchange and Wikipedia.
Flag name | gen5 | gen6 | gen7 | gen8 |
---|---|---|---|---|
cat_l2 cdp_l2 cldemote gfni movdir64b movdiri pconfig sha_ni umip vaes vpclmulqdq | – | – | – | x |
avx512_bitalg | – | – | – | x |
avx512_vbmi2 | – | – | – | x |
avx512_vpopcntdq | – | – | – | x |
avx512ifma | – | – | – | x |
avx512vbmi | – | – | – | x |
avx512_vnni | – | – | x | x |
mpx | – | x | x | – |
avx512bw | – | x | x | x |
avx512cd | – | x | x | x |
avx512dq | – | x | x | x |
avx512f | – | x | x | x |
avx512vl | – | x | x | x |
art clwb flush_l1d ibpb mba md_clear ospke pku ssbd stibp tsc_deadline_timer xgetbv1 xsavec | – | x | x | x |
3dnowprefetch abm acpi aes aperfmperf apic arat arch_perfmon bmi1 bmi2 bts cat_l3 cdp_l3 cmov constant_tsc cqm cqm_llc cqm_mbm_local cqm_mbm_total cqm_occup_llc cx16 cx8 dca de ds_cpl dtes64 dtherm dts eagerfpu epb ept erms est f16c flexpriority fpu fsgsbase fxsr hle ht ida invpcid invpcid_single lahf_lm lm mca mce mmx monitor movbe msr mtrr nonstop_tsc nopl nx pae pat pbe pcid pclmulqdq pdcm pdpe1gb pebs pge pln pni popcnt pse pse36 pts rdrand rdseed rdt_a rdtscp rep_good rsb_ctxsw rtm sdbg sep smap smep smx ss sse sse2 sse4_1 ssse3 syscall tm tm2 tpr_shadow tsc tsc_adjust vme vmx vnmi vpid x2apic xsave xsaveopt xtopology xtpr | x | x | x | x |
avx | x | x | x | x |
avx2 | x | x | x | x |
fma | x | x | x | x |
adx | x | x | x | x |
sse4_2 | x | x | x | x |
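The table can be used to decide which genX property a job needs. The sketch below does two things under stated assumptions: it extracts the flag set from text in the /proc/cpuinfo format, and it looks up which generations provide a given set of flags. FLAG_GENERATIONS holds only a few rows transcribed from the table; the function names and the sample text are made up for illustration.

```python
# A few rows transcribed from the flag table above; extend as needed.
FLAG_GENERATIONS = {
    "avx2": {"gen5", "gen6", "gen7", "gen8"},
    "avx512f": {"gen6", "gen7", "gen8"},
    "avx512_vnni": {"gen7", "gen8"},
    "avx512_bitalg": {"gen8"},
    "mpx": {"gen6", "gen7"},
}


def cpu_flags(cpuinfo_text):
    """Return the flags on the first 'flags' line of /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def generations_supporting(required_flags):
    """Return the sorted generations that provide every required flag."""
    eligible = {"gen5", "gen6", "gen7", "gen8"}
    for flag in required_flags:
        eligible &= FLAG_GENERATIONS[flag]
    return sorted(eligible)


# Made-up excerpt in the /proc/cpuinfo layout, not output from a real node.
sample = "processor\t: 0\nflags\t\t: fpu avx avx2 avx512f avx512_vnni\n"
print(sorted(cpu_flags(sample)))
print(generations_supporting(["avx512f", "avx512_vnni"]))  # ['gen7', 'gen8']
```

On a node, the real flag set could be read with cpu_flags(open("/proc/cpuinfo").read()). An executable built with both avx512f and avx512_vnni would, per the table, need gen7 or gen8 nodes.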