HPC/Hardware Details

From CNM Wiki



Carbon Cluster
User Information


User nodes

Carbon has several major hardware node types, named genX for short, with X = 1, 2, …. Within a generation, nodes are further sub-classified by the amount of installed memory.

Node Types

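| Node names, types | Node generation | Node extra properties | Node count | Cores per node (max. ppn) | Cores total, by type | Account charge rate | CPU model | CPUs per node | CPU nominal clock (GHz) | Mem. per node (GB) | Mem. per core (GB) | GPU model | GPUs per node | VRAM per GPU (GB) | Disk per node (GB) | Year added | Note |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Login | | | | | | | | | | | | | | | | | |
| login5…6 | gen7a | gpus=2 | 2 | 16 | 32 | 3.0 | Xeon Silver 4125 | 2 | 2.50 | 192 | 12 | Tesla V100 | 2 | 32 | 250 | 2019 | |
| Compute | | | | | | | | | | | | | | | | | |
| n421…460 | gen5 | | 40 | 16 | 640 | 2.0 | Xeon E5-2650 v4 | 2 | 2.10 | 128 | 8 | | | | 250 | 2017 | |
| n461…476 | gen6 | | 16 | 16 | 256 | 2.0 | Xeon Silver 4110 | 2 | 2.10 | 96 | 6 | | | | 1000 | 2018 | |
| n477…512 | gen6 | | 36 | 16 | 576 | 2.0 | Xeon Silver 4110 | 2 | 2.10 | 192 | 12 | | | | 1000 | 2018 | |
| n513…534 | gen7 | gpus=2 | 22 | 32 | 704 | 3.0 | Xeon Gold 6226R | 2 | 2.90 | 192 | 6 | Tesla V100S | 2 | 32 | 250 | 2020 | |
| n541…580 | gen8 | | 20 | 64 | 2560 | 2.1 | Xeon Gold 6430 | 2 | 2.10 | 1024 | 16 | | | | 420 | 2024 | |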
Total 134 4736 48
  • Compute time is charged as the product of cores reserved × wallclock time × charge rate. The charge rate accommodates nominal differences in CPU speed. The reference speed (100%) is taken at a gen2 core.
  • Compute time on gen4 nodes is charged at 200% of actual walltime, given their performance relative to gen2 nodes.
  • gen7 nodes have two GPUs each; GPU usage is currently not "charged" (accounted for) separately.
  • Virtual memory usage on a node may reach up to about 2 × the physical memory size. Your processes running under PBS may allocate that much vmem but cannot practically use it all, because swap space and swap bandwidth are limited. If a node actively uses swap for more than a few minutes (which drastically slows down compute performance), the job will automatically be killed.
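
The charging rule in the notes above can be sketched as a small calculation. This is only an illustration: the function name `compute_charge`, the `CHARGE_RATE` dictionary, and the choice of hours as the time unit are assumptions made here; the rates themselves are copied from the charge rate column of the Node Types table.

```python
# Charge rates per node generation, copied from the Node Types table.
CHARGE_RATE = {"gen5": 2.0, "gen6": 2.0, "gen7": 3.0, "gen8": 2.1}


def compute_charge(cores_reserved: int, walltime_hours: float, charge_rate: float) -> float:
    """Return the charge as cores reserved x wallclock time x charge rate.

    The time unit (hours) is an assumption for this sketch; the result is
    then expressed in charged core-hours.
    """
    if cores_reserved <= 0:
        raise ValueError("cores_reserved must be positive")
    if walltime_hours < 0:
        raise ValueError("walltime_hours must not be negative")
    if charge_rate <= 0:
        raise ValueError("charge_rate must be positive")
    return cores_reserved * walltime_hours * charge_rate


if __name__ == "__main__":
    # Example: two gen6 nodes with 16 cores each, reserved for 10 hours.
    cores = 2 * 16
    print(compute_charge(cores, 10.0, CHARGE_RATE["gen6"]))  # 640.0
```

Note that the charge is based on the cores reserved, not on the cores a job actually keeps busy.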

Major CPU flags

CPU capabilities grow with each node generation. Executables can be compiled to leverage specific CPU capabilities. Jobs using such executables must use the qsub option -l nodes=...:genX to be directed to nodes having that capability.
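
As a rough illustration, the resource request could be assembled like this. The sketch assumes that the elided part of `nodes=...` follows the usual Torque/PBS form `nodes=<count>:ppn=<cores per node>`; the helper names `nodes_spec` and `qsub_command` and the script name `job.sh` are hypothetical, and the `MAX_PPN` values are copied from the "Cores per node (max. ppn)" column of the Node Types table.

```python
from typing import List

# Maximum cores per node (max. ppn) by generation, from the Node Types table.
MAX_PPN = {"gen5": 16, "gen6": 16, "gen7": 32, "gen8": 64}


def nodes_spec(nodes: int, ppn: int, generation: str) -> str:
    """Build the value for qsub's -l option, e.g. 'nodes=2:ppn=16:gen6'."""
    if generation not in MAX_PPN:
        raise ValueError(f"unknown node generation: {generation}")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if not 1 <= ppn <= MAX_PPN[generation]:
        raise ValueError(f"ppn must be between 1 and {MAX_PPN[generation]} on {generation}")
    return f"nodes={nodes}:ppn={ppn}:{generation}"


def qsub_command(nodes: int, ppn: int, generation: str, script: str) -> List[str]:
    """Return the qsub argument list; the caller decides whether to run it."""
    return ["qsub", "-l", nodes_spec(nodes, ppn, generation), script]


if __name__ == "__main__":
    # job.sh is a placeholder for your own job script.
    print(" ".join(qsub_command(2, 16, "gen6", "job.sh")))
```

Validating `ppn` against the table before submission catches a request that no node of the chosen generation could satisfy.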

Major CPU capability flags by node generation. For details, see: CPUID instruction in Wikipedia, a StackExchange article, or /usr/src/kernels/*/arch/x86/include/asm/cpufeatures.h in kernel sources.
| Flag name | gen5 | gen6 | gen7 | gen8 |
|---|---|---|---|---|
| cat_l2 cdp_l2 cldemote gfni movdir64b movdiri pconfig sha_ni umip vaes vpclmulqdq | | | | x |
| avx512_bitalg | | | | x |
| avx512_vbmi2 | | | | x |
| avx512_vpopcntdq | | | | x |
| avx512ifma | | | | x |
| avx512vbmi | | | | x |
| avx512_vnni | | | x | x |
| mpx | | x | x | |
| avx512bw | | x | x | x |
| avx512cd | | x | x | x |
| avx512dq | | x | x | x |
| avx512f | | x | x | x |
| avx512vl | | x | x | x |
| art clwb flush_l1d ibpb mba md_clear ospke pku ssbd stibp tsc_deadline_timer xgetbv1 xsavec | | x | x | x |
| 3dnowprefetch abm acpi aes aperfmperf apic arat arch_perfmon bmi1 bmi2 bts cat_l3 cdp_l3 cmov constant_tsc cqm cqm_llc cqm_mbm_local cqm_mbm_total cqm_occup_llc cx16 cx8 dca de ds_cpl dtes64 dtherm dts eagerfpu epb ept erms est f16c flexpriority fpu fsgsbase fxsr hle ht ida invpcid invpcid_single lahf_lm lm mca mce mmx monitor movbe msr mtrr nonstop_tsc nopl nx pae pat pbe pcid pclmulqdq pdcm pdpe1gb pebs pge pln pni popcnt pse pse36 pts rdrand rdseed rdt_a rdtscp rep_good rsb_ctxsw rtm sdbg sep smap smep smx ss sse sse2 sse4_1 ssse3 syscall tm tm2 tpr_shadow tsc tsc_adjust vme vmx vnmi vpid x2apic xsave xsaveopt xtopology xtpr | x | x | x | x |
| avx | x | x | x | x |
| avx2 | x | x | x | x |
| fma | x | x | x | x |
| adx | x | x | x | x |
| sse4_2 | x | x | x | x |
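
To check which flags a node actually reports, and which generation a given executable would need, something along these lines may help. It is a minimal sketch: `FIRST_GENERATION` holds only a few flags from the table above, and their generation numbers assume that each flag stays available in all later generations once introduced. All function names are hypothetical. On Linux the flags are listed on the `flags` line of `/proc/cpuinfo`.

```python
import os
from typing import Dict, Iterable, List, Set

# Earliest node generation offering each flag (small subset of the table;
# assumes the flag remains available in every later generation).
FIRST_GENERATION: Dict[str, int] = {
    "avx2": 5,
    "fma": 5,
    "avx512f": 6,
    "avx512_vnni": 7,
    "avx512_vbmi2": 8,
}


def parse_flags(cpuinfo_text: str) -> Set[str]:
    """Return the CPU flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(required: Iterable[str], available: Set[str]) -> List[str]:
    """Return the required flags that the CPU does not report, sorted."""
    return sorted(set(required) - available)


def minimum_generation(required: Iterable[str]) -> str:
    """Return the lowest genX property that offers all required flags."""
    required = list(required)
    unknown = [flag for flag in required if flag not in FIRST_GENERATION]
    if unknown:
        raise KeyError(f"flags not in lookup table: {unknown}")
    return f"gen{max((FIRST_GENERATION[f] for f in required), default=5)}"


if __name__ == "__main__":
    needed = ["avx2", "avx512f"]
    print("request at least:", minimum_generation(needed))
    if os.path.exists("/proc/cpuinfo"):
        with open("/proc/cpuinfo") as handle:
            print("missing here:", missing_flags(needed, parse_flags(handle.read())))
```

The result of `minimum_generation` is the node property to append to the `-l nodes=...` request; running the script on a login node only describes the login node's CPU, not that of the compute nodes.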


Storage

  • Lustre parallel file system for /home and /sandbox
  • ≈600 TB total
  • local disk per compute node, 160–250 GB

Interconnect

  • Infiniband – used for parallel communication and storage
  • Gigabit Ethernet – used for general node access and management

Power

  • Power consumption at typical load: ≈125 kW