ARROW Cluster Under Construction
Arrow Cluster Computing Queues

There are two schedulers on Arrow: Torque and PBS. The queues in each are shown below. Several queues are currently available, some with restrictions on who may use them, as described below. Also be aware that in some queues all nodes have the same characteristics (RAM, etc.), while in others the node characteristics vary; jobs submitted to those queues must specify the names of the nodes to be used, as illustrated in the sketch below.
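
For example, with either scheduler a job selects a queue with the -q option and, in queues whose nodes differ, names specific nodes in its resource request. The script below is a minimal sketch assuming standard Torque directives; the queue name, node name, and application are illustrative placeholders, not site-mandated values.

  #!/bin/bash
  #PBS -q arrow                  # submit to the arrow queue
  #PBS -l nodes=a001:ppn=64      # request node a001 with all 64 of its cores
  #PBS -l walltime=02:00:00      # two-hour wall-clock limit
  cd $PBS_O_WORKDIR              # run from the directory the job was submitted from
  ./my_app input.dat             # hypothetical application and input file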

Torque

  • batch
  • nhtsa
  • virtual
  • arrow
  • gpu
  • extra
  • lambda
  • epyc3


PBS

  • workq
  • xeon28
  • virtual
  • a4000
  • a6000
  • batch queue (default queue)
    • 95 nodes numbered n005 through n099
    • 2 x AMD Opteron 6276
    • 16 floating point cores per node
    • 32GB of RAM per node
    • available for general use
  • batch128 queue
    • 2 nodes numbered n001 and n002
    • Same design as batch queue
    • 128GB of RAM per node
    • available for general use
  • batch64 queue
    • 2 nodes numbered n003 and n004
    • Same design as batch queue
    • 64GB of RAM per node
    • available for general use
  • nhtsa queue
    • 12 nodes numbered p001 through p012
    • 2 x Intel Xeon E5-2690 v4
    • 28 floating point cores per node
    • 64GB of RAM per node
    • only available to the NHTSA project
  • arrow queue
    • 15 nodes numbered a001 through a015
    • 1 x AMD EPYC 7702P
    • 64 floating point cores per node
    • 256GB of RAM per node, 512GB on nodes a001 through a003
    • available for general use
  • extra queue
    • 12 nodes numbered a016 through a027
    • 1 x AMD EPYC 7713P
    • 64 floating point cores per node
    • 256GB of RAM per node, 512GB on nodes a018 through a022
    • available for general use
    • note: this queue will likely be merged into the arrow queue in the future
  • virtual queue
    • 5 nodes numbered v001 through v005
    • Mostly for internal testing and validation; nodes can be used as 2-core machines with 32GB of RAM
    • Minimal virtual hardware; not capable of running engineering applications
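
Because node characteristics vary within some queues (for example, only nodes a001 through a003 in the arrow queue have 512GB of RAM), a job that needs a particular node type must name the nodes explicitly. The commands below are a minimal sketch assuming standard Torque node-list syntax and standard PBS select syntax; job.sh is a hypothetical job script.

  # Torque: request two of the 512GB arrow nodes by name
  qsub -q arrow -l nodes=a001:ppn=64+a002:ppn=64 job.sh

  # PBS: request one full 28-core node in the xeon28 queue
  qsub -q xeon28 -l select=1:ncpus=28 job.sh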