ARROW Cluster

Introduction To ARROW

TRACC has now combined the hardware from the Phoenix and Zephyr clusters into the ARROW cluster. This consolidation allows efficient administration of TRACC cluster services with limited staff. To avoid load-balancing problems across dissimilar hardware, the different types of hardware nodes on the ARROW cluster are partitioned into separate queues. When new hardware is installed to expand cluster resources, it will be made available via a new queue. The documentation at Using the Clusters describes procedures for using ARROW.
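
Which queues are defined, and their current state, can be checked directly from a login node. Below is a minimal sketch, assuming a PBS/Torque-style scheduler that provides the standard qstat command; the actual scheduler and commands on ARROW may differ, so see Using the Clusters for the supported procedure.

import subprocess

def list_queues() -> str:
    # On PBS/Torque-style schedulers, 'qstat -Q' prints a one-line summary of
    # every configured queue (name, limits, job counts, enabled/started state).
    # Assumption: ARROW's scheduler provides this command.
    result = subprocess.run(["qstat", "-Q"], capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    print(list_queues())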

ARROW is arranged such that there is a single set of login nodes, a single file system, and a single user home directory serving all of the nodes in all of the queues.

ARROW Queues

Several queues are currently available, some with restrictions on who can use them, as described below. A sketch of how a queue might be selected at job submission follows the list.

  • batch (default queue, with 90 nodes, each node with 16 floating-point cores, available for general use)
    • 92 nodes each with 32 GB of RAM
  • batch128
    • 2 nodes (n001 and n002), each with 128 GB of RAM
    • Available for general usage
  • batch65
    • 2 nodes (n003 and n004), each with 64 GB of RAM
    • Available for general usage
  • nhtsa (with 12 nodes, each with 28 cores and 64 GB of RAM, only available to the NHTSA project)
  • arrow
    • 3 EPYC 7702P nodes each with 16 logical CPUs, 4 cores each, and 512 GB of RAM (node names a001, a002, and a003)
    • 2 EPYC 7702P nodes each with 16 logical CPUs, 4 cores each, and 256 GB of RAM (node names a004 and a005)
    • The nodes are currently reserved for testing by TRACC staff or by special permission from the TRACC Director
  • virtual (This queue is only available for testing and is considered under construction. Please do not use it for now.)
  • test (This queue is only available for testing, and is only available with permission from the TRACC Director. The nodes as currently configured are not very powerful but have large amounts of RAM.)
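
To run on one of these queues, the queue is named when the job is submitted. The sketch below shows one way this is commonly done on PBS/Torque-style systems, with the queue requested through a #PBS -q directive in the job script. The scheduler, directive syntax, resource requests, and walltime shown here are assumptions rather than ARROW-specific values; check Using the Clusters for the actual submission procedure.

import subprocess
import tempfile

# Hypothetical job script targeting the default 'batch' queue. The #PBS
# directives assume a PBS/Torque-style scheduler; the node, core, and
# walltime requests are placeholders, not ARROW-specific limits.
JOB_SCRIPT = """\
#!/bin/bash
#PBS -q batch
#PBS -l nodes=1:ppn=16
#PBS -l walltime=01:00:00
cd "$PBS_O_WORKDIR"
echo "hello from the batch queue"
"""

def submit(script_text: str) -> str:
    """Write the job script to a file, hand it to qsub, and return the job ID."""
    with tempfile.NamedTemporaryFile("w", suffix=".pbs", delete=False) as f:
        f.write(script_text)
        path = f.name
    result = subprocess.run(["qsub", path], capture_output=True, text=True, check=True)
    return result.stdout.strip()

if __name__ == "__main__":
    print("Submitted job:", submit(JOB_SCRIPT))

The same request can usually be made on the command line instead, by passing the queue name directly to the submit command, which is convenient for one-off runs.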