HPC/Directories

== Overview ==
Here is a summary of key directories related to ''Carbon'', and the environment variables used to access them:


{| class="wikitable" cellpadding="5" style="text-align:left;  margin: 1em auto 1em auto;"
{| class="wikitable" cellpadding="5" style="text-align:left;  margin: 1em auto 1em auto;"
|- style="background:#eee;"
|- style="background:#eee;"
! width="150" | Environment variable
! width="220" | Environment variable
! width="150" | Typical value
! width="180" | Typical value
! Shared across nodes
! Shared across nodes?
! Notes
! Purge schedule
! Purpose
|-
|-
| <code>$HOME</code> or <code>~</code> (tilde) || /home/joe || yes || home sweet home
| '''<code>$HOME</code>''' (same as <code>~</code> in shells)
| /home/joe
| align="center" |yes
| align="center" | See [http://www.anl.gov/cnm/user-information/user-access-program#Anchor15 CNM data retention policy]
| Your main configuration files and data.
|-
|-
| <code>$SANDBOX</code> || /sandbox/joe || yes || extra storage, ''not backed up''
| '''<code>$SANDBOX</code>'''
| /sandbox/joe
| align="center" | yes
| align="center" | [[#Global scratch space|6 weeks]]
| [https://en.wikipedia.org/wiki/Scratch_space Scratch space] for transient job data, '''not backed up'''
|-
|-
| <code>$TMPDIR</code> || /tmp/pbs_mom/12345.mds01.... || no || job-specfic scratch
| '''<code>$TMPDIR</code>'''  (on login nodes)
| /tmp
| align="center" | no
| align="center" | 6 weeks
| General Unix scratch space
|-
|-
| <code>$PBS_O_WORKDIR</code> || || (yes) || the directory qsub was run in; typically used with <code>cd $PBS_O_WORKDIR </code> as first line in a job script
| '''<code>$TMPDIR</code>'''  (during jobs)
| /tmp/12345.sched1....
| align="center" | no
| align="center" | at end of job
| Job-specific scratch space, provided empty on job start.
|-
|-
<!-- | <code>$PBS_O_INITDIR</code> || || (yes) ||  the directory a job starts to run in (normally $HOME) -->
| '''<code>$PBS_O_WORKDIR</code>'''
| (directory where <code>qsub</code> was run)
| align="center" | yes
| align="center" | (same as parent file system)
| typically used as <code>cd $PBS_O_WORKDIR </code> as first line in a job script
|-
<!-- | <code>$PBS_O_INITDIR</code>
|
| (yes)
|  the directory a job starts to run in (normally $HOME)
//-->
|}
|}
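
For orientation, here is a minimal job script sketch illustrating the typical use of <code>$PBS_O_WORKDIR</code> from the last table row. The resource requests and the program name <code>./myprog</code> are placeholders, not Carbon-specific values:

 #!/bin/bash
 #PBS -l nodes=1:ppn=8      # placeholder resource request
 #PBS -l walltime=1:00:00
 #PBS -N example
 cd $PBS_O_WORKDIR          # start in the directory where qsub was run
 ./myprog > output.log      # placeholder program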


== Home directory ==
  $HOME
  ~  (tilde character)
Your home directory can be referred to in standard Unix fashion, as shown above, by either the environment variable or the tilde sign in most shells (but generally not in application programs, especially not those written in Fortran).


* Files are backed up nightly.
* Your total file volume in $HOME is subject to a (soft) [http://en.wikipedia.org/wiki/Disk_quota quota] of generally 0.5 TB.
* You may exceed the soft limit by about 10% during a grace period of one week. You will see an over-quota notice upon login.
*: If your usage remains above the soft limit beyond the grace period, the file system will appear (to you) as being full. To recover, delete files.


Your files in $HOME are subject to [http://www.anl.gov/cnm/user-information/user-access-program#Anchor15 '''CNM's Data Retention Policy'''],
which specifies that all your files may be deleted from our servers as early as '''30 days after your last active proposal''' has expired.
At that time, your access to ''Carbon'' and its SSH gateway will be revoked.
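
If you are unsure how close you are to the quota, standard Unix tools give a quick (if slow) overview; this is just a sketch, and any site-specific quota reporting command may differ:

 du -sh $HOME                        # total volume of your home directory (may take a while)
 du -sh $HOME/* | sort -h | tail     # largest top-level items, candidates for cleanup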


== Global scratch space ==
  $SANDBOX
This environment variable points to a ''user-specific'' directory, shared across nodes like the home directory.


Use this directory for short-lived files that need to be shared among multiple nodes and that may be large, numerous, or change often. To accommodate this, usage policies are stricter than for /home:


* Files are ''not backed up''.
* Hard quotas are 3 TB in volume and 2 million in file count.
* Soft quotas are 10 GB and 10,000 files.
* The grace period for overflowing a soft limit is 3 weeks.
* Files will be ''deleted automatically'' once they are older than 3 months.
 
These limits are subject to change.
 
The limits are aimed at keeping the space available for the intended use, typically for files of unusual size ([http://tvtropes.org/pmwiki/pmwiki.php/Main/RodentsOfUnusualSize F.O.U.S.]) or for small files of unusual count.
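
To see which sandbox files are approaching the automatic deletion age, here is a sketch using standard <code>find</code>; the 80-day threshold is an illustrative margin below the 3-month limit, and "important-results" is a hypothetical directory name:

 # list files not modified in roughly 80 days, i.e. approaching the purge age
 find $SANDBOX -type f -mtime +80 -ls
 # copy anything worth keeping back to $HOME before it ages out
 cp -a $SANDBOX/important-results $HOME/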
 
== Local scratch space ==
  $TMPDIR
This variable and the directory it refers to are provided by the queueing system for all processes that execute a job.
The directory:
* resides on local disk on each node,
* is ''named'' the same on each node,
* ''is not shared'' across nodes,
* ''is shared'' for processes on the ''same'' node (as many as given in "ppn=…"); in other words, the name is PBS job-specific, not Unix PID-specific,
* typically provides about 100 GB of space,
* will be wiped upon job exit on each node.
 
The environment variable TMPDIR ''is not shared'' across nodes either. Either communicate it within your program,
or [[HPC/Submitting and Managing Jobs/Example Job Script | have it exported by mpirun/mpiexec]]:
; OpenMPI:
mpirun … \
        '''-x TMPDIR''' \
        '''<nowiki>[-x OTHERVAR]</nowiki>''' \
        …
; Intel MPI:
mpiexec.hydra … \
        '''-genvlist TMPDIR'''<nowiki>[,OTHERVAR]</nowiki> \
        …
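
To confirm that processes on each node see the job-specific directory, a quick check along these lines can be run inside a job; the OpenMPI form is shown and the process count is illustrative:

 mpirun -np 2 -x TMPDIR bash -c 'echo "$(hostname): $TMPDIR"'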


=== Use in job files ===
You can use <code>$TMPDIR</code> in one or more of the following ways:


* direct your application to store its temporary files there, which is typically done by command-line switches or an environment variable such as:
  export FOO_SCRATCH=$TMPDIR
* actually run your application there:
  cd $TMPDIR
: In this case, make sure you either copy your input files there or specify full paths to $HOME or $PBS_O_WORKDIR.
* copy files back upon job termination:
  #PBS -W stageout=<font color="red">$TMPDIR/'''foo.ext'''</font>@<font color="green">localhost:$PBS_O_WORKDIR</font>
  #PBS -W stageout=<font color="red">$TMPDIR/'''*.bar'''</font>@<font color="green">localhost:$PBS_O_WORKDIR/''subdir''</font>
* The <font color="red">string on the left</font> of the <code>@</code> character names the ''source'' files in the compute node file system.
* The <font color="green">string on the right</font> gives the ''destination'' host and directory, also as seen from the compute node. This means <font color="green">localhost</font> refers to the primary compute node (rank 0 in MPI parlance). <font color="green">$PBS_O_WORKDIR</font> by default stems from what was qsub's current directory on the ''submission'' node, but ''Carbon's'' user file systems are mounted on login nodes and compute nodes under the same paths.
* You may give the <code>stageout=…</code> directive multiple times, as shown.
: In contrast to explicit trailing "cp" commands in the job script, this copy will be executed ''even if a job overruns its walltime''.  See the [http://www.clusterresources.com/torquedocs21/commands/qsub.shtml#W qsub manual] for further information.
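
Putting these pieces together, here is a sketch of a complete job script that stages data through <code>$TMPDIR</code>; the file names, resource requests, and <code>./myprog</code> are placeholders:

 #!/bin/bash
 #PBS -l nodes=1:ppn=8
 #PBS -l walltime=4:00:00
 #PBS -N tmpdir-example
 #PBS -W stageout=$TMPDIR/results.dat@localhost:$PBS_O_WORKDIR
 cd $TMPDIR                          # run in node-local scratch
 cp $PBS_O_WORKDIR/input.dat .       # bring inputs from the submission directory
 ./myprog input.dat > results.dat    # placeholder application
 cp results.dat $PBS_O_WORKDIR/      # explicit copy back; the stageout directive above also covers walltime overruns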
 
<!--
== Local RAM disk ==
(2020-05-21: to be added)
[https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html XDG Base Directory Specification]  <code>$XDG_RUNTIME_DIR</code> – intended for small files.
-->
 
<!--
<br>
<hr>
(*) Lustre is a parallel file system that allows concurrent and coherent file access at high data rates.
-->
