HPC/FAQ: Difference between revisions
Revision as of 17:26, May 19, 2014
To report a general problem
Send mail to [email protected].
- Choose an appropriate subject line. (Do not merely hit "Reply" on a previous unrelated message.)
- Be as specific as possible – make sure your request includes the following:
- The command you are trying to run or the menu item you choose.
- The exact error message you get. Copy & Paste the message text, or take a screenshot and attach it to your mail.
- Review your subject line again before you submit the message.
- To help us diagnose a problem, read How to Report Bugs Effectively, by Simon Tatham, programmer.
When you get our response:
- Carefully read it.
- Follow all instructions. Do not skip steps.
- Answer all questions. Do not ignore questions.
- For account- and password-related issues, it may take several hours for changes to take effect. If your initial attempt fails, wait at least that long before retrying.
For deeper reading, consider the following:
- How To Ask Questions The Smart Way, by Eric S. Raymond, open source pioneer.
"I cannot log in" or "My password does not work"
- Check host names
- Make sure you connect to the correct host name, which is mega.cnm.anl.gov for the SSH gateway and clogin.cnm.anl.gov when connecting from an onsite work computer or over VPN. Do not use "carbon". -- See HPC/Network Access.
- Check eligibility
- Review and renew your User Registration details at https://beam.aps.anl.gov/pls/apsweb/ufr_main_pkg.usr_start_page . Background: If you are not a US citizen, you require a current visa for Argonne computer access, as if you were visiting in person. When previous registration items you may have had at Argonne expire, your computer account will be disabled. This may unfortunately happen in the middle of your proposal's lifetime, and you may suddenly find that you can no longer access mega.
- Review your proposal's expiration time – ask your PI to search his or her email archive for mails with "Work Approval Received" or "Proposal Expiration" in the subject. These are the emails by which we communicate proposal status to the principal investigator (PI). Access to mega requires an active proposal. Access will be disabled approximately 6 weeks after your last active proposal expired.
- Note that it may take about 1–3 hours for a new proposal to be recognized as active again on mega.
- Access to mega requires that the User Work Submittal for a proposal contain your badge number. If that was left empty at original submission (such as when you are a newly registered user), ask the User Office or your Scientific Contact to augment and resubmit the form.
- Verify your password
- Visit https://credentials.anl.gov/ and verify that your username and password are correct.
- Request a password reset
- To have your password reset, email the CNM User Office at [email protected].
- When you connect to mega while your temporary password is still in place, mega will ask for a new password. You can safely change your password at this point.
- You can also change your password at https://credentials.anl.gov/ . However, a change there will take a few hours to become active on mega.
- Review instructions
- Reread HPC/Network Access, and follow the instructions for your platform.
- Request help for general network or connectivity issues
- Contact the NST IT help desk at [email protected] .
- Be as specific as possible – make sure your request includes the following:
- The command you are trying to run or the menu item you choose.
- The hostname you're trying to reach.
- The username you use to connect.
- The exact error message you get. Copy & Paste the message text, or take a screenshot and attach it to your mail.
- The software name and version you use to connect (e.g. SSH, VNC, or a browser).
- Your operating system and version.
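The items above that describe your own machine can be gathered with a short script. The following is only a sketch: the function name collect_report is hypothetical, and it assumes a Unix-like client with a POSIX shell. The command you ran and the exact error message still have to be added by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of Carbon): print the connection details
# that a help request should contain.
# Usage: collect_report HOSTNAME [REMOTE_USERNAME]
collect_report() {
    target_host=$1
    # Default to the local login name if no remote username is given.
    remote_user=${2:-$(id -un)}
    echo "Hostname I am trying to reach: $target_host"
    echo "Username I use to connect: $remote_user"
    # OpenSSH prints its version on stderr, hence the 2>&1.
    if command -v ssh >/dev/null 2>&1; then
        echo "Client software: $(ssh -V 2>&1)"
    else
        echo "Client software: (ssh not found; state your client and its version)"
    fi
    echo "Operating system: $(uname -sr)"
}

collect_report mega.cnm.anl.gov
```

Paste the output into your mail, followed by the command and the error message.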
"I'd like to use program X"
- Check if the application is already available on Carbon
Either:
- Browse the Application Catalog, or
- View the catalog on the Carbon command line:
module avail
module -l avail 2>&1 | less
- The second form gives you browsable output.
- If you cannot find the application
- Describe the problem you are trying to solve – it may well be that we can suggest an alternative solution.
- Provide one or more URLs relevant to software you have in mind – be specific.
"How do I run program X?"
- Customize your shell environment to load the application module.
- Learn about module conventions on Carbon.
- To determine the names of a package's executable scripts and binaries, inspect its $NAME_HOME/bin directory. For instance, for the Quantum-ESPRESSO package:
ls $QUANTUM_ESPRESSO_HOME/bin
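The Quantum-ESPRESSO example suggests how the variable name relates to the package name. The sketch below assumes, without confirmation from the module conventions page, that the name is the package name in upper case with hyphens turned into underscores, followed by _HOME. The function home_var_name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: guess the name of a package's HOME variable.
# ASSUMPTION: upper-case the package name, replace "-" by "_", append "_HOME".
home_var_name() {
    printf '%s_HOME\n' "$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr '-' '_')"
}

home_var_name Quantum-ESPRESSO   # prints QUANTUM_ESPRESSO_HOME
```

If the guess does not match, run env | grep _HOME after loading the module to see the variable names that are actually set.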
"How do I use program X?"
Read the package's documentation, using one or more of the following:
- Inspect the package's $NAME_HOME/share or $NAME_HOME/doc directory on Carbon (see module conventions).
- Browse the package's web page, generally mentioned in the module help text or the application catalog entry.
- Consult a package's man pages. Few packages have them. Man page files are generally installed under $NAME_HOME/man or $NAME_HOME/share/man and, if so, will be made available automatically to the man command.
What's my account balance?
Simple answer: mybalance
To find out how many core-hours you have available, the simplest command to run is:
mybalance -h
Project  Machines    Balance
-------- -------- ----------
user     ANY          993.26
cnm34567 ANY       158760.93
cnm31234 ANY      -148893.62
The table lists all the Projects you have access to (for use with the qsub -A argument) and their balances.
Machines lists all systems that can book jobs against your allocations. Carbon is currently the only machine that can do so.
Balance is your account balance, in core-hours, as selected by the -h command option. This is the most useful and recommended unit. Without -h, you get core-seconds, which are integers but rather more unwieldy numbers.
- The "user" project provides you with a small initial startup allocation of typically 1000 core-hours.
- When a Balance is reported as negative, that account typically has a CreditLimit assigned, which permits the balance to dip below zero. These details, however, are not shown by mybalance.
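If you have a figure in core-seconds (output without -h), it converts to core-hours by dividing by 3600. A minimal sketch follows. The function name core_seconds_to_hours is hypothetical, and the input value is chosen only to match the 993.26 core-hours in the example table.

```shell
#!/bin/sh
# Hypothetical helper: convert core-seconds to core-hours (1 hour = 3600 s).
core_seconds_to_hours() {
    awk -v s="$1" 'BEGIN { printf "%.2f\n", s / 3600 }'
}

core_seconds_to_hours 3575736   # prints 993.26
```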
Complete answer: gbalance
To get allocation details for accounts that have CreditLimits, run the gbalance command. Pass -u username or -p projectname to select your allocations:
gbalance -h -u $USER
- Use the literal string $USER, which makes the shell fill in your actual username.
The output looks like:
Id  Name         Amount Reserved    Balance CreditLimit Available
--- -------- ---------- -------- ---------- ----------- ---------
100 cnm31234 -148893.62     0.00 -148893.62   150000.00   1106.38
217 kpelzer      993.26     0.00     993.26        0.00    993.26
123 cnm34567  166440.93  7680.00  158760.93        0.00 158760.93
The most relevant column for you is Available. The units, given the -h option, are again core-hours.
The columns and their meanings are:
- Id
- An internal number for the account.
- Name
- The project name (for use with qsub -A or #PBS -A).
- Amount
- Amount for transactions completely on the books for the project account; does not include running jobs or credits. Deposits are allocated by the User Office and implemented by the Carbon administrator.
- Reserved
- Amounts held in reserve by all running jobs using this account. The reserve ensures that a job does not cause an overdraft when it finishes and its actual use is booked. The quantity is calculated as walltime * number of cores blocked. When a job terminates, the charge according to the actual time used will be subtracted from Amount, and the unused quantities will be re-added to Amount.
- Balance
- Available for new jobs; may go negative if CreditLimits are in place.
Balance = Amount - Reserved
- CreditLimit
- Amount by which Balance may go negative; assigned by the Carbon administrator.
- Available
- Relevant quantity for new jobs. Must be positive for a new job to start, and large enough to Reserve the entire job.
Available = Balance + CreditLimit
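The two formulas and the reserve rule can be checked against the example table with a little arithmetic. The sketch below is illustrative only. The function names available, job_reserve and can_start are hypothetical, and it assumes walltime is given in hours so that the result is in core-hours, as with -h. It does not query Carbon's accounting system.

```shell
#!/bin/sh
# Hypothetical helpers that reproduce the column arithmetic described above.

# available AMOUNT RESERVED CREDITLIMIT
# Balance = Amount - Reserved; Available = Balance + CreditLimit
available() {
    awk -v a="$1" -v r="$2" -v c="$3" 'BEGIN { printf "%.2f\n", (a - r) + c }'
}

# job_reserve WALLTIME_HOURS CORES
# Reserve = walltime * number of cores blocked (ASSUMPTION: walltime in hours)
job_reserve() {
    awk -v w="$1" -v n="$2" 'BEGIN { printf "%.2f\n", w * n }'
}

# can_start AVAILABLE WALLTIME_HOURS CORES
# Succeeds if Available is positive and covers the reserve for the whole job.
can_start() {
    awk -v av="$1" -v w="$2" -v n="$3" 'BEGIN { exit !(av > 0 && av >= w * n) }'
}

available -148893.62 0.00 150000.00   # prints 1106.38   (row cnm31234)
available 166440.93 7680.00 0.00      # prints 158760.93 (row cnm34567)
job_reserve 24 16                     # prints 384.00
```

For instance, can_start 1106.38 24 16 succeeds, because a 24-hour job on 16 cores reserves 384 core-hours, whereas can_start 1106.38 100 16 fails, because that job would need 1600 core-hours.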