Introduction to Using High Performance Computing¶
Introduction to High Performance Computing¶
In this session, we will learn the basics of high performance computing (HPC) at TACC, including system architecture, TACC HPC system resources, and best practices when using TACC systems.
Objectives for this Session¶
- Learn basic HPC system architecture
- Learn the difference between login and compute nodes
- Learn best practices when using a shared resource
- Learn about the disk/file system resources on TACC HPC systems
Introduction to High Performance Computing¶
Basic High Performance Computing (HPC) System Architecture¶
As you prepare to use TACC systems for this institute, it is important to understand the basic architecture. Think of an HPC resource as a very large and complicated lab instrument. Users need to learn how to:
- Interface with it / push the right buttons (Linux)
- Load samples (data)
- Run experiments (jobs)
- Interpret the results (data analysis / vis)

Login vs. Compute Nodes
As we’ve discussed, an HPC system has login nodes and compute nodes. We cannot run applications on the login nodes because they require too many resources and will interrupt the work of others. Instead, we must submit a job to a queue to run on compute nodes.
Tips for Success¶
Read the documentation.
- Learn node schematics, limitations, file systems, rules
- Learn about the scheduler, queues, policies
- Determine the right resource for the job
What are the disk/file system resources on TACC HPC systems?¶

A typical HPC cluster schematic at TACC is outlined above.
For example, Stampede2 uses:¶
/home1
- LustreFS
- ~1 PB overall capacity
- each user has 10 GB quota
- ENV VAR:
$HOME
/scratch
- LustreFS
- ~30 PB overall capacity
- Unlimited quota, but 10 day limit
- ENV VAR:
$SCRATCH
/work
- LustreFS
- 20 PB global share work file system
- each user has 1 TB quota
- ENV VAR:
$WORK
/tmp
Each compute node has a local /tmp
directory. /tmp
can be used to read/write files that do not need to be accessed by other tasks. Data stored in /tmp
is temporary, and lasts only for the duration of your job.
- Stampede2 SKX: 144 GB storage per compute node
- Stampede2 KNL: 107 GB storage per compute node for
normal
/large
, 32 GB storage per compute node fordevelopment
TACC HPC Storage Systems¶
/corral-repl
- GPFS/NFS
- 6x2 PB
- 6 PB in each of Austin and Arlington
- typical quota: 5 TB but varies based on need
- folder path provided upon allocation (ex.
/corral-repl/TACC/bio/ECR
)
RANCH TAPE Robot
- ~160 PB backing tape store
- quota varies based on need, though users are typically limited to 2 TB
- ENV VAR:
$ARCHIVER
,$ARCHIVE
Stockyard¶
Global File System
- The
$WORK
filesystem on Stockyard helps knit TACC HPC Systems together - Files on
$WORK
are present on most systems - 20 PB of storage capacity
- 100+ gigabytes per second of aggregate bandwidth
Getting Set Up¶
To log in to Stampede2, follow the instructions for your operating system below.
Mac / Linux¶
Open the application 'Terminal'
ssh username@stampede2.tacc.utexas.edu
(enter password)
(enter 6-digit token)
Windows¶
Windows users will need to install PuTTY to follow along. If you have not done so already, download the PuTTY “Windows Installer” here.
Once PuTTY is installed:
- Double-click the PuTTY icon
- In the PuTTY Configuration window make sure the Connection type is SSH
- enter
stampede2.tacc.utexas.edu
for Host Name - click “Open”
- answer “Yes” to the SSH security question
In the PuTTY terminal:
- enter your TACC user id after the “login as:” prompt, then Enter
- enter the password associated with your TACC account
- enter your 6-digit TACC token value
Open the application 'PuTTY'
enter Host Name: stampede2.tacc.utexas.edu
(click 'Open')
(enter username)
(enter password)
(enter 6-digit token)
Successful Login to Stampede2¶
If your login was successful, your terminal will look something like this:
------------------------------------------------------------------------------
Welcome to the Stampede2 Supercomputer
Texas Advanced Computing Center, The University of Texas at Austin
------------------------------------------------------------------------------
** Unauthorized use/access is prohibited. **
If you log on to this computer system, you acknowledge your awareness
of and concurrence with the UT Austin Acceptable Use Policy. The
University will prosecute violators to the full extent of the law.
TACC Usage Policies:
http://www.tacc.utexas.edu/user-services/usage-policies/
______________________________________________________________________________
Welcome to Stampede2, *please* read these important system notes:
--> Stampede2 user documentation is available at:
https://portal.tacc.utexas.edu/user-guides/stampede2
---------------------- Project balances for user ancantu ----------------------
| Name Avail SUs Expires | Name Avail SUs Expires |
| Project1 ##### 20XX-0X-XX | Project3 #### 20XX-0X-XX |
| Project2 ##### 20XX-0X-XX | Project4 #### 20XX-0X-XX |
------------------------ Disk quotas for user ancantu -------------------------
| Disk Usage (GB) Limit %Used File Usage Limit %Used |
| /home1 0.0 10.0 0.00 23 200000 0.01 |
| /work 12.6 1024.0 1.23 3131 3000000 0.10 |
| /scratch 0.0 0.0 0.00 4 0 0.00 |
-------------------------------------------------------------------------------
A Note About Quotas¶
The welcome message you receive upon successful login to Stampede2 has useful information for you to keep track of. Especially of note is the breakdown of disk quotas for your account, as you can keep an eye on whether your usage is nearing the determined limit.
Once your usage is nearing the quota, you’ll start to experience issues that will not only impact your own work, but also impact the system for others. For example, if you’re nearing your quota in $WORK
, and your job is repeatedly trying (and failing) to write to $WORK
, you will stress that file system.
Another useful way to monitor your disk quotas (and TACC project balances) at any time is to execute:
login1$ /usr/local/etc/taccinfo # Generally more current than balances displayed on the portals.
Introduction to the Command Line¶
Set Up Reminder¶
To log in to Stampede2, follow the instructions for your operating system below.
Mac / Linux
Open the application 'Terminal'
ssh username@stampede2.tacc.utexas.edu
(enter password)
(enter 6-digit token)
Windows
Open the application 'PuTTY'
enter Host Name: stampede2.tacc.utexas.edu
(click 'Open')
(enter username)
(enter password)
(enter 6-digit token)
Module Objectives¶
This module will be fully interactive. Participants are strongly encouraged to follow along on the command line. Even if you already have command line familiarity, please follow along because we will create files / directories that we use later on in the session. After completing this module, participants should be able to:
- Describe basic functions of essential Linux commands
- Use Linux commands to navigate a file system and manipulate files
- Edit files directly on a Linux system using a command line utility (e.g. vim, nano, emacs)
- Transfer data to a remote Linux file system
- Print, identify, and modify environment variables
Topics Covered¶
- Creating and changing folders (
pwd
,ls
,mkdir
,cd
,rmdir
)- Creating and manipulating files (
touch
,rm
,mv
,cp
)- Looking at the Contents of files (
cat
,more
,less
,head
,tail
,grep
)- Text editing with vim (insert mode, normal mode, navigating, saving, quitting)
- Network and file transfers (
hostname
,whoami
,logout
,ssh
,scp
,rsync
)
Creating and Changing Folders¶
On a Windows or Mac desktop, our present location determines what files and folders we can access. I can “see” my present location visually with the help of the graphic interface - I could be looking at my Desktop, or the contents of a folder, for example. In a Linux command-line interface, we lack the same visual queues to tell us what our location is. Instead, we use a command - pwd
(print working directory) - to tell us our present location. Try executing this command on Stampede2:
$ pwd
/home1/03439/wallen
This home location on the Linux filesystem is unique for each user, and it is roughly analogous to C:\Users\username on Windows, or /Users/username on Mac.
To see what files and folders are available at this location, use the ls
(list) command:
$ ls
I have no files or folders in my home directory yet, so I do not get a response. We can create some folders using the mkdir
(make directory) command. The words ‘folder’ and ‘directory’ are interchangeable:
$ mkdir folder1
$ mkdir folder2
$ mkdir folder3
$ ls
folder1 folder2 folder3
Now we have some folders to work with. To “open” a folder, navigate into that folder using the cd
(change directory) command. This process is analogous to double-clicking a folder on Windows or Mac:
$ pwd
/home/03439/wallen/
$ cd folder1
$ pwd
/home1/03439/wallen/folder1
Now that we are inside folder1
, make a few sub-folders:
$ mkdir subfolderA
$ mkdir subfolderB
$ mkdir subfolderC
$ ls
subfolderA subfolderB subfolderC
Use cd
to Navigate into subfolderA
, then use ls
to list the contents. What do you expect to see?
$ cd subfolderA
$ pwd
/home/03439/wallen/folder1/subfolderA
$ ls
There is nothing there because we have not made anything yet. Next, we will navigate back to the home directory. So far we have seen how to navigate “down” into folders, but how do we navigate back “up” to the parent folder? There are different ways to do it. For example, we could specify the complete path of where we want to go:
$ pwd
/home1/03439/wallen/folder1/subfolderA
$ cd /home1/03439/wallen/folder1
$ pwd
/home1/03439/wallen/folder1/
Or, we could use a shortcut, ..
, which refers to the parent folder - one level higher than the present location:
$ pwd
/home1/03439/wallen/folder1
$ cd ..
$ pwd
/home1/03439/wallen
We are back in our home directory. Finally, use the rmdir
(remove directory) command to remove folders. This will not work on folders that have any contents (more on this later):
$ mkdir junkfolder
$ ls
folder1 folder2 folder3 junkfolder
$ rmdir junkfolder
$ ls
folder1 folder2 folder3
A bonus command available on some Linux operating systems is called tree
. The tree
command displays files and folders in a hierarchical view. Use another Linux shortcut, .
, to indicate that you want to list files and folders in your present location.
$ tree .
.
|-- folder1
| |-- subfolderA
| |-- subfolderB
| `-- subfolderC
|-- folder2
`-- folder3
Before we move on, let’s remove the directories we have made, using rm -r
to remove our parent folder folder1
and its subfolders. The -r
command line option recursively removes subfolders and files located “down” the parent directory. -r
is required for non-empty folders.
$ rm -r folder1
$ ls
folder2 folder3
Which command should we use to remove folder2
and folder3
?
$ rmdir folder2
$ rmdir folder3
$ ls
Why could we use rmdir
on folder2
and folder3
, but not on folder1
?
Exercise¶
- Navigate to your home directory
- Make a new folder called
challenge01
- Navigate into that new folder
- Make 5 sub folders called
a
,b
,c
,d
,e
- Within each of those sub folders, make 5 sub folders called
1
,2
,3
,4
,5
- Navigate back to your home directory and print a hierarchical view of the
challenge01
folder
Review of Topics Covered¶
Command | Effect |
---|---|
pwd |
print working directory |
ls |
list files and directories |
ls -l |
list files in column format |
mkdir dir_name/ |
make a new directory |
cd dir_name/ |
navigate into a directory |
rmdir dir_name/ |
remove an empty directory |
rm -r dir_name/ |
remove a directory and its contents |
tree |
list files and directories hierarchically |
. or ./ |
refers to the present location |
.. or ../ |
refers to the parent directory |
Creating and Manipulating Files¶
We have seen how to navigate around the filesystem and perform operations with folders. But, what about files? Just like on Windows or Mac, we can easily create new files, copy files, rename files, and move files to different locations. First, we will navigate to the home directory and create a few new folders and files with the mkdir
and touch
commands:
$ cd # cd on an empty line will automatically take you back to the home directory
$ pwd
/home1/03439/wallen
$ mkdir folder1
$ mkdir folder2
$ mkdir folder3
$ touch file_a
$ touch file_b
$ touch file_c
$ ls
file_a file_b file_c folder1 folder2 folder3
These files we have created are all empty. Removing a file is done with the rm
(remove) command. Please note that on Linux file systems, there is no “Recycle Bin”. Any file or folder removed is gone forever and often un-recoverable:
$ touch junkfile
$ rm junkfile
Moving files with the mv
command and copying files with the cp
command works similarly to how you would expect on a Windows or Mac machine. The context around the move or copy operation determines what the result will be. For example, we could move and/or copy files into folders:
$ mv file_a folder1/
$ mv file_b folder2/
$ cp file_c folder3/
Before listing the results with ls
or tree
, try to guess what the result will be.
$ tree .
.
|-- file_c
|-- folder1
| `-- file_a
|-- folder2
| `-- file_b
`-- folder3
`-- file_c
Two files have been moved into folders, and file_c
has been copied - so there is still a copy of file_c
in the home directory. Move and copy commands can also be used to change the name of a file:
$ cp file_c file_c_copy
$ mv file_c file_c_new_name
By now, you may have found that Linux is very unforgiving with typos. Generous use of the <Tab>
key to auto-complete file and folder names, as well as the <UpArrow>
to cycle back through command history, will greatly improve the experience. As a general rule, try not to use spaces or strange characters in files or folder names. Stick to:
A-Z # capital letters
a-z # lowercase letters
0-9 # digits
- # hyphen
_ # underscore
. # period
Before we move on, let’s clean up once again by removing the files and folders we have created. Do you remember the command for removing non-empty folders?
$ rm -r folder1
$ rm -r folder2
$ rm -r folder3
How do we remove file_c_copy
and file_c_new_name
?
$ rm file_c_copy
$ rm file_c_new_name
Exercise¶
- Navigate to your home directory
- Execute this exact command:
cp -r /work/03439/wallen/public/challenge02 ./
- Navigate into the
challenge02
folder - Somewhere within there is a file. Can you find it?
Review of Topics Covered¶
Command | Effect |
---|---|
touch file_name |
create a new file |
rm file_name |
remove a file |
rm -r dir_name/ |
remove a directory and its contents |
mv file_name dir_name/ |
move a file into a directory |
mv old_file new_file |
change the name of a file |
mv old_dir/ new_dir/ |
change the name of a directory |
cp old_file new_file |
copy a file |
cp -r old_dir/ new_dir/ |
copy a directory |
<Tab> |
autocomplete file or folder names |
<UpArrow> |
cycle through command history |
Looking at the Contents of Files¶
Everything we have seen so far has been with empty files and folders. We will now start looking at some real data. Navigate to your home directory, then issue the following cp
command to copy a “tar archive” file from me:
$ cd ~ # the tilde ~ is also a shortcut referring to your home directory
$ pwd
/home1/03439/wallen
$ cp /work/03439/wallen/public/IntroToLinuxHPC.tar .
Try to use <Tab>
to autocomplete the name of the file. Do NOT change the username or lustre number to your username and lustre number - in this case you are copying a file from my directory to your directory. Also, please notice the single dot .
at the end of the copy command, which indicates that you want to cp the tar archive to .
, this present location (your home directory).
This archive file is actually a bundle of many files and folders, all packed into one nice, easily transportable object. Once it is copied over, un-pack the file with the following command:
$ ls
IntroToLinuxHPC.tar
$ tar -xvf IntroToLinuxHPC.tar
In the first lab (Lab01
), we have a file called websters.txt
. This a list of all the words in Webster’s Dictionary. But how can we see the contents of the file?
$ cd IntroToLinuxHPC
$ cd Lab01
$ pwd
/home1/03439/wallen/IntroToLinuxHPC/Lab01
$ ls
README websters.txt
If you want to see the contents of a file, use the cat
command to print it to screen:
$ cat websters.txt
A
a
aa
aal
aalii
aam
Aani
aardvark
This is a long file! Printing everything to screen is much too fast and not very useful. We can use a few other commands to look at the contents of the file with more
control:
$ more websters.txt
Press the <Enter>
key to scroll through line-by-line, or the <Space>
key to scroll through page-by-page. Press q
to quit the view, or <Ctrl+c>
to force a quit if things freeze up. A %
indicator at the bottom of the screen shows your progress through the file. This is still a little bit messy and fills up the screen. The less
command has the same effect, but is a little bit cleaner:
$ less websters.txt
Scrolling through the data is the same, but now we can also search the data. Press the /
forward slash key, and type a word that you would like to search for. The screen will jump down to the first match of that word. The n
key will cycle through other matches, if they exist.
Finally, you can view just the beginning or the end of a file with the head
and tail
commands. For example:
$ head websters.txt
$ tail websters.txt
The >
and >>
shortcuts in Linux indicate that you would like to redirect the output of one of the commands above. Instead of printing to screen, the output can be redirected into a file:
$ cat websters.txt > websters_new.txt
$ head websters.txt > first_10_lines.txt
A single greater than sign >
will redirect and overwrite any contents in the target file. A double greater than sign >>
will redirect and append any output to the end of the target file.
One final useful way to look at the contents of files is with the grep
command. grep
searches a file for a specific pattern, and returns all lines that match the pattern. For example:
$ grep "banana" websters.txt
banana
cassabanana
Although it is not always necessary, it is safe to put the search term in quotes. More on grep
later.
Exercise¶
- Extract every word from
websters.txt
that contains the stringapple
, and put it into a new file calledapple.txt
. - Extract every word from
websters.txt
that contains the stringcarrot
, and put it into a new file calledcarrot.txt
. - Extract every word from
websters.txt
that contains the stringcheese
, and put it into a new file calledcheese.txt
. - Examine the contents of
apple.txt
,carrot.txt
, andcheese.txt
to make sure they contain what you expect. - Concatenate all three lists into a new file called
food.txt
.
Review of Topics Covered¶
Command | Effect |
---|---|
cat file_name |
print file contents to screen |
cat file_name >> new_file |
redirect output to new file |
more file_name |
scroll through file contents |
less file_name |
scroll through file contents |
head file_name |
output beginning of file |
tail file_name |
output end of a file |
grep pattern file_name |
search for ‘pattern’ in a file |
~/ |
shortcut for home directory |
<Ctrl+c> |
force interrupt |
> |
redirect and overwrite |
>> |
redirect and append |
Text Editing with VIM¶
VIM is a text editor used on Linux file systems.
Open a file (or create a new file if it does not exist):
$ vim file_name
There are two “modes” in VIM that we will talk about today. They are called “insert mode” and “normal mode”. In insert mode, the user is typing text into a file as seen through the terminal (think about typing text into TextEdit or Notepad). In normal mode, the user can perform other functions like save, quit, cut and paste, find and replace, etc. (think about clicking the menu options in TextEdit or Notepad). The two main keys to remember to toggle between the modes are i
and Esc
.
Entering VIM insert mode:
> i
Entering VIM normal mode:
> Esc
A summary of the most important keys to know for normal mode are (more on your cheat sheet):
# Navigating the file:
arrow keys move up, down, left, right
Ctrl+u page up
Ctrl+d page down
0 move to beginning of line
$ move to end of line
gg move to beginning of file
G move to end of file
:N move to line N
# Saving and quitting:
:q quit editing the file
:q! quit editing the file without saving
:w save the file, continue editing
:wq save and quit
- For more information, see:
- http://openvim.com/
- http://vim-adventures.com/
- Or type on the command line:
vimtutor
Exercise¶
- Navigate to the
Lab03
directory. - Open up the file called
tutorial.txt
and follow the instructions within. - Create a new file called
my_name.txt
, write your name within the file, save and quit, then print the contents of the file to screen.
Review of Topics Covered¶
Command | Effect |
---|---|
vim file.txt |
open “file.txt” and edit with vim |
i |
toggle to insert mode |
<Esc> |
toggle to normal mode |
<arrow keys> |
navigate the file |
:q |
quit ending the file |
:q! |
quit editing the file without saving |
:w |
save the file, continue editing |
:wq |
save and quit |
Network and File Transfers¶
In order to login or transfer files to a remote Linux file system, you must know the hostname (unique network identifier) and the username. If you are already on a Linux file system, those are easy to determine using the following commands:
$ whoami
wallen
$ hostname -f
login4.stampede2.tacc.utexas.edu
Given that information, a user would remotely login to this Linux machine using the Terminal command:
$ ssh wallen@stampede2.tacc.utexas.edu
(enter password)
(enter token)
Windows users would typically use the program PuTTY to perform this operation. Logging out of a remote system is done using the logout
command, or the shortcut <Ctrl+d>
:
[stampede2]$ logout
[local]$
To practice transferring files to Stampede2’s $WORK
and $SCRATCH
, we need to identify the path to our $WORK
and $SCRATCH
directory. To identify these paths, we can use helpful command shortcuts.
To identify the path to our $WORK
directory, we can use cd $WORK
or the helpful shortcut cdw
:
$ cdw
$ pwd
/work/03439/wallen/stampede2
To identify the path to our $SCRATCH
directory, we can use cd $SCRATCH
or the helpful shortcut cds
:
$ cds
$ pwd
/scratch/03439/wallen
Copying files from your local computer to Stampede2’s $WORK
would require the scp
command (Windows users use the program “WinSCP”):
[local]$ scp my_file wallen@stampede2.tacc.utexas.edu:/work/03439/wallen/stampede2
(enter password)
(enter token)
In this command, you specify the name of the file you want to transfer (my_file
), the username (wallen
), the hostname (stampede2.tacc.utexas.edu
), and the path you want to put the file (/work/03439/wallen/stampede2
). Take careful notice of the separators including spaces, the @ symbol, and the colon.
Copying files from your local computer to Stampede2’s $SCRATCH
using scp
:
[local]$ scp my_file wallen@stampede2.tacc.utexas.edu:/scratch/03439/wallen
(enter password)
(enter token)
Copy files from Stampede2 to your local computer using the following:
[local]$ scp wallen@stampede2.tacc.utexas.edu:/work/03439/wallen/stampede2/my_file ./
(enter password)
(enter token)
Note: If you wanted to copy my_file
from $SCRATCH
, the path you would specify after the colon would be /scratch/03439/wallen/my_file
.
Instead of files, full directories can be copied using the “recursive” flag (scp -r ...
). The rsync
tool is an advanced copy tool that is useful for synching data between two sites. Although we will not go into depth here, example rsync
usage is as follows:
$ rsync -azv local remote
$ rsync -azv remote local
This is just the basics of copying files. See example scp
usage here and example rsync
usage here.
Exercise¶
- Identify which Stampede2 login node you are on (login1, login2, login3)
- Remotely login to a different Stampede2 login node and list what files are available.
- Logout until you are back to your original login node.
- Make your own
my_file
on your local computer using knowledge from our previous sections and copymy_file
to your$WORK
file system on Stampede2
Review of Topics Covered¶
Command | Effect |
---|---|
hostname -f |
print hostname |
whoami |
print username |
ssh username@hostname |
remote login |
logout |
logout |
cd $WORK , cdw |
navigate to $WORK file system |
cd $SCRATCH , cds |
navigate to $SCRATCH file system |
scp local remote |
copy a file from local to remote |
scp remote local |
copy a file from remote to local |
rsync -azv local remote |
sync files between local and remote |
rsync -azv remote local |
sync files between remote and local |
<Ctrl+d> |
logout of host |
Running Jobs¶
Module Objectives¶
This module will be fully interactive. Participants are strongly encouraged to follow along on the command line. After taking this module, participants should be able to:
- Prepare and submit a batch job to a queue
- Initiate an interactive session using idev
Topics Covered¶
- Batch job submission, commands dependent on system (
sbatch
,showq
,scancel
).- Interactive sessions with
idev
(idev
,idev -help
)
Batch Job Submission¶
As we discussed before, on Stampede2 there are login nodes and compute nodes.

We cannot run the applications we need for our research on the login nodes because they require too many resources and will interrupt the work of others. Instead, we must write a short text file containing a list of the resources we need, and containing the command(s) for running the application. Then, we submit that text file to a queue to run on compute nodes. This process is called batch job submission.
There are several queues available on Stampede2. It is important to understand the queue limitations and pick a queue that is appropriate for your job. Documentation can be found here. Today, we will be using the development
queue which has a max runtime of 2 hours, and users can only submit one job at a time.
First, navigate to the Lab04
directory where we have an example job script prepared, called job.slurm
:
$ cd
$ cd IntroToLinuxHPC/Lab04
$ cat job.slurm
#!/bin/bash
#----------------------------------------------------
# Example SLURM job script to run applications on
# TACCs Stampede2 system.
#----------------------------------------------------
#SBATCH -J # Job name
#SBATCH -o # Name of stdout output file (%j expands to jobId)
#SBATCH -p # Queue name
#SBATCH -N # Total number of nodes requested (16 cores/node)
#SBATCH -n # Total number of mpi tasks requested
#SBATCH -t # Run time (hh:mm:ss)
#SBATCH -A # <-- Allocation name to charge job against
# Everything below here should be Linux commands
First, we must know an application we want to run, and a research question we want to ask. This generally comes from your own research. For this example, we want to use the application called autodock_vina
to check how well a small molecule ligand fits within a protein binding site. All the data required for this job is in a subdirectory called data
:
$ pwd
/home1/03439/wallen/IntroToLinuxHPC/Lab04
$ ls
data job.slurm results
$ ls data/
configuration_file.txt ligand.pdbqt protein.pdbqt
$ ls results/
# nothing here yet
Next, we need to fill out job.slurm
to request the necessary resources. I have some experience with autodock_vina
, so I can reasonably predict how much we will need. When running your first jobs with your applications, it will take some trial and error, and reading online documentation, to get a feel for how many resources you should use. Open job.slurm
with VIM and fill out the following information:
#SBATCH -J vina_job # Job name
#SBATCH -o vina_job.o%j # Name of stdout output file (%j expands to jobId)
#SBATCH -p development # Queue name
#SBATCH -N 1 # Total number of nodes requested (16 cores/node)
#SBATCH -n 1 # Total number of mpi tasks requested
#SBATCH -t 00:10:00 # Run time (hh:mm:ss)
#SBATCH -A TRAINING-HPC # <-- Allocation name to charge job against
Now, we need to provide instructions to the compute node on how to run autodock_vina
. This information would come from the autodock_vina
instruction manual. Continue editing job.slurm
with VIM, and add this to the bottom:
# Everything below here should be Linux commands
echo "starting at:"
date
module list
module load intel/17.0.4
module load boost/1.64
module load autodock_vina/1.1.2
module list
cd data/
vina --config configuration_file.txt --out ../results/output_ligands.pdbqt
echo "ending at:"
date
The way this job is configured, it will print a starting date and time, load the appropriate modules, run autodock_vina
, write output to the results/
directory, then print the ending date and time. Keep an eye on the results/
directory for output. Once you have filled in the job description, save and quit the file. Submit the job to the queue using the sbatch
command`:
$ sbatch job.slurm
To view the jobs you have currently in the queue, use the showq
or squeue
commands:
$ showq -u
$ showq # shows all jobs by all users
$ squeue -u $USERNAME
$ squeue # shows all jobs by all users
If for any reason you need to cancel a job, use the scancel
command with the 6- or 7-digit jobid:
$ scancel jobid
For more example scripts, see this directory on Stampede2:
$ ls /share/doc/slurm/
If everything went well, you should have an output file named something similar to vina_job.o864828
in the same directory as the job.slurm
script. And, in the results/
directory, you should have some output:
$ cat vina_job.o864828
# closely examine output
$ ls results
output_ligands.pdbqt

(Output visualized in UCSF Chimera)
Congratulations! You ran a batch job on Stampede2!
Interactive Sessions with idev¶
The idev
utility initiates an interactive session on one or more compute nodes so that you can issue commands and run code as if you were doing so on your personal machine. An interactive session is useful for development, as you can quickly compile, run, and validate code. Accessing a single compute node is accomplished by simply executing idev
on any of the TACC systems.
Initiating an Interactive Session¶
To learn about the command line options available for idev
, use idev -help
.
login1$ idev -help
...
OPTION ARGUMENTS DESCRIPTION
-A account_name sets account name (default: in .idevrc)
-m minutes sets time in minutes (default: 30)
-p queue_name sets queue to named queue (default: -p development)
-r resource_name sets hardware
-t hh:mm:ss sets time to hh:mm:ss (default: 00:30:00)
-help [--help ] displays (this) help message
-v [--version ] output version information and exit
Let’s go over some of the most useful idev
command line options that can customize your interactive session:
To change the time limit to be lesser or greater than the default 30 minutes, users can use the -m
command line option. For example, a user requesting an interactive session for an hour would use the command line option -m 60
.
To change the account_name associated with the interactive session, users can use the -A
command line option. This option is useful for when a user has multiple allocations they belong to. For example, if I have allocations on accounts TACC
and Training
, I can use -A
to set the allocation I want to be used like so: -A TACC
or -A Training
.
To change the queue to be different than the default development
queue, users can use the -p
command line option. For example, if a user wants to launch an interactive session on one of Stampede2’s Skylake
(SKX) compute nodes, they would use the command line option -p skx-dev
or -p skx-normal
. Note that the -p skx-large
queue will be out of scope for most project users. You can learn more about the different queues of Stampede2 here.
Note: For the scope of this section, we will be using the default development
queue.
To start a thirty-minute interactive session on a compute node in the development queue with our TRAINING-HPC
allocation:
login1$ idev -A TRAINING-HPC
If launch is successful, you will see output that includes the following excerpts:
...
-----------------------------------------------------------------
Welcome to the Stampede2 Supercomputer
-----------------------------------------------------------------
...
-> After your idev job begins to run, a command prompt will appear,
-> and you can begin your interactive development session.
-> We will report the job status every 4 seconds: (PD=pending, R=running).
-> job status: PD
-> job status: R
-> Job is now running on masternode= c449-0015...OK
...
c449-0015[knl](268)$
Exercise¶
Let’s revisit the job we ran in the previous section. This time, we will be going through each command we entered into job.slurm
interactively.
c449-0015[knl](268)$ pwd
/home1/03439/wallen/IntroToLinuxHPC/Lab04
c449-0015[knl](269)$ ls
data job.slurm results vina_job.o864828
c449-0015[knl](270)$ echo "starting at:"
starting at:
c449-0015[knl](271)$ date
Mon Jun 29 0X:XX:XX CDT 2020
c449-0015[knl](272)$ module list
Currently Loaded Modules:
# it is okay if you have loaded modules from past sessions
c449-0015[knl](273)$ module load intel/17.0.4
c449-0115[knl](274)$ module load boost/1.64
c449-0115[knl](275)$ module load autodock_vina/1.1.2
c449-0115[knl](276)$ module list
Currently Loaded Modules:
1) intel/17.0.4
2) boost/1.64
3) autodock_vina/1.1.2 #the order in which the modules are listed does not matter
c449-0015[knl](277)$ cd data/
c449-0015[knl](278)$ vina --config configuration_file.txt --out ../results/output_ligands.pdbqt
#################################################################
# If you used AutoDock Vina in your work, please cite: #
# #
# O. Trott, A. J. Olson, #
# AutoDock Vina: improving the speed and accuracy of docking #
# with a new scoring function, efficient optimization and #
# multithreading, Journal of Computational Chemistry 31 (2010) #
# 455-461 #
# #
# DOI 10.1002/jcc.21334 #
# #
# Please see http://vina.scripps.edu for more information. #
#################################################################
Detected 272 CPUs
WARNING: at low exhaustiveness, it may be impossible to utilize all CPUs
Reading input ... done.
Setting up the scoring function ... done.
Analyzing the binding site ... done.
Using random seed: -31156704
Performing search ...
0% 10 20 30 40 50 60 70 80 90 100%
|----|----|----|----|----|----|----|----|----|----|
***************************************************
done.
Refining results ... done.
mode | affinity | dist from best mode
| (kcal/mol) | rmsd l.b.| rmsd u.b.
-----+------------+----------+----------
1 -12.3 0.000 0.000
2 -11.1 1.223 1.866
3 -11.0 3.000 12.459
4 -10.5 2.268 12.434
5 -10.4 2.272 13.237
6 -10.3 3.146 13.666
7 -10.3 3.553 12.345
8 -10.2 1.827 13.667
9 -9.8 2.608 12.630
Writing output ... done.
c449-0015[knl](279)$ echo "ending at:"
c449-0015[knl](280)$ date
Mon Jun 29 0X:XX:XX CDT 2020
To exit an interactive session, you can either use logout
or wait until the connection to the compute node is closed by the remote host.