Getting Started with HPCC

HPC terminology

The HPC terminology regarding nodes, cores, processors and tasks below is taken from the LUNARC Aurora pages on the subject.

Term | Explanation | Fermi
node | a physical computer | 40
processor | a multi-core processor housing many processing elements | 2 per node
socket | plug where the processor is placed, often used as a synonym for processor | 2 per node
core | individual processing element | 6-16 per node
task | a software process with its own data and instructions, which may fork multiple threads | specified in the sbatch script
thread | an instruction stream sharing data with other threads from the same task | specified in the sbatch script

Table 1. Terminology regarding resources as defined in the HPC community

HPC vs XDS terminology

HPC term | XDS terminology
tasks | JOBS (i.e. MAXIMUM_NUMBER_OF_JOBS)
threads | PROCESSORS (i.e. MAXIMUM_NUMBER_OF_PROCESSORS)

Table 2. HPC vs XDS terminology
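
As a rough illustration of how these keywords map onto the hardware (a sketch only, assuming one node with 2 processors of 16 cores each, i.e. 32 cores in total), the XDS.INP settings can be chosen so that JOBS times PROCESSORS does not exceed the number of cores on the node:

    ! illustrative XDS.INP settings, assuming a single 32-core node
    MAXIMUM_NUMBER_OF_JOBS=4        ! 4 parallel tasks (processes)
    MAXIMUM_NUMBER_OF_PROCESSORS=8  ! 8 threads per task, 4 x 8 = 32 cores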

Useful HPC commands

HPC command | Consequence
interactive -N 1 --exclusive -t 00:30:00 -A snic2018-3-251 | Get a 30 min terminal window on a compute node, charged to the project
interactive --nodes=1 --exclusive -t 00:30:00 -A snic2018-3-25 | Get a 30 min terminal window on a compute node, charged to the project
exit | Leave the terminal window and stop charging compute time to the project
squeue -u x_user | Check my jobs running or waiting in the SLURM queue
top | See all jobs running on the current node
top -U username | See my jobs running on the current node
scancel JOBID | Kill my job with JOBID
module load XDSAPP | Load XDSAPP and its dependencies CCP4, PHENIX, XDS, XDSSTAT…
module avail | List the available modules
module purge | Unload all modules

Table 3. Basic HPC commands
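
As an example, a short interactive session built from the commands above might look as follows (the project ID and username are placeholders to be replaced with your own):

    interactive -N 1 --exclusive -t 00:30:00 -A snic2018-3-251   # request a 30 min terminal on a compute node
    module load XDSAPP                                           # load XDSAPP and its dependencies
    top -U username                                              # check my processes running on the node
    module purge                                                 # unload all modules
    exit                                                         # leave the node and stop the compute-time clock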

Tetralith and Aurora differences

Command | Outcome
jobsh n1024 | Access a compute node while it is in use by your job at NSC Tetralith
ssh au118 | Access a compute node while it is in use by your job at Lunarc Aurora

Table 4. Command-line commands differing between Lunarc Aurora and NSC Tetralith

SBATCH and checking the queue

Running sbatch scripts is the most efficient way of using HPC compute time, since the clock counting compute time stops as soon as the job finishes. Every sbatch script requires a maximum run time, e.g. #SBATCH -t 00:30:00, before the job can be scheduled into the queue. To check the status of jobs submitted by sbatch, use squeue -u username to obtain output like that shown in Figure 1.

Figure 1. Output of squeue -u username. The first column gives the job ID, the second the partition (or queue) to which the job was submitted, the third the name of the job (specified by the user in the submission script) and the fourth the owner of the job. The fifth is the status of the job (R=running, PD=pending, CA=cancelled, CF=configuring, CG=completing, CD=completed, F=failed). The sixth column gives the elapsed time of each job. Finally, there are the number of nodes requested and the nodelist where the job is running (or the reason why it is not yet running).
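
For example, assuming a submission script named xds.sh (a placeholder name), a job is submitted and monitored like this:

    sbatch xds.sh        # submit the script to the SLURM queue
    squeue -u x_user     # check whether the job is pending (PD) or running (R)
    scancel JOBID        # if needed, kill the job using the JOBID shown by squeue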

Now it is possible to access the compute nodes and check the job status in more detail using top or top -U username. This is done in two different ways at NSC Tetralith and LUNARC Aurora:

  • At LUNARC Aurora use: ssh au118
  • At NSC Tetralith use: jobsh n1024

Once your terminal window is on the compute node, you can check the status with top

Figure 2. Result of top given at a compute node. In top, the status of a job is indicated by (D=uninterruptible sleep, R=running, S=sleeping, T=traced or stopped, Z=zombie)

or by top -U username

Figure 3. Result of top -U username given at compute node

The interactive command can use the same parameters as the sbatch script lines below.

SBATCH script line | Consequence
#!/bin/sh | Use sh to interpret the script
#SBATCH -t 0:30:00 | Run the script for a maximum of 30 min
#SBATCH --nodes=2 --exclusive | Allocate two full nodes for this script
#SBATCH -A snic2017-1-XXX | Count compute time on project snic2017-1-XXX
#SBATCH --mail-type=ALL | Send email when the job starts and stops
#SBATCH --mail-user=name.surname@lu.se | Send email to name.surname@lu.se

Table 5. SBATCH script lines. The interactive command uses the same options, but they are usually given on a single line at the login node, e.g. interactive --nodes=2 --exclusive -t 00:30:00 -A snic2017-1-XXX
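
Putting the lines from Table 5 together, a minimal sbatch script could look like the sketch below. The project ID and email address are placeholders from Table 5, a single node is requested here rather than two, and the final processing command (here xds_par, the parallel XDS executable) is only an example to be replaced with your own commands:

    #!/bin/sh
    #SBATCH -t 00:30:00
    #SBATCH --nodes=1 --exclusive
    #SBATCH -A snic2017-1-XXX
    #SBATCH --mail-type=ALL
    #SBATCH --mail-user=name.surname@lu.se

    # load XDSAPP and its dependencies in a clean module environment
    module purge
    module load XDSAPP

    # example processing command, replace with your own
    xds_par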

Useful LINUX commands

  • rsync -rvplt ./data username@aurora.lunarc.lu.se:/lunarc/nobackup/users/username/. (copies the data directory to Lunarc)
  • scp file.pdb username@aurora.lunarc.lu.se:/lunarc/nobackup/users/username/. (copies the single file file.pdb to Lunarc)
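
The same commands work in the opposite direction. As a sketch (the username and the data path are placeholders), processed results can be copied back from Lunarc to the current directory on your local machine:

    # copy the remote data directory back to ./data on the local machine
    rsync -rvplt username@aurora.lunarc.lu.se:/lunarc/nobackup/users/username/data/. ./data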