Reading HPC with Python

Last updated: January 4, 2023

Note on notation: expressions between the < and > signs need to be replaced by the relevant information (without those signs).

Whether you are using our training cluster (a virtual machine hosted in the cloud that mimics a Compute Canada cluster) or one of the Compute Canada clusters (e.g. Cedar or Graham), when you ssh into the cluster you "arrive" on the login node.

Do not run anything computationally intensive on this node. To run your code, you need to start an interactive job or submit a batch job to Slurm (the job scheduler used by the Compute Canada clusters).

If you are not familiar with HPC, you can go through the material of the HPC course offered at this summer school, the training resources on the WestGrid website, and the Compute Canada introductory videos.

This lesson goes over how we will run Python in this course on our training cluster. (If you run the code locally on your own computer, you can simply launch Python.)

Plots

Do not run code that displays plots on screen. Instead, write the plots to files.
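For example, with matplotlib (the library and file name here are only an illustration, not something prescribed by this lesson), you can select a non-interactive backend and save the figure to a file:

# Use a non-interactive backend so nothing tries to open a display
import matplotlib
matplotlib.use("Agg")

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])
ax.set_xlabel("x")
ax.set_ylabel("x squared")

# Write the plot to a file instead of displaying it on screen
fig.savefig("plot.png", dpi=150)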

Data

Copy files to/from the cluster

Few files

If you need to copy files to or from the cluster, you can use scp.

From your computer

If you are in a local shell, run:

[local]$ scp </local/path/to/file> <user>@<hostname>:<path/in/cluster>

(Replace <user> with your user name and <hostname> with the cluster hostname; for this workshop, the hostname is uu.c3.ca.)
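For example (the file and user names here are hypothetical), copying a script from your computer to your home directory on the training cluster would look like:

[local]$ scp ./train.py user01@uu.c3.ca:~/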

From the cluster

To copy a file from the cluster to your computer, also run the command from a local shell (not from your ssh session on the cluster):

[local]$ scp <user>@<hostname>:<cluster/path/to/file> </local/path>
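For example, to download a (hypothetical) results file from the cluster into your current local directory:

[local]$ scp user01@uu.c3.ca:results/output.csv .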

Large amounts of data

Use Globus for large data transfers.

Note that Compute Canada plans to host classic ML datasets on its clusters in the future. So if your research uses a common dataset, it may be worth asking whether it is already available before downloading a copy.

Large collections of files

The Compute Canada clusters are optimized for very large files and are slowed down by large collections of small files. Datasets made of many small files need to be turned into single-file archives with tar. Failing to do so will degrade performance not just for you, but for all users of the cluster.

$ tar cf <data>.tar <path/to/dataset/directory>/*

Notes:

  • If you also want to compress the files, replace tar cf with tar czf (see the example below)
  • As a modern alternative to tar, you can use Dar
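For example, assuming your dataset lives in a directory called ~/dataset (a hypothetical name), a compressed archive could be created with:

$ tar czf data.tar.gz ~/dataset/*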

Interactive jobs

Interactive jobs are useful for testing and developing code. They are not, however, an efficient way to run finished code, so you should limit their use to testing and development.

You start an interactive job with:

$ salloc --account=def-<account> --cpus-per-task=<n> --gres=gpu:<n> --mem=<mem> --time=<time>

Our training cluster does not have GPUs, so for this workshop, do not use the --gres=gpu:<n> option.

For the workshop, you also don't have to worry about the --account=def-<account> option (or, if you want, you can use --account=def-sponsor00).

Our training cluster has a total of 60 CPUs on 5 compute nodes. Since there are many of you in this workshop, please be very mindful when running interactive jobs: if you request many CPUs for a long time, the other workshop attendees won't be able to use the cluster until your requested time ends (even if you aren't actually running any code).

Here are my suggestions so that we don't run into this problem:

  • Only start interactive jobs when you need to understand what Python is doing at every step, or to test, explore, and develop code (that is, when an interactive Python shell is really beneficial). Once you have a model, submit a batch job to Slurm instead
  • When running interactive jobs on this training cluster, only request 1 CPU (--cpus-per-task=1)
  • Only request the time that you will actually use (e.g. for the lesson on Python tensors, 30 min to 1 hour seems reasonable); see the example call after this list
  • If you don't need your job allocation anymore before it runs out, you can relinquish it with Ctrl+d
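Putting these suggestions together, an interactive job for this workshop could be started with something like the following (the memory and time values are only reasonable guesses; adjust them to the lesson):

$ salloc --account=def-sponsor00 --cpus-per-task=1 --mem=3500M --time=1:00:00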

Note:
In this workshop, we will submit jobs from /home. This is fine on some of the Compute Canada clusters as well. Be aware, however, that this is not allowed on Cedar: there, you have to submit jobs from /scratch or /project.

Batch jobs

As soon as you have a working Python script, you want to submit a batch job instead of running an interactive job. To do that, you need to write an sbatch script.

Job script

Here is an example script:

#!/bin/bash
#SBATCH --job-name=<name>*            # job name
#SBATCH --account=def-<account>
#SBATCH --time=<time>                 # max walltime in D-HH:MM or HH:MM:SS
#SBATCH --cpus-per-task=<number>      # number of cores
#SBATCH --gres=gpu:<type>:<number>    # type and number of GPU(s) per node
#SBATCH --mem=<mem>                   # max memory (default unit is MB) per node
#SBATCH --output=%x_%j.out*           # file name for the output
#SBATCH --error=%x_%j.err*            # file name for errors
#SBATCH --mail-user=<email_address>*
#SBATCH --mail-type=ALL*

# Load modules
# (Do not use this in our workshop since we aren't using GPUs)
# (Note: loading the Python module is not necessary
# when you activate a Python virtual environment)
# module load cudacore/.10.1.243 cuda/10 cudnn/7.6.5

# Create a variable with the directory for your ML project
SOURCEDIR=~/<path/project/dir>

# Activate your Python virtual environment
source ~/env/bin/activate

# Transfer and extract data to a compute node
mkdir $SLURM_TMPDIR/data
tar xf ~/projects/def-<user>/<data>.tar -C $SLURM_TMPDIR/data

# Run your Python script on the data
python $SOURCEDIR/<script>.py $SLURM_TMPDIR/data

Notes:

  • %x will get replaced by the job name and %j by the job ID
  • If you compressed your data with tar czf , you need to extract it with tar xzf
  • SBATCH options marked with a * are optional
  • There are various other options for email notifications
  • You may wonder why we transferred data to a compute node:

The compute nodes have fast local storage, accessible through $SLURM_TMPDIR, whereas /home, /project, and /scratch are shared network filesystems. Working from the local disk makes any I/O operation involving your data a lot faster, so it will speed up your code.

So first, we create a temporary data directory in $SLURM_TMPDIR:

$ mkdir $SLURM_TMPDIR/data

  The variable $SLURM_TMPDIR is created by Slurm on the compute node where a job is running. Its path is /localscratch/<user>.<jobid>.0. Anything in it gets deleted when the job is done.

Then we extract the data into it:

$ tar xf ~/projects/def-<user>/<data>.tar -C $SLURM_TMPDIR/data

If your data is not in a tar file, you can simply copy it to the compute node running your job:

$ cp -r ~/projects/def-<user>/<data> $SLURM_TMPDIR/data
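To make this more concrete, here is what a filled-in job script for our training cluster might look like (no GPU, the def-sponsor00 account used in this workshop; the job name, resource values, and file paths are hypothetical and should be adapted to your own project):

#!/bin/bash
#SBATCH --job-name=python-test        # hypothetical job name
#SBATCH --account=def-sponsor00
#SBATCH --time=00:30:00               # 30 min of walltime
#SBATCH --cpus-per-task=1             # 1 core
#SBATCH --mem=3500M                   # 3500 MB of memory
#SBATCH --output=%x_%j.out            # e.g. python-test_123.out

# Activate the Python virtual environment
source ~/env/bin/activate

# Transfer and extract the data to the compute node
mkdir $SLURM_TMPDIR/data
tar xf ~/projects/def-sponsor00/data.tar -C $SLURM_TMPDIR/data

# Run the Python script on the data
python ~/ml_project/train.py $SLURM_TMPDIR/data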

Job handling

Submit a job

$ cd </dir/containing/job>
$ sbatch <jobscript>.sh

Check the status of your job(s)

$ sq

(On the Compute Canada clusters, sq is a handy shortcut for Slurm's squeue command.)

PD = pending
R = running
CG = completing (Slurm is wrapping up the job)
No information = your job has finished running

Cancel a job

$ scancel <jobid>

Display efficiency measures of a completed job

$ seff <jobid>
