Access to the training cluster
This course runs on a training cluster, which you access with the secure shell (ssh) command.
Open a terminal emulator:
Windows: MobaXterm (install the free version)
Linux: xterm or the terminal emulator of your choice
In it, using the user name and password that you received in our first Zoom session, type:
$ ssh firstname.lastname@example.org # enter password (it is blind typing, so you won't see it as you type)
You are now in our training cluster.
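If you are unsure whether the login succeeded, one quick check is to print the current user and host names. The following is a minimal sketch; the helper name where_am_i is hypothetical and not part of the course material. On the cluster, the output should show the user name you were given rather than the one on your own machine.

```shell
# Hypothetical helper: print "user@host" for the current session.
where_am_i() {
    # Fall back to the numeric user ID if the user name cannot be resolved.
    user=$(id -un 2>/dev/null || id -u)
    host=$(uname -n)
    echo "${user}@${host}"
}

where_am_i
```

Running the same function before and after the ssh command makes the change of machine easy to spot.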
Load necessary modules
To use Python for ML on the cluster, you will need to load the relevant modules.
# Get help on the module command
$ module help
# List modules that are already loaded
$ module list
# See which modules are available for Python
$ module avail python
# Load the module for Python version 3.8.2
$ module load python/3.8.2
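After loading the module, it can be worth confirming that the interpreter on your PATH is the version you expect. The sketch below assumes that the loaded module provides a command called python; the helper name parse_python_version is hypothetical.

```shell
# Hypothetical helper: extract "3.8.2" from a string such as "Python 3.8.2".
parse_python_version() {
    printf '%s\n' "${1#Python }"
}

expected="3.8.2"

if command -v python >/dev/null 2>&1; then
    # Older interpreters print the version on stderr, hence 2>&1.
    found=$(parse_python_version "$(python --version 2>&1)")
    if [ "$found" = "$expected" ]; then
        echo "Python $found is loaded, as expected."
    else
        echo "Found Python $found, expected $expected. Check 'module list'."
    fi
else
    echo "No python command on PATH. Has the module been loaded?"
fi
```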
At this point, we do not have GPUs in our training cluster (we will soon!). For this course, the Python module is thus the only one you need to load. When working on the Compute Canada clusters, however, you will also need to load the CUDA modules in order to use GPUs, including cuDNN (the NVIDIA CUDA Deep Neural Network library, a GPU-accelerated library of primitives for deep neural networks) and possibly others, depending on which module you are loading.
Install the necessary Python wheels in a virtual environment
You also need Python packages.
For this, create a virtual environment in which you will install packages with pip.
Do not use Anaconda
While Anaconda is a great tool on personal computers, it is not an appropriate tool when working on the Compute Canada clusters: its binaries are not optimized for those clusters and its library paths are inconsistent with their architecture. Anaconda installs packages in your home directory, where it creates a very large number of small files. It can also create conflicts by modifying your shell startup file.
Create a virtual environment:
$ virtualenv --no-download ~/env
Activate your virtual environment:
$ source ~/env/bin/activate
Upgrade pip inside the virtual environment:
(env) $ pip install --no-index --upgrade pip
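Besides the (env) prefix in the prompt, you can check whether a virtual environment is active by looking at the VIRTUAL_ENV variable, which the activate script sets and deactivate unsets. A minimal sketch, with a hypothetical helper name venv_state:

```shell
# Hypothetical helper: report whether a virtual environment is active.
venv_state() {
    if [ -n "${VIRTUAL_ENV:-}" ]; then
        echo "active: $VIRTUAL_ENV"
    else
        echo "inactive"
    fi
}

venv_state
```

If this prints "inactive" when you expected otherwise, source the activate script again before installing anything with pip.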
Install the packages you need in the virtual environment:
(env) $ pip install --no-index matplotlib torch torchvision tensorboard
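Once the installation finishes, a simple way to verify it is to try importing each package. The sketch below assumes that python inside the activated environment is the interpreter you want to test; the helper name check_imports is hypothetical.

```shell
# Hypothetical helper: print the modules that the given interpreter cannot import.
# Usage: check_imports <python-executable> <module>...
check_imports() {
    py="$1"
    shift
    missing=""
    for mod in "$@"; do
        # Record the module name if the import fails for any reason.
        "$py" -c "import $mod" >/dev/null 2>&1 || missing="$missing $mod"
    done
    # Strip the leading space before printing.
    echo "${missing# }"
}

not_found=$(check_imports python matplotlib torch torchvision tensorboard)

if [ -z "$not_found" ]; then
    echo "All packages import correctly."
else
    echo "Could not import: $not_found"
fi
```

An empty result only shows that the imports succeed; it does not check package versions.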
If you want to exit the virtual environment, run:
(env) $ deactivate
If you have issues accessing the training cluster or installing the Python packages in a virtual environment, please join the debug session, where we will help you get up and running.