
Using GPUs

The nxg and sbg nodes contain GPU cards that can provide huge acceleration for certain types of parallel computing tasks, via the CUDA and OpenCL frameworks.

Access to GPU nodes

Access to GPU nodes is available for free to QM Researchers. Please contact us if you would like to use these nodes so we can add you to the allowed user list and help you get started with your initial GPU job submission. Note that access to GPU nodes is not permitted for Undergraduate and MSc students, or external users.

Applications with GPU support

There are a considerable number of scientific and analytical applications with GPU support. While some, such as Matlab and Ansys, have GPU support out of the box, others may require specific GPU-ready builds; these may appear in the module avail list with a -gpu suffix. If you require GPU support to be added to a specific application, please submit a request for a GPU build and provide some test data.

Be aware that not every GPU-capable application will run faster on a GPU for your code. For example, CP2K only has a GPU port of the DBCSR sparse matrix library. If you are not using this library in your code then you will not experience a performance boost.

Submitting jobs to GPU nodes

To request a GPU, add the -l gpu=<count> option to your job submission and the scheduler will automatically select a GPU node. Note that requests are handled per node, so a request for 64 cores and 2 GPUs will result in 4 GPUs in total across two nodes. Examples are shown below.
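
For example, a minimal sketch of a command-line submission requesting one GPU plus the recommended 8 cores (my_job.sh is a placeholder for your own job script, and the runtime is illustrative):

# Request 1 GPU, 8 cores and 7.5G per core directly on the qsub command line
qsub -pe smp 8 -l h_vmem=7.5G -l h_rt=1:0:0 -l gpu=1 my_job.sh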

Selecting a specific GPU type

For compatibility reasons, you may optionally require a specific GPU type. For example, CUDA version 8 predates the V100 GPU and is not supported on it, so -l gpu_type=kepler would select nodes using the older K80 GPU instead. Conversely, nodes with the Volta V100 GPU may be selected with -l gpu_type=volta, and Ampere A100 nodes may be selected with -l gpu_type=ampere.
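
For example, a minimal sketch of the resource requests for a single-GPU job restricted to Volta nodes (the runtime value is illustrative only):

#$ -pe smp 8          # 8 cores per GPU
#$ -l h_rt=1:0:0      # illustrative runtime
#$ -l h_vmem=7.5G     # 7.5 * 8 = 60G
#$ -l gpu=1           # request 1 GPU per host
#$ -l gpu_type=volta  # only run on nodes with V100 GPUs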

GPU card allocation

Do not set the CUDA_VISIBLE_DEVICES variable

For reasons documented below, please do not set the CUDA_VISIBLE_DEVICES variable in your job scripts.

Requesting cards with parallel PE

If using the parallel parallel environment, node requests will be exclusive; please ensure that you set the slot and gpu counts correctly to fill the node, as in the sketch below.
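
A minimal sketch, assuming GPU nodes with 32 cores and 4 GPUs each as in the examples further below; adjust the slot and gpu counts to match the nodes you are targeting:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe parallel 64  # 64 slots = two whole 32-core GPU nodes (exclusive)
#$ -l h_rt=24:0:0   # illustrative runtime
#$ -l gpu=4         # 4 GPUs per host, so 8 GPUs in total

./run_code.sh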

We have enabled GPU device cgroups (Linux Control Groups) across all GPU nodes on Apocrita, which means your job will only be presented with the GPU cards that the scheduler has allocated to it. This prevents applications from attaching to GPUs which have not been allocated to the job.

Previously, it was required to set the CUDA_VISIBLE_DEVICES variable in job scripts to ensure the correct GPU is used in the job. However, this was a workaround until the GPU device cgroups were applied. You should no longer set this variable in your job scripts.

Inside your job, the GPU cards presented to your job will always be numbered from device 0 upwards, according to how many GPU cards you have requested, regardless of which physical cards were allocated. The table below shows the devices presented to jobs for each GPU resource request:

GPUs Requested | Devices Presented
1              | 0
2              | 0, 1
3              | 0, 1, 2
4              | 0, 1, 2, 3
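
For example, running nvidia-smi -L from inside a job is a quick way to confirm the devices presented to it (a sketch; the cards listed depend on your request):

# List the GPU devices visible to the job; numbering always starts at 0,
# regardless of which physical cards the scheduler allocated.
nvidia-smi -L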

Checking GPU usage

GPU usage can be checked with the nvidia-smi command, e.g.:

$ nvidia-smi -l 1
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  On   | 00000000:06:00.0 Off |                    0 |
| N/A   70C    P0   223W / 250W |  12921MiB / 16160MiB |     97%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  On   | 00000000:2F:00.0 Off |                    0 |
| N/A   30C    P0    23W / 250W |      4MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   2  Tesla V100-PCIE...  On   | 00000000:86:00.0 Off |                    0 |
| N/A   30C    P0    23W / 250W |      6MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   3  Tesla V100-PCIE...  On   | 00000000:D8:00.0 Off |                    0 |
| N/A   31C    P0    23W / 250W |      6MiB / 32510MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1557      C   python                          12915MiB |
+-----------------------------------------------------------------------------+

In this example, we can see that a process is using GPU 0. The -l 1 option tells nvidia-smi to refresh the output every second. If this is run inside a job, GPU 0 would be the card you have been allocated, which might not be system device 0.

If you SSH into a GPU node and run nvidia-smi, you will see all system GPU devices by their real ID, rather than the allocated device number. Similarly, the SGE_HGR_gpu environment variable inside jobs and the qstat -j JOB_ID command will also show the actual GPU device granted.
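
As an illustrative sketch, the physical device(s) granted to a job can be checked as follows (JOB_ID is a placeholder):

# Inside the job: the scheduler records the granted GPU device(s) here
echo "$SGE_HGR_gpu"

# From a login node: the granted devices also appear in the job details
qstat -j JOB_ID | grep -i gpu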

Example job submissions

The following examples show the basic outline of job scripts for GPU nodes. Note that while the general rule for compute nodes is to request only the cores and RAM you will actually be using, GPU jobs are GPU-centric: request only the GPUs you will be using, and select 8 cores per GPU and 7.5GB per core. You can increase this to 11GB per core if you are sure that your job will not use an nxg node (e.g. by using -l gpu_type='volta|ampere'), as nxg nodes have less RAM than sbg nodes. More detailed examples can also be found on the application-specific pages on this site (e.g. TensorFlow).

h_vmem does not need to account for GPU RAM

The h_vmem request only refers to system RAM and does not need to account for GPU RAM used. The full GPU RAM is automatically granted when you request a GPU.

Request one GPU

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 8        # 8 cores (32 per gpu node)
#$ -l h_rt=240:0:0  # 240 hours runtime
#$ -l h_vmem=7.5G   # 7.5 * 8 = 60G
#$ -l gpu=1         # request 1 GPU per host

./run_code.sh

Request two GPUs on the same box

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 16       # 16 cores (32 per gpu node)
#$ -l h_rt=240:0:0  # 240 hours runtime
#$ -l h_vmem=7.5G   # 7.5 * 16 = 120G
#$ -l gpu=2         # request 2 GPUs per host

./run_code.sh

Request all GPUs on a node with 4 GPUs

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 32       # 32 cores (32 per gpu node)
#$ -l h_rt=240:0:0  # 240 hours runtime
#$ -l gpu=4         # request 4 GPUs per host
#$ -l exclusive     # request exclusive access to the node

./run_code.sh

If you are requesting all GPUs on a node, then choosing exclusive mode will give you access to all of the resources. Note that requesting a whole node will likely result in a long queueing time, unless you have access to an "owned" node that your research group has purchased.
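
As noted above, the per-core memory request can be raised to 11GB if you are sure the job will not use an nxg node; a sketch of a single-GPU request restricted to V100/A100 nodes:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 8                   # 8 cores per GPU
#$ -l h_rt=240:0:0             # 240 hours runtime
#$ -l h_vmem=11G               # 11 * 8 = 88G (avoids nxg nodes, which have less RAM)
#$ -l gpu=1                    # request 1 GPU per host
#$ -l gpu_type='volta|ampere'  # V100 or A100 nodes only

./run_code.sh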

Submitting jobs to an owned node

If your research group has purchased a node, the scheduler's default action is to check both the main GPU nodes and any owned GPU nodes for available slots. If you want to restrict your job to your owned nodes only (e.g. for performance, or to ensure consistency), then adding:

#$ -l owned

to the resource request section at the top of your job script will restrict the job to running on owned nodes only.
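
For example, a sketch of a single-GPU job restricted to your group's owned nodes (core, memory and runtime values are illustrative):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 8        # 8 cores per GPU
#$ -l h_rt=240:0:0  # 240 hours runtime
#$ -l h_vmem=7.5G   # 7.5 * 8 = 60G
#$ -l gpu=1         # request 1 GPU per host
#$ -l owned         # run only on nodes owned by your research group

./run_code.sh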

Getting help

If you are unsure about how to configure your GPU jobs, please contact us for assistance.