Tier 2 HPC facilities

We have access to a number of Tier 2 clusters, which are among the TOP500 list of the world's most powerful computer systems. If you run multi-node parallel jobs you may benefit from access to these; please contact us to check whether your jobs are appropriate and to arrange access.

Who can access these clusters?

QMUL Academics may apply to use a Tier 2 cluster free of charge if:

  • they are performing predominantly EPSRC-funded research
  • jobs are an appropriate size for the Tier 2 service (the QMUL Apocrita service is sufficient for many users); these will usually be parallel jobs running over multiple nodes
  • jobs are well-tested and known to run successfully
  • the scope and size of the work is stated in advance as part of an application to use the cluster (or by using the initial 50,000 Core Hour allocation to determine specific resource requirements)
  • the work fits the designated application areas for the cluster
  • they notify us if they don't think they will be able to use the Core Hour allocation within the agreed time frame (so that the unused hours can be allocated to other users)
  • they provide a brief description of their research work to be performed on the cluster and agree to our sharing that with consortium partners for reporting purposes

Once your access has been agreed and set up, we recommend connecting to the Tier 2 resources through Apocrita, because direct access from the wider internet may be restricted for some of these resources. If you are familiar with SSH client configuration, the following example may be useful (hostnames and usernames are illustrative):

Host athena
        HostName <athena-login-address>
        User abc123
        ProxyJump abc123@login.hpc.qmul.ac.uk
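
With this configuration, ssh athena will connect via the Apocrita login node (ProxyJump requires OpenSSH 7.3 or later). The hostname placeholder and the abc123 usernames above are illustrative; substitute the Athena address you are given and your own username on each system.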

If you require help porting your code to a Tier 2 cluster, which may use different job scheduling systems or software toolchains from Apocrita, please contact us.

Project allocations

Typically, new projects will be granted 50,000 Core Hours for benchmarking and job sizing. After this, to obtain a resource allocation for your project, you will need to provide a short description of the project, along with job sizes and a commitment to use the resources within the agreed time frame.

QMUL receive an allocation on each cluster to use within a given accounting period; this allocation is divided among the various projects according to their requirements. At the end of each accounting period the balances are reset.

Core Hours

A Core Hour is the amount of work done by a single processor core in one hour. For accounting purposes, you need to calculate the cumulative total over all cores that your job runs on. If your job runs for one hour on ten 24-core nodes, the CPU time used is 240 Core Hours. Part-used nodes are counted as using all of the cores, since jobs are granted exclusive access to nodes.
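
As an illustration, here is a minimal Python sketch of the calculation (the helper name is hypothetical, not part of any cluster tooling):

    import math

    def core_hours(cores_requested, cores_per_node, hours):
        # Jobs are granted exclusive access to nodes, so part-used
        # nodes are charged as if every core were in use.
        nodes = math.ceil(cores_requested / cores_per_node)
        return nodes * cores_per_node * hours

    print(core_hours(240, 24, 1))  # ten full 24-core nodes: 240
    print(core_hours(250, 24, 1))  # rounds up to 11 nodes: 264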

Additional Core Hour allocations can be requested by contacting us with your requirements. Please only request what you will realistically use within the reporting period - we can always top up your allocation later if required.

Young - Hub in Materials and Molecular Modelling

Host Institution    Physical Cores    Nodes    RAM/Node    Scheduler    Max wallclock    Accounting period
UCL                 46,536            582      188GB+      SGE          48hrs            3 months

Young has an optional Hyperthreading feature

Hyperthreading presents two virtual cores per physical core, which some programs can take advantage of. It can be enabled on a per-job basis; the default is one thread per core as normal. See the Young Hyperthreading documentation for further information.

The CPU hour charging model on Young differs from the other Tier 2 clusters: each core hour is charged as two CPU hours (e.g. running a 40-core job for 1 hour requires 80 CPU hours).
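
A short Python sketch of this charging model (hypothetical helper, assuming the factor of two applies uniformly):

    def young_cpu_hours(cores, hours):
        # Each core hour on Young is charged as two CPU hours
        return 2 * cores * hours

    print(young_cpu_hours(40, 1))  # 80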

Young has three types of node:

  • standard nodes
  • high memory nodes
  • large memory nodes

See the Young Node types page for more information about available nodes.

Young is funded by EP/P020194/1 and EP/T022213/1 and is designed for materials and molecular modelling. QMUL receive 10 Million Core Hours for each 3-month accounting period; ensure your request allows for the doubled CPU hour charging model described above.

To acknowledge use of Young, please use the statement provided. Please see the UCL Young Software page for information regarding available software and example job scripts.

Users are given a 250GB quota, which is shared across the home and scratch spaces. Run lquota to display your current disk usage. The maximum job size is 5120 cores; typical jobs use between 2 and 5 nodes.

Jobs on Young do not share nodes

Jobs on Young are allocated whole nodes. If you do not request all the available cores on a node, no other jobs can run on it: the unrequested cores will sit idle and you may still be charged for the entire node.

Young nodes are diskless

Young nodes have no local hard drives, meaning there is no $TMPDIR available. Do not request -l tmpfs=XG in your job scripts, or your job will be rejected at submission time.

To request an account on Young, please contact us and provide the following information:

  • First name
  • Surname
  • QMUL username
  • Public SSH key (not the private key)
  • Software Required
  • Short description of research goals using Young

A public SSH key is required because Young does not accept password logins. We request that you create a new SSH key-pair for Young rather than re-use any keys used to access Apocrita. Instructions for generating an SSH key-pair are available here.

Athena - HPC Midlands Plus

Host Institution    Cores     Nodes    RAM/Node    Scheduler    Max wallclock    Accounting period
Loughborough        14,336    512      128GB       Slurm        100hrs           6 months

Athena is funded by EP/P020232/1. To acknowledge use of Athena, please use the statement provided.

QMUL receive 7.4 Million Core Hours for each 6-month accounting period on Athena.

To apply for an account on Athena, please use the SAFE system. Once you have requested an account, please contact us with a brief description of the work you will perform using Athena.

Documentation is available on the Midlands Plus pages, including a Quick Start guide.

JADE - Joint Academic Data science Endeavour

Host Institution    Nodes              GPUs per node    Scheduler    EPSRC Grant
Oxford              22 Nvidia DGX-1    8 Nvidia V100    Slurm        EP/P020275/1

JADE is a GPU cluster designed for machine learning and molecular dynamics applications. The Nvidia DGX-1 nodes run optimised versions of Caffe, Tensorflow, Theano and Torch for machine learning. More information is available in the JADE documentation.

To request an account on JADE, please create an account providing an institutional email address and the public portion of your SSH key. An email will be sent to you containing a password for SAFE, which needs to be changed when you first log in. Once you have signed up, log in to SAFE and click "Request Join Project". From the drop-down list choose JAD007, enter the Project signup password (for this Project, jadqmul17) and click "Request".

Before we can allocate your initial default resource allocation, please contact us with a brief description of the work you will perform using JADE.

To acknowledge use of JADE, please use a statement like the following:

This project made use of time on Tier 2 HPC facility JADE, funded by
EPSRC (EP/P020275/1).