InfiniBand is a high-throughput, low-latency node interconnect that supports Remote Direct Memory Access (RDMA), which can significantly improve the speed of large multi-node parallel jobs.
Apocrita has a number of InfiniBand-enabled nodes, grouped into separate islands based on node type and network switch capacity:
|Nodes||Island name||Infiniband type|
The appropriate island can be selected for parallel jobs by adding the
-l infiniband=<island> parameter to your submission script.
An example of this setting is:
...
#$ -l infiniband=sdv-i
...
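To show where the island request sits in a complete submission script, here is a minimal sketch. Only the `-l infiniband=sdv-i` line comes from the example above. The parallel environment name (`parallel`), the slot count, the runtime limit and the program name `./my_mpi_program` are assumptions for illustration and should be replaced with values appropriate to your job and island.

```shell
#!/bin/bash
#$ -cwd                      # run from the submission directory
#$ -pe parallel 96           # assumed parallel environment name and slot count
#$ -l h_rt=1:0:0             # assumed runtime limit of one hour
#$ -l infiniband=sdv-i       # island request, as in the example above

# Grid Engine sets NSLOTS inside a running job; fall back to 1 elsewhere
# so the script can be dry-run outside the scheduler.
slots="${NSLOTS:-1}"

# Hypothetical MPI program name; replace with your own executable.
launch_cmd="mpirun -np ${slots} ./my_mpi_program"

# Print the command instead of running it, so the sketch is safe to test.
echo "Would launch: ${launch_cmd}"
```

In a real job the final `echo` would be replaced by the launch command itself. Running the script directly with `bash` outside the scheduler only prints the command that would be used, which is a quick way to check the script before submitting it.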
Jobs across islands
Jobs are not scheduled across multiple islands, as this would severely degrade performance.
Larger InfiniBand jobs
If your jobs regularly require hundreds of parallel cores, please enquire about eligibility to use the Tier 2 services designed for larger jobs.