
Compiling Code

Available compilers

A number of compiler suites are available on the systems:

  • GCC
  • Intel (part of Intel Parallel Studio XE)
  • PGI Community Edition

Modules are available to set up the user environment, giving access to these compilers. One version of the GCC compilers is available without loading a module, although this is typically a much older version than those offered through the module system.

GCC will give reliable results. However, depending on your code and libraries, the Intel or PGI compilers may provide considerable performance improvements.

Compilation should be performed as job submissions or interactively via qlogin in order not to impact the frontend nodes for other users.
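
For example, a short interactive session for compiling might be requested as follows (the resource options shown are illustrative; adjust them to your needs):

qlogin -pe smp 2 -l h_vmem=2G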

This page focuses on the C, C++ and Fortran languages which are the most common compiled languages in use on the cluster.

Loading a compiler module

It is generally a good idea to be specific with your compiler version. Check which modules you have loaded to be sure you have the right compiler and that there are no conflicts. The available compiler versions can be viewed in the devtools section of the output of the module avail command.

Check the available versions for the GCC compiler suite:

$ module avail gcc
gcc/6.3.0          gcc/7.1.0(default)

Intel compiler version 2017.3 can be loaded with the command

module load intel/2017.3

You can test this by typing the command:

icc -V

This should return a short message reporting the compiler version:

Intel(R) C Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 17.0.3.191 Build 20170404
Copyright (C) 1985-2017 Intel Corporation.  All rights reserved.

Often, you will require other libraries and headers that can be found in other modules. Unlike modules providing general programs and tools, these library modules may have versions specific to a particular compiler suite. For example, for OpenMPI:

$ module avail openmpi
openmpi/2.0.2-gcc          openmpi/2.1.0-gcc          openmpi/2.1.1-intel        openmpi/3.0.0-gcc(default)

Modules for use with a single compiler suite have an indicating suffix, such as the -gcc and -intel seen here. For example, to use OpenMPI with the Intel compiler suite we would load modules as:

module load intel/2017.3 openmpi/2.1.1-intel

Again, check your loaded modules with

module list
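
This prints the currently loaded modules; illustrative output with the Intel compiler and matching OpenMPI modules loaded:

Currently Loaded Modulefiles:
  1) intel/2017.3          2) openmpi/2.1.1-intel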

If you don't specify a particular version, the version marked as default in the output of the module avail command will be loaded.

Using the compilers

Each of the compiler suites provides a C, C++ and a Fortran compiler. The name of the compiler command varies with the language and the compiler suite. For convenience the compiler suite modules set consistent environment variables by which the compilers may be referenced. The compiler names and variables are given in the following table:

Language   Variable   GCC        Intel   PGI
C          CC         gcc        icc     pgcc
C++        CXX        g++        icpc    pgc++
Fortran    FC         gfortran   ifort   pgfortran
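
These variables allow compile commands to be written independently of the chosen suite. A minimal sketch, assuming a placeholder source file hello.c:

# With the Intel module loaded, $CC expands to icc
module load intel/2017.3
$CC -O2 -o hello hello.c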

Compilation for specific nodes

Different processors support different instruction sets which may provide a performance boost. During compilation the target instruction sets can be selected via CPU architecture flags. The required flags vary by compiler suite and are detailed below.

Supported Instruction Sets

Some instruction sets are not supported on all nodes, so you may need to compile different binaries for each node type.

Please see the table here for details on supported instruction sets.

Checking available CPU flags on a node

The CPU flags, including details of supported instruction sets, are listed in the file /proc/cpuinfo on each Apocrita node. This file has a line labelled flags.

For example, checking for all available CPU flags and supported instruction sets on an sm node:

[sm0.apocrita ~]$ grep flags /proc/cpuinfo | uniq

Checking for SSE2 instruction set support on an sm node:

[sm0.apocrita ~]$ awk '/flags.*sse2/ {print "sse2 supported";exit}' /proc/cpuinfo
sse2 supported

Checking for AVX2 instruction set support on an sm node:

[sm0.apocrita ~]$ awk '/flags.*avx2/ {print "avx2 supported";exit}' /proc/cpuinfo
[sm0.apocrita ~]$

In this example the CPUs on the sm node do not support the AVX2 extensions: applications compiled to target these extensions may not run on Apocrita sm nodes.

Selecting instruction sets during compilation

The instruction sets which may be targeted by the compiler can be selected using a compile-time flag. When using the GCC compilers (gcc, g++ or gfortran) the flag -march=<cpu_type> targets the instruction set of the given CPU type. When using this option the compiler may generate code which will not run on other CPU types. Notably, the option -march=native targets the instruction set of the CPU type running the compiler. Individual instruction set extensions may be disabled with the option -mno-<extension> (for example, -mno-avx2).

To see what the GCC compiler will do with the -march=native option you can use:

gcc -march=native -Q --help=target

Alternatively, the option -mtune=<cpu_type> asks the compiler to tune the produced code for the given CPU type without restricting the instruction set.
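
As an illustrative sketch (the CPU type and source file named here are examples only):

# Generate code requiring the ivybridge instruction set; it may not run on older CPUs
gcc -O2 -march=ivybridge -o prog prog.c

# Generate generic code, but tune it to run well on ivybridge CPUs
gcc -O2 -mtune=ivybridge -o prog prog.c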

The Intel compilers also accept -march (albeit with different semantics) and provide the flag -xHost, which requests targeting of the highest instruction set available on the CPU on which the compiler runs.
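
For example, to target the CPU type of the build machine with the Intel C compiler (prog.c is a placeholder):

icc -O2 -xHost -o prog prog.c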

Processor incompatibilities with targeted code

Code produced with these options should provide a performance boost, but it is important to note that code optimised for a certain architecture may not run on other nodes, due to AMD/Intel differences, or lack of a certain feature in older processors.

You will need to build the code on the same type of node you will be executing on (via qsub or qlogin session) to use the relevant processor optimisation.

The PGI compilers do not offer a -march option; instead, the option -tp=<target> should be used.
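
For example (the target name and source file are illustrative):

pgcc -tp=haswell -o prog prog.c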

More information on the compiler-specific architecture flags is available in the vendor documentation.

Build systems

Typically, software for Linux comes with a build system of one of two flavours: GNU Makefiles or CMake.

GNU Makefiles are more common. The general steps are as follows:

./configure
make

First one runs the configure script, which creates a Makefile. One then runs the make command, which reads the Makefile and calls the necessary compilers, linkers and so on.
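
On the cluster you will not usually be able to install into system directories, so it is common to pass configure an installation prefix under your home directory (a sketch; whether --prefix is supported depends on the package):

./configure --prefix=$HOME/local
make
make install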

CMake is similar but can generate Makefiles, Visual Studio projects, macOS Xcode projects and more. Such projects can be identified by the presence of a CMakeLists.txt file. For those interested in building their own software, CMake is recommended over GNU Makefiles.

One major advantage of CMake is it allows out-of-source builds. Put another way, one can create a binary and all its associated support files in a directory that is not the same as the one with the source files. This can be quite advantageous when working with a source management tool like Git or SVN.

To work with CMake, start with creating a directory at the same level as the source code. For example...

$ pwd
/data/home/abc123/MySourceCode
$ mkdir MySourceCode_build
$ cd MySourceCode_build
$ cmake ../MySourceCode

Essentially, you enter the build directory and call cmake with the path to the directory containing your CMakeLists.txt file.

If you wish to configure your build interactively, you can use the program ccmake.

The end result, on Linux, is another Makefile. So to complete your build you type:

make

just as you would with the GNU Makefile setup. Under another OS such as Windows or macOS, CMake would create a corresponding build file, such as a Visual Studio project.

To learn more about GNU Makefiles and CMake, follow the links below.

Optional libraries for HPC

MPI

The Message Passing Interface (MPI) is a protocol for parallel computation often used in HPC applications. On Apocrita two distinct implementations are available: IntelMPI and OpenMPI. For general use we recommend IntelMPI where suitable.

The module system allows the user to select the implementation of MPI to be used, and the version. With OpenMPI, as noted above, one must be careful to load a module compatible with the compiler suite being used.

To load the default (usually latest) IntelMPI module:

module load intelmpi

To set up the OpenMPI environment, version 3.0.0, suitable for use with the GCC compiler suite:

module load openmpi/3.0.0-gcc

For each implementation, several versions may be available. The default version is usually set to the latest release: an explicit version number is required to load a different version.

Default module for OpenMPI

The command module load openmpi loads the default module, openmpi/3.0.0-gcc. This default module is specific to the GCC compiler suite, so to access an MPI implementation compatible with a different compiler suite a specific module name must be specified.

To build a program using MPI it is necessary for the compiler and linker to be able to find the header and library files. As a convenience, the MPI environment provides wrapper scripts to the compiler, each of which sets the appropriate flags for the compiler. The name of each wrapper script depends on the implementation and the target compiler.

OpenMPI

For each OpenMPI module, and the implementation provided by the PGI compiler suite module, the wrapper scripts are consistently named for each language. These are given in the table below:

Language   Script
C          mpicc
C++        mpic++
Fortran    mpif90

As an example, a Fortran MPI program may be compiled as

module load openmpi/3.0.0-gcc
mpif90 -o hello hello.f90

rather than requiring the addition of numerous include and linker path flags:

gfortran -o hello hello.f90 -I... -L... -l...

The OpenMPI wrapper scripts provide an option -show which details the final invocation of the compiler:

$ module load openmpi/3.0.0-gcc
$ mpif90 -show -o hello hello.f90
gfortran -o hello hello.f90 ...

No OpenMPI module is provided for use with the PGI compiler suite. Instead, the installed PGI compiler environment provides an OpenMPI implementation and the PGI compiler module contains the appropriate settings:

$ module purge; module load pgi/17.10
$ type mpif90
mpif90 is /share/apps/centos7/pgi/2017-17.10/linux86-64/17.10/mpi/openmpi-2.1.2/bin/mpif90

IntelMPI

In contrast, the IntelMPI implementation supports both the Intel and GCC compiler suites in the same module. As with OpenMPI, wrapper scripts are provided, but their names depend on the target compiler suite as well as the language. The wrapper script names are given in the following table:

Language   Compiler suite   Script
C          GCC              mpicc
C          Intel            mpiicc
C++        GCC              mpic++
C++        Intel            mpiicpc
Fortran    GCC              mpif90
Fortran    Intel            mpiifort

The scripts can be used as in the OpenMPI example above:

$ module load intelmpi
$ mpif90 -show -o hello hello.f90
gfortran -o 'hello' 'hello.f90' ...
$ mpiifort -show -o hello hello.f90
ifort -o 'hello' 'hello.f90' ...

There is no support for the PGI compilers in the IntelMPI implementation.

Compiling and testing

If make succeeds, you should see various compilation commands printed to your screen, invoking the compiler you chose. If compilation completes successfully you should see a success message of some kind, and an executable will appear in your source or build directory.

Quite often, software comes with test programs you can also build and run. Typically, the command to do this looks like the following:

make test

Optimisation

Software optimisation comes in many forms, such as compiler optimisation, using alternate libraries, removing bottlenecks from code, algorithmic improvements, and using parallelisation. Using processor-specific compiler options may reduce universal compatibility of your compiled code, but could yield substantial improvements.

The Intel, PGI and GCC compilers may give different performance depending on the libraries used and the processor optimisations applied. Benchmarking and comparing code compiled with each compiler is recommended.
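
A minimal sketch of such a comparison, assuming a single-file program prog.c and using time as a crude benchmark:

# Build the same source with two compiler suites
module load gcc
gcc -O2 -o prog_gcc prog.c
module load intel
icc -O2 -o prog_intel prog.c

# Compare wall-clock times of the two builds
time ./prog_gcc
time ./prog_intel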

Profiling tools

Once you have a running program that has been tested, there are several tools you can use to check the performance of your code. Some of these you can use on the cluster and some you can use on your own desktop machine.

perf

perf is a tool that creates a log of where your program spends its time. The report can be used as a guide to see where you need to focus your time when optimising code. Once the program has been compiled, it should be run through the record subcommand of perf:

perf record -a -g my_program

where my_program is the name of the program to be profiled. Once the program has run, a log file (perf.data) is generated. This log file may be analysed with the report subcommand of perf. For example, to summarise the recorded samples by command and shared object:

perf report --sort comm,dso

More information on perf can be found at http://www.pixelbeat.org/programming/profiling and in this extensive tutorial.

valgrind

valgrind is a suite of tools that allow you to improve the speed and reduce the memory usage of your programs. An example command would be:

valgrind --tool=memcheck <myprogram>

Valgrind is well suited to multi-threaded applications, but may not be suitable for longer-running applications due to the slowdown incurred by the profiled application. In addition, there is a graphical tool which is not offered on the cluster but will work on Linux desktops. There is also an extensive manual.
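
For example, the callgrind tool in the suite records a call graph with per-function costs, writing a callgrind.out.<pid> file for later inspection:

valgrind --tool=callgrind ./my_program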

cProfile

The above tools work best for compiled binaries. If you are writing code in Python, cProfile is one useful option.
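
For example, a script (my_script.py is a placeholder) can be profiled from the command line, sorting the report by cumulative time:

python -m cProfile -s cumulative my_script.py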