Containers

Linux containers are self-contained execution environments that share the Linux kernel with the host, but have isolated resources for CPU, I/O, memory, etc. A container can run a completely different Linux environment, without the overhead of a virtual machine.

Benefits of containers

  • Reproducible science - containers can include an application and its dependencies.
  • Version independent - run code designed for other versions of Linux, e.g. safely run legacy code.
  • Self-contained - allow isolation of complicated application installs.

Singularity

Singularity is a container solution designed for HPC. Unlike other container solutions such as Docker, it allows utilisation of GPUs and InfiniBand interconnects for MPI jobs, and does not allow privilege escalation within a container, which would compromise security in a multi-user environment with a shared filesystem.

After researching the options, and running a pilot phase with users, Singularity was the clear choice for the Apocrita HPC cluster.

Using Singularity

To use a singularity container, first load the module:

module load singularity

You can use containers provided by the Research ITS team, or build your own.


Building your own containers

Root privileges required for bootstrap step

Note that the bootstrap part of the build process requires elevated privileges, and therefore it is not possible to run this step on Apocrita.

Alternatively, you can:

  • create the container on your own machine and copy it to Apocrita
  • create the container using a virtual machine, e.g. by provisioning a Singularity VirtualBox VM using Vagrant
  • submit a container definition file for the research support team to provision for you
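As a rough sketch of the virtual machine route, the following Vagrant workflow could be used. The box name is an illustrative assumption, not a recommendation; any box in which you can install Singularity and gain root access will do.

```shell
# Illustrative only: box name and workflow are assumptions, not Apocrita policy
vagrant init ubuntu/xenial64   # create a Vagrantfile for an example box
vagrant up                     # boot the VM
vagrant ssh                    # log in; inside the VM, install Singularity,
                               # bootstrap your image as root, then copy the
                               # resulting .img file to Apocrita with scp
```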

One primary task per container

HPC containers are designed to perform one primary task, and should consist of a main application and its dependencies, in a similar way to how module files are provided. Since containers are lightweight, use a separate container for each application rather than a single general-purpose container bundling a collection of applications. This improves supportability, performance and reproducibility.

How to build a container

Using the following basic Python 3 definition file, which we will call centos7-python-3.4.el7.def:

BootStrap: yum
OSVersion: 7
MirrorURL: https://www.mirrorservice.org/sites/mirror.centos.org/%{OSVERSION}/os/$basearch/
UpdateURL: https://www.mirrorservice.org/sites/mirror.centos.org/%{OSVERSION}/updates/$basearch/
Include: yum

%post
    yum -y install epel-release
    yum -y install python34

%runscript
    python3.4 $@

We first create an image:

singularity create centos7-python-3.4.el7.img

The create command without any options makes a default image of 768MB. This can subsequently be expanded with the singularity expand command, or you can choose a larger initial size with the -s option to the create command.
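For example, assuming Singularity 2.x syntax, a larger image could be created, or an existing one grown (the size shown is illustrative):

```shell
# Create an image with a 2048 MiB initial size instead of the default 768 MiB
singularity create -s 2048 centos7-python-3.4.el7.img

# Grow an existing image (expands by a default amount unless -s is given)
singularity expand centos7-python-3.4.el7.img
```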

Then we bootstrap the image (this step requires root privileges):

sudo singularity bootstrap ./centos7-python-3.4.el7.img \
./centos7-python-3.4.el7.def

This will result in an image that can then be used. Bear in mind that if you want a specific version of a package from a repository, that package may not be available in the future, so where possible, try to future-proof your containers.

Future-proofing your containers

When building your own containers, be sure to make them portable and future-proof.

  • Consider whether the container will still build if the OS release version changes.
  • Don't rely on copying files from a working directory as part of setup.
  • Perform all setup as part of the bootstrap process. If any manual steps are performed after the container is built, they should be integrated within the definition file, and the container rebuilt.
  • Consider if your ability to rebuild your container will be impacted by package updates, or deprecation of old releases.

Legacy versions of CentOS applications

Outdated minor CentOS releases are moved from the main CentOS servers to vault.centos.org. If you need to use a specific operating system or application version other than the latest, future-proof your container by using the CentOS vault.

Example definition file using the CentOS vault for CentOS 7.2.1511 and ImageMagick-6.7.8.9-15:

BootStrap: yum
OSVersion: 7.2.1511
MirrorURL: http://vault.centos.org/%{OSVERSION}/os/$basearch
UpdateURL: http://vault.centos.org/%{OSVERSION}/updates/$basearch/
Include: yum

%post
    yum -y install ImageMagick-6.7.8.9-15.el7_2.x86_64
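As a sketch, the vault-based definition above could be built and checked as follows. The file names are illustrative, and the bootstrap step still requires root:

```shell
# Build the image from the vault-based definition (names are examples)
singularity create imagemagick.el7.img
sudo singularity bootstrap ./imagemagick.el7.img ./imagemagick.el7.def

# Confirm the pinned ImageMagick version was installed
singularity exec ./imagemagick.el7.img convert --version
```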

Runscripts

Singularity containers can be configured to run a command or script with the %runscript directive. This is configured during the bootstrap process, and creates a file called /singularity inside the container, allowing you to execute the container and pass your own parameters.

Note

Remember to include $@ in the runscript command in your definition file to pass the parameters to the application inside your container, as shown in the above example.

$  ./centos7-python-3.4.el7.img
Python 3.4.3 (default, Aug  9 2016, 17:10:39)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
$  ./centos7-python-3.4.el7.img ./hello_world.py
Hello, World!

You can also use the singularity run <container> command.

Test scripts

Adding a %test section at the end of your definition file allows you to automatically run a test as part of the bootstrap process. If you are building on different hardware where the tests may fail, you can disable the %test section by passing the --notest option to the bootstrap command.
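As a sketch, a %test section for the Python 3 container above might look like this:

```
%test
    python3.4 --version
    python3.4 -c 'print("self-test OK")'
```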

Other bootstrap options are documented on the Singularity web site.

Definition file repository

Research IT maintain a repository of the Singularity definition files used to create the official containers, plus others used for testing. If you maintain your own container definition files, we encourage you to store them in a version control system such as the QMUL GitHub.


Using containers

Running commands inside a container

The exec command will allow you to execute any program within a given container. The run command performs the action defined by the %runscript section, which is the primary task of the container. Using the run command is the simpler approach for job submissions.
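For example, using the Python 3 container built earlier, these two commands are equivalent, because the runscript passes its arguments to python3.4:

```shell
# run invokes the container's %runscript
singularity run ./centos7-python-3.4.el7.img hello_world.py

# exec runs an explicitly named program inside the container
singularity exec ./centos7-python-3.4.el7.img python3.4 hello_world.py
```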

If you have the singularity module loaded, you can even "execute" a container directly, which performs the same action as the singularity run command.

In this example we have a container called pandoc.img, and a symbolic link called pandoc:

$ module load singularity
$ ls -lo pandoc
lrwxrwxrwx 1 user 10 Jun  9 11:19 pandoc -> pandoc.img
$ singularity inspect -r pandoc
#!/bin/sh
    pandoc $@
$ ./pandoc --version | head -1
pandoc 1.16.0.2

Execute a script from outside the container

Using on the cluster

For typical use, especially when submitting work via Grid Engine, use runscripts or exec commands.

The following example runs a Python script from the current directory.

$ singularity exec ./centos7-python-3.4.el7.img python3.4 hello_world.py
Hello, World!

Note that home directories and current directories are seamlessly available from within the container.
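If you need a directory that is not bound by default, Singularity 2.x supports bind mounts with the -B option, subject to site configuration; the paths below are illustrative:

```shell
# Bind a host directory into the container at the same path
singularity exec -B /scratch ./centos7-python-3.4.el7.img ls /scratch

# Bind a host directory to a different path inside the container (host:container)
singularity exec -B /scratch:/mnt ./centos7-python-3.4.el7.img ls /mnt
```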

Using with Grid Engine

One of the major benefits of Singularity is the simplicity with which it can be used in an HPC environment. Your Grid Engine qsub file needs only to load the singularity module and you can run your container. Bear in mind that RAM usage and runtime might be slightly higher than for native code.

Simple example

Example:

#$ -cwd
#$ -S /bin/bash
#$ -j y
#$ -pe smp 1
#$ -l h_vmem=2G

### create a PDF from a markdown file called example.md
### using pandoc inside a container
module load singularity
singularity run /data/containers/pandoc.img example.md -o output.pdf

More complex example

First, a container was built to run CentOS 6 with Python 2.7. CentOS 6 ships with Python 2.6.6, and Python 2.7 is only available via the CentOS Software Collections (SCL). To run these different versions we need to use a special command, scl enable.

Example container definition file, centos6.def

BootStrap: yum
OSVersion: 6
MirrorURL: https://www.mirrorservice.org/sites/mirror.centos.org/%{OSVERSION}/os/$basearch/
UpdateURL: https://www.mirrorservice.org/sites/mirror.centos.org/%{OSVERSION}/updates/$basearch/
Include: yum

%post
    yum -y install centos-release-SCL
    yum -y install python27

%runscript
    scl enable python27 "python2.7 $@"

After creating the container, we can use it in a scheduler job, using the following submission script:

#!/bin/bash
#$ -cwd
#$ -l h_rt=12:0:0
#$ -l h_vmem=4G
module load singularity
singularity exec centos6.img scl enable \
python27 "python python_prime.py"

Alternatively, because of our runscript, we could simply have used singularity run centos6.img python_prime.py.

Shell access to the container

We can open an interactive shell within the container. This can be useful for debugging container bootstraps or checking how a container is built.

$ singularity shell ./centos7-python-3.4.el7.img
Singularity: Invoking an interactive shell within container...

Singularity.centos7-python-3.4.el7.img> python3.4

Python 3.4.3 (default, Aug  9 2016, 17:10:39)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

Officially supported containers

Officially supported containers are stored in /data/containers and are supported in a similar way to the globally available supported applications. For some applications provided as both a module file and a container, you may not even realise you are using a container, as use of the application is seamless.

Running containers from external sources

Note

For security and reproducibility, we advise against relying on externally-sourced containers for performing your research.

Containers created elsewhere can be copied or imported, and run on the cluster. In addition to copying container images manually, you can import or run remote containers as a proof of concept.

Singularity Hub will build a container from a GitHub repository and allows you to pull the image, or run, exec and shell into it.

In this example, a GitHub repository containing a Singularity definition file has been added to Singularity Hub, and a container has automatically been built. This allows rapid prototyping and development of containers.

singularity exec shub://sbutcher/container-R Rscript test.R

Documentation is available on the Singularity Hub Wiki.

You can also import Docker images via the Remote Docker API, for ad-hoc development and testing of your containers only (do not use these for actual research). For example, to import the container tagged latest from Ubuntu on Docker Hub:

singularity create ubuntu.img
singularity import ubuntu.img docker://ubuntu:latest
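A quick sanity check of the imported image, assuming the commands above succeeded:

```shell
# Confirm the image contains the expected distribution
singularity exec ubuntu.img cat /etc/os-release

# Or explore interactively
singularity shell ubuntu.img
```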

Inspecting a container

You can find out how a container was created with the inspect command. The -d option shows the definition file used to create the container.

$ singularity inspect -d /data/containers/pandoc.img
BootStrap: debootstrap
OSVersion: xenial
MirrorURL: http://archive.ubuntu.com/ubuntu/

%post
    echo "Hello from inside the container"
    sed -i 's/$/ universe/' /etc/apt/sources.list
    apt-get update
    apt-get install --yes wget pandoc texlive \
    texlive-latex-recommended texlive-xetex texlive-luatex pandoc-citeproc
    apt-get clean

%runscript
    pandoc $@

%test
    pandoc --version

The singularity inspect help command provides additional options for inspecting the container.

Known Issues

Bootstrapping CentOS on Ubuntu

Trying to bootstrap a CentOS container on an Ubuntu machine fails with the error "RPM database is using a weird path".

To fix this, create a file /root/.rpmmacros containing the following, then try again:

%_var /var
%_dbpath %{_var}/lib/rpm
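As root on the Ubuntu build machine, the file can be written in one step, for example:

```shell
# Write the rpm macros file (run as root on the Ubuntu host)
cat > /root/.rpmmacros <<'EOF'
%_var /var
%_dbpath %{_var}/lib/rpm
EOF
```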

References

Singularity Documentation

Singularity Hub

Running singularity help (and nested commands such as singularity bootstrap help) also provides additional help.