Miniforge¶
Miniforge is the preferred installation method for Mamba and conda-forge and
includes conda and mamba, along with their dependencies. It contains a number
of useful packages which make it popular in fields like data science, machine
learning and scientific computing.
Anaconda and Miniconda are no longer available on Apocrita due to licensing issues.
Miniforge is available as a module on Apocrita.
Usage¶
Loading the module¶
Python distribution module file conflicts
To prevent errors when running the Python interpreter, loading an additional Python distribution module after a Miniforge module has been loaded is blocked and produces a module load conflict error.
To run the default installed version of Miniforge, simply load the miniforge
module:
module load miniforge
You can then check your python version:
$ which python
/share/apps/rocky9/general/apps/miniforge/24.7.1/bin/python
$ python -V
Python 3.12.6
$OMP_NUM_THREADS
The variable OMP_NUM_THREADS
is set to the value of NSLOTS in serial jobs if the variable does not
already exist in the environment. You will see a confirmation message when
the module is loaded.
Conda package manager¶
In addition to the Conda package manager, Miniforge provides access to pip.
Mixing conda/mamba install and pip install
When using Conda, it is best to stick to only using conda install or
mamba install for all packages. Try not to mix conda/mamba install
and pip install if possible.
If you have no choice but to mix the two, please create your Conda environment specifying a version of Python, e.g.:
mamba create --quiet --yes --name myenv python=3.11.0
This ensures that your environment has Python set up correctly and that any
pip install commands write packages inside your activated
Conda environment. If you skip this step, packages may end up in
${HOME}/.local/lib/<python version>, which is highly likely to irreparably
break all personal Python environments, as well as stop
Jupyter Open OnDemand sessions from being able to
load.
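One way to confirm this before running pip install is to check that the python on your PATH lives inside the active environment. The sketch below assumes the CONDA_PREFIX variable, which conda and mamba set when an environment is activated; the helper name python_in_env is hypothetical and is not part of Miniforge.

```shell
# Return 0 if the given python path lives inside the given environment prefix.
# (Hypothetical helper for illustration only.)
python_in_env() {
    local python_path="$1" env_prefix="$2"
    # Stripping "<prefix>/" from the front only changes the string
    # when the path really starts with that prefix.
    [ -n "$env_prefix" ] && [ "${python_path#"$env_prefix"/}" != "$python_path" ]
}

# Typical use after `mamba activate myenv`:
if python_in_env "$(command -v python || true)" "${CONDA_PREFIX:-}"; then
    echo "python comes from the active environment"
else
    echo "python does NOT come from the active environment; avoid pip install here"
fi
```

If the second message appears, recreating the environment with an explicit python=... specification, as shown above, is likely the simplest fix.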
Environments¶
When working with Miniforge you'll probably want to update installed packages as well as install some new ones. Since Miniforge is installed to shared storage where you don't have write access, you'll need to create Conda environments in a location where you do have write access (for example scratch directories or lab shares). You can use your home directory for this, but beware that storage there is limited.
Because Miniforge provides the Python interpreter as well as Conda, you can also use it to create Python virtualenv environments.
Listing environments¶
You can list existing Conda environments as follows:
$ mamba env list
# conda environments:
#
base /share/apps/rocky9/general/apps/miniforge/24.7.1
Let's set up a new location for environments. We do this by editing the
.condarc file in your home directory:
Do not use the defaults channel
Previous documentation may have led you to use the defaults channel in
your ~/.condarc file. This is actively discouraged in the
official Mamba documentation.
Instead, you should use only nodefaults, which will disable the
defaults channel and use only the conda-forge channel. Please do not
add other options such as channel_priority: flexible or
auto_activate_base: either.
$ cat ~/.condarc
channels:
- nodefaults
ssl_verify: true
## Optional - store Conda environments in an alternative location
envs_dirs:
- /data/scratch/abc123/anaconda/envs
pkgs_dirs:
- /data/scratch/abc123/anaconda/pkgs
Scratch space is auto-deleted
In the example above, environments are stored on scratch, which is auto-deleted periodically.
If you know you need to keep your environments for longer than the auto-deletion period, you should store them somewhere like your home directory (being mindful that capacity there is limited) or, if available to you, a Research Group storage space.
Here we have specified /data/scratch/abc123/anaconda as a location to install
to.
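If you would rather generate this file than edit it by hand, the configuration above can be written with a small shell function. This is a minimal sketch: the function name write_condarc is hypothetical, abc123 is the placeholder username used on this page, and the function deliberately refuses to overwrite an existing file so that a current ~/.condarc is never lost.

```shell
# Write a minimal .condarc to $1, storing environments and packages under $2.
# (Hypothetical helper for illustration only.)
write_condarc() {
    local target="$1" base_dir="$2"
    if [ -e "$target" ]; then
        echo "refusing to overwrite existing ${target}" >&2
        return 1
    fi
    cat > "$target" <<EOF
channels:
- nodefaults
ssl_verify: true
envs_dirs:
- ${base_dir}/envs
pkgs_dirs:
- ${base_dir}/pkgs
EOF
}

# Example call (replace abc123 with your own username):
# write_condarc "${HOME}/.condarc" /data/scratch/abc123/anaconda
```

Afterwards, cat ~/.condarc should show the same content as the listing above.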
Creating a new environment¶
Do not use a login node for creating Conda environments
Creating a Conda environment requires a reasonably large amount of time and
memory. Do not use a login node for creating a Conda environment. Please use
an interactive qlogin session.
Environment creation is single-core, so we recommend requesting 1 core with 8GB RAM for 24 hours (1 hour is often not enough for more complex environments):
qlogin -l h_vmem=8G -l h_rt=24:0:0
Let's create a new environment:
$ mamba create --quiet --yes --name myenv
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Here we have specified that we want a new environment called "myenv",
without requesting any packages. Suppose we wanted a specific
point release of Python installed. We could do:
$ mamba create --quiet --yes --name myenv python=3.11.0
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
Let's see which version of python we have in our $PATH:
$ which python
/share/apps/rocky9/general/apps/miniforge/24.7.1/bin/python
$ python -V
Python 3.12.6
Let's activate the new environment. The asterisk denotes the currently active environment:
$ mamba activate myenv
$ mamba env list
# conda environments:
#
myenv * /data/scratch/abc123/anaconda/envs/myenv
base /share/apps/rocky9/general/apps/miniforge/24.7.1
And let's check python again:
$ which python
/data/scratch/abc123/anaconda/envs/myenv/bin/python
$ python -V
Python 3.11.0
The following command will remove an unwanted environment, to save disk space:
$ mamba env remove -n myenv
Remove all packages in environment /data/scratch/abc123/anaconda/envs/myenv:
Everything found within the environment (/data/scratch/abc123/anaconda/envs/myenv),
including any conda environment configurations and any non-conda files, will be
deleted. Do you wish to continue?
(y/[n])? y
Installing packages using mamba¶
mamba and conda
The examples below use mamba,
which is an improved solver, but conda will also still work (and has
integrated the mamba solver since version 23.10.0).
We still recommend using mamba where possible; it should be a drop-in
replacement for all conda commands.
You can install other packages from Conda into your environment.
Packages must be installed into activated environments
Packages must be installed into your personal environments. If package installs are attempted without first activating an environment, a permission error will be shown.
This example demonstrates installation of the scipy package:
$ mamba activate myenv
(myenv) $ mamba install --quiet --yes scipy
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done
(myenv) $ python -c "import scipy; print(scipy.__version__)"
1.14.1
The mamba list command will show all of the packages installed into your
environment.
Example jobs¶
Threading with OpenMP
Some Python packages use OpenMP for threading. For serial jobs, the
module sets the variable OMP_NUM_THREADS to be the number of requested
slots in the current job, if the variable is unset when loading the module.
This avoids an issue where some packages incorrectly use too many threads
for OpenMP work.
If you are using OpenMP with such packages or in your own code, you should
check that the OMP_NUM_THREADS variable has been set correctly, or
override this value manually, either before or after loading the module.
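The behaviour described above can be mirrored in your own job script if you want to be explicit about the thread count. The following is a sketch only and is not the module's actual implementation: the function name set_omp_threads and the fallback to 1 thread outside a job are assumptions.

```shell
# Keep an existing OMP_NUM_THREADS; otherwise fall back to NSLOTS
# (set by the scheduler inside a job), or to 1 when NSLOTS is also unset.
# Note: ${VAR:-default} treats an empty value the same as an unset one.
set_omp_threads() {
    export OMP_NUM_THREADS="${OMP_NUM_THREADS:-${NSLOTS:-1}}"
}

set_omp_threads
echo "OMP_NUM_THREADS=${OMP_NUM_THREADS}"
```

To override the value manually instead, export OMP_NUM_THREADS with the number you want before running your Python code.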
Serial job¶
Here is an example job running on 1 core:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G
module load miniforge
mamba activate myenv
python example.py
Serial job demonstrating environment switching¶
This example shows that you can switch Conda environments in a job script:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G
module load miniforge
mamba env list
mamba activate myenv
mamba env list
If we view the job output:
# conda environments:
#
myenv /data/scratch/abc123/anaconda/envs/myenv
base * /share/apps/rocky9/general/apps/miniforge/24.7.1
# conda environments:
#
myenv * /data/scratch/abc123/anaconda/envs/myenv
base /share/apps/rocky9/general/apps/miniforge/24.7.1
Here we can see from the output of the mamba env list commands
that we are successfully switching environments inside a job script.
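Rather than inspecting the job output by eye, a job script can stop early when the expected environment is not active. The sketch below relies on the CONDA_DEFAULT_ENV variable, which conda and mamba set on activation; the helper name require_env is hypothetical. It assumes the environment lives in one of your configured envs_dirs, so that the variable holds the environment name rather than a full path.

```shell
# Return 0 only if the named environment is the active one.
# (Hypothetical helper for illustration only.)
require_env() {
    local expected="$1"
    if [ "${CONDA_DEFAULT_ENV:-}" != "$expected" ]; then
        echo "expected environment '${expected}', found '${CONDA_DEFAULT_ENV:-none}'" >&2
        return 1
    fi
}

# In a job script, after `mamba activate myenv`:
# require_env myenv || exit 1
```

With this guard in place, a failed activation ends the job immediately instead of running your code against the base installation.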