BUSCO

Benchmarking Universal Single-Copy Orthologs (BUSCO) provides quantitative measures for assessing the completeness of genome assemblies, gene sets, and transcriptomes.

Installation

Conda installation

BUSCO can be installed from the Bioconda channel. Load the Miniforge module, create a Conda environment, and install BUSCO into it (output below truncated):

$ module load miniforge
$ mamba create --quiet --yes --name busco_env
$ mamba activate busco_env
(busco_env) $ mamba install bioconda::busco

Looking for: ['bioconda::busco']
...
  Updating specs:

   - bioconda::busco
...
Confirm changes: [Y/n] Y
...
Downloading and Extracting Packages:

Preparing transaction: done
Verifying transaction: done
Executing transaction: done

Docker container via Apptainer

BUSCO is also available as a pre-built Docker container on Docker Hub:

https://hub.docker.com/r/ezlabgva/busco

Follow the instructions for running containers from external sources to pull the image. You must specify a release tag, as there is no latest tag at the time of writing, e.g.:

$ APPTAINERENV_NSLOTS=${NSLOTS} apptainer pull docker://ezlabgva/busco:v5.8.2_cv1
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
...
INFO:    Creating SIF file...

This will save the container as a *.sif file in the current working directory:

$ ls *.sif
busco_v5.8.2_cv1.sif
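
Before submitting jobs that use the container, it can help to confirm that the image is where you expect it to be. The following is a minimal sketch; check_sif is a hypothetical helper name, not part of BUSCO or Apptainer, and the file name assumes the pull command above was run in the current directory.

```shell
#!/bin/bash
# check_sif is a hypothetical helper: it succeeds only if the given
# container image exists and is not empty.
check_sif() {
    if [ -s "$1" ]; then
        echo "Found container image: $1"
    else
        echo "Missing container image: $1" >&2
        return 1
    fi
}

# File name assumed from the pull command above.
check_sif "busco_v5.8.2_cv1.sif" || echo "Pull the image before submitting jobs"
```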

Usage

Conda usage

To run the Conda-installed version of BUSCO, load the Miniforge module and activate the Conda environment you installed it into:

$ module load miniforge
$ mamba activate busco_env
(busco_env) $ busco --help
usage: busco -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]

Apptainer container usage

To run an Apptainer container of BUSCO, use the apptainer run command (using the container's full path if it isn't in the current directory):

$ apptainer run /path/to/busco_v5.8.2_cv1.sif busco --help
usage: busco -i [SEQUENCE_FILE] -l [LINEAGE] -o [OUTPUT_NAME] -m [MODE] [OTHER OPTIONS]
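
To make the usage line above more concrete, here is a sketch that fills in the bracketed placeholders. All values are assumptions for illustration: genome.fna and busco_genome are hypothetical names, and bacteria_odb10 is one example lineage dataset that may not suit your organism. The function only prints the command so that it can be reviewed first.

```shell
#!/bin/bash
# Hypothetical example values; replace them with your own.
INPUT="genome.fna"          # sequence file to assess
LINEAGE="bacteria_odb10"    # example lineage dataset
OUTPUT="busco_genome"       # name for the output directory
MODE="genome"               # genome, transcriptome or proteins

# Dry run: print the command instead of executing it.
# Remove the leading "echo" to run BUSCO for real.
run_busco() {
    echo busco -i "${INPUT}" -l "${LINEAGE}" -o "${OUTPUT}" -m "${MODE}"
}

run_busco
```

For the container, the same command would be prefixed with apptainer run and the path to the *.sif file, as shown above.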

Example jobs

Serial jobs

Here are example jobs running on 1 core and 1GB of memory:

Conda installation - single core

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load miniforge
mamba activate busco_env

busco -i [SEQUENCE_FILE] \
      -m [MODE] \
      [OTHER OPTIONS]

Apptainer container - single core

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

apptainer run \
    /path/to/busco_v5.8.2_cv1.sif \
        busco -i [SEQUENCE_FILE] \
        -m [MODE] \
        [OTHER OPTIONS]

Core Utilisation

Pass -c ${NSLOTS} so that BUSCO uses the number of cores allocated to your job, which is available in the $NSLOTS environment variable:

-c N, --cpu N         Specify the number (N=integer) of threads/cores to use.

Here are example jobs running on 4 cores and 4GB of memory (1GB per core):

Conda installation - multi-core

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load miniforge
mamba activate busco_env

busco -i [SEQUENCE_FILE] \
      -m [MODE] \
      -c ${NSLOTS} \
      [OTHER OPTIONS]

Apptainer container - multi-core

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

apptainer run \
    /path/to/busco_v5.8.2_cv1.sif \
        busco -i [SEQUENCE_FILE] \
        -m [MODE] \
        -c ${NSLOTS} \
        [OTHER OPTIONS]
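
The multi-core examples above rely on $NSLOTS, which is only set inside a scheduled job. If you also run the same script outside a job, for example in a short interactive test, a fallback value avoids passing an empty argument to -c. This is a minimal sketch; THREADS is a hypothetical variable name and the default of 1 is an assumption.

```shell
#!/bin/bash
# Use the scheduler-provided slot count when available,
# otherwise fall back to a single core (assumed default).
THREADS="${NSLOTS:-1}"
echo "BUSCO would be run with -c ${THREADS}"

# The BUSCO call would then use the variable, e.g.:
#   busco -i [SEQUENCE_FILE] -m [MODE] -c "${THREADS}" [OTHER OPTIONS]
```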

References