Cell Ranger

Cell Ranger is a set of analysis pipelines that process Chromium single-cell RNA sequencing output to align reads, generate gene-cell matrices, and perform clustering and gene expression analysis.

Cell Ranger is available as a module on Apocrita.

Usage

To run the default installed version of Cell Ranger, simply load the cellranger module:

$ module load cellranger
$ cellranger

Usage:
    cellranger mkfastq
    cellranger count
    cellranger aggr
    cellranger reanalyze
    cellranger mkloupe
    cellranger mat2csv
    cellranger mkgtf
    cellranger mkref
    cellranger vdj
    cellranger mkvdjref
    cellranger testrun
    cellranger upload
    cellranger sitecheck

For more information about an analysis pipeline, pass the --help switch after the pipeline sub-command (e.g. cellranger count --help).

Operating modes

Cell Ranger can be run in different modes; the two most relevant for us are:

  • local (default)
  • sge

Local operating mode

The local mode executes the pipeline on a single machine, usually within a cluster job. Use this mode to run all pipeline stages inside a single job.

Cores and Memory Requests

To avoid overloading a node, pass the --localcores ${NSLOTS} option to restrict Cell Ranger to the number of cores allocated to the job, rather than all cores available on the node.

To prevent your job from being killed by the scheduler for using more memory than requested, pass the --localmem option to restrict Cell Ranger to the specified amount of memory (in GB) when executing pipeline stages, rather than its default of 90% of the total memory available.
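
As a minimal sketch of how the two options relate to the job request, the snippet below derives a --localmem value from the number of slots and the per-core memory request. The localmem_gb function and the MEM_PER_CORE_GB variable are hypothetical names introduced here for illustration, not part of Cell Ranger, and the sketch assumes that MEM_PER_CORE_GB is kept in sync by hand with the h_vmem value requested in the job script.

```shell
#!/bin/bash
# Hypothetical helper (not part of Cell Ranger): total memory in GB
# available to the job, given the slot count and per-core memory in GB.
localmem_gb() {
    local slots="$1"
    local mem_per_core_gb="$2"
    echo $(( slots * mem_per_core_gb ))
}

# NSLOTS is set by the scheduler inside a job; fall back to 1 elsewhere.
# MEM_PER_CORE_GB is an assumed variable; it must match "#$ -l h_vmem=4G".
MEM_PER_CORE_GB=4
LOCALMEM=$(localmem_gb "${NSLOTS:-1}" "${MEM_PER_CORE_GB}")

# Print the options that would be appended to the cellranger command.
echo "--localcores ${NSLOTS:-1} --localmem ${LOCALMEM}"
```

Inside a job requesting 4 slots, the printed options could then be passed to cellranger count in place of hard-coded values. You may prefer a slightly lower --localmem value to leave some headroom below the scheduler limit.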

Multi-Job operating mode (sge)

The sge mode will launch each stage of the underlying Martian pipeline framework as a different Apocrita job using the qsub command. As jobs from each stage are queued, launched, and completed, the pipeline framework will track their states using the metadata files that each stage maintains in the pipeline output directory.

Core Requests and Consumption

The number of cores requested is determined by each stage of the pipeline framework; however, Martian jobs will run on only 1 core.

Memory Requests and Consumption

To specify the memory-per-core resource (h_vmem), pass the --mempercore switch. This value will be used in all pipeline framework stages.

Example jobs

Serial job (local)

Here is an example job running on 4 cores and 16GB of memory to generate single-cell gene counts for a single library, in local mode:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=4G

module load cellranger

cellranger count --id 1234 \
                 --transcriptome example_transcriptome \
                 --fastqs raw_data.fastq \
                 --sample 1234 \
                 --disable-ui \
                 --jobmode local \
                 --localcores ${NSLOTS} \
                 --localmem 16

Serial job (multi-job)

Here is an example job that generates single-cell gene counts for a single library in sge mode. The submitting job itself runs on 1 core with 1GB of memory; each stage is submitted as a separate job using the memory-per-core value set by --mempercore (4GB here).

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load cellranger

cellranger count --id 1234 \
                 --transcriptome example_transcriptome \
                 --fastqs raw_data.fastq \
                 --sample 1234 \
                 --disable-ui \
                 --jobmode sge \
                 --mempercore 4

Reference