R

R is an interpreted computer language for statistical computing and graphics including: linear and generalised linear models, non-linear regression models, time series analysis, classical parametric and non-parametric tests.

For efficiency, procedures written in the C, C++, or FORTRAN languages may be interfaced within R in addition to the user-visible R functions.

R is available as a module on Apocrita.

Usage

To run the latest installed version of R, simply load the R module:

$ module load R
$ R --help

Usage: R [options] [< infile] [> outfile]
   or: R CMD command [arguments]

$ Rscript --help
Usage: /path/to/Rscript [--options] [-e expr [-e expr2 ...] | file] [args]

For further usage documentation, see the full output of R --help or the manual pages: man R and man Rscript.

R can be run interactively by starting R, or non-interactively using Rscript. The Rscript binary provides an alternative front end to the legacy R CMD BATCH method for running R commands in a non-interactive shell.

Example jobs

Do not use detectCores() for makeCluster()

detectCores() is not cluster aware: it returns the total number of cores on the machine, not the number of cores actually allocated to your job.

The documentation for R covers this:

"This is not suitable for use directly for ... specifying the number of cores in makeCluster.
First because it may return NA, second because it does not give the number of allowed cores"

We advise specifying the number of cores directly:

cluster <- makeCluster(4)
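
As an alternative to hard-coding the number, the core count can be read from the job environment. The following is a minimal sketch, assuming the script runs inside a Grid Engine job where the scheduler sets the NSLOTS environment variable to the number of cores requested with -pe smp; the fallback to 1 core when NSLOTS is unset is our own assumption for runs outside a job:

```r
library(parallel)

# NSLOTS is set by the Grid Engine scheduler to the number of cores
# requested with "-pe smp N". The fallback of 1 is an assumption for
# runs outside a job, where NSLOTS is unset.
n_cores <- as.integer(Sys.getenv("NSLOTS", unset = "1"))

cluster <- makeCluster(n_cores)

# Toy workload: square the numbers 1 to 8 across the workers.
squares <- parSapply(cluster, 1:8, function(x) x^2)
print(squares)

# Always release the workers when finished.
stopCluster(cluster)
```

This keeps the cluster size in step with the job request, so changing -pe smp in the job script does not require editing the R code.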

Serial job

Here is an example job running on 2 cores and 4GB total memory (memory is requested per core, so 2 cores at 2G each):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 2
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G

module load R
Rscript myscript.R

Using Rscript within Apocrita jobs

When running R jobs on Apocrita, use Rscript to run a file containing R commands; R output will be written to the stdout and stderr streams.
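
To illustrate which stream each kind of output goes to, here is a minimal sketch of what a script such as myscript.R might contain. The contents are hypothetical: the optional sample-size argument and the default of 100 are assumptions made for the example only.

```r
# Hypothetical contents of myscript.R.
# Arguments given after the script name on the Rscript command line
# are available through commandArgs(trailingOnly = TRUE).
args <- commandArgs(trailingOnly = TRUE)

# Use the first argument as the sample size, defaulting to 100.
n <- if (length(args) >= 1) as.integer(args[1]) else 100L

set.seed(1)
x <- rnorm(n)

# cat() and print() write to stdout.
cat("Mean of", n, "samples:", mean(x), "\n")

# message() and warning() write to stderr.
message("Finished processing ", n, " samples")
```

With this sketch, running Rscript myscript.R 500 would use 500 samples. Because the example job script above sets -j y, the stderr stream is merged into the same job output file as stdout.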

Additional R Packages

Additional packages can be installed to extend the capabilities of R. Packages are stored in libraries.

  • To display the list of active libraries, execute:
> .libPaths()
[1] "/share/apps/centos7/R/<VERSION>/lib64/R/library"
  • To display the list of currently loaded packages (including default packages), execute:
> (.packages())
[1] "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"
[7] "base"
  • To display all packages available in the currently loaded libraries, execute:
> .packages(all.available = TRUE)
 [1] "base"       "boot"       "class"      "cluster"    "codetools"
 [6] "compiler"   "datasets"   "foreign"    "graphics"   "grDevices"
...
  • To install additional R packages, execute:
> install.packages("<package_name>")

Packages which are not included in the R main library will be added to the personal library.

Creating a personal library

If a personal library has not already been created, respond with y when prompted to create and use a personal library. When asked to choose a CRAN mirror, select the secure (HTTPS) mirror closest to you.

  • To load an R package, execute:
> library("<package_name>")
  • To find the location where a package is installed, execute:
> find.package("<package_name>")
[1] "/path/to/package"
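
The prompts described above only appear in an interactive session. As a rough sketch of how the same setup could be done without prompts (for example from a script run with Rscript), the personal library can be created explicitly. This assumes R has set the R_LIBS_USER environment variable to its default personal library path, which it normally does at startup; the mirror URL shown is one commonly used choice, not a site recommendation.

```r
# R_LIBS_USER holds the default personal library path for this R version.
user_lib <- path.expand(Sys.getenv("R_LIBS_USER"))

# Create the personal library if it does not exist yet, and put it
# at the front of the library search path.
dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)
.libPaths(c(user_lib, .libPaths()))

# Install without prompts by naming the target library and a mirror.
# Left commented out because it needs network access;
# "<package_name>" is a placeholder.
# install.packages("<package_name>", lib = user_lib,
#                  repos = "https://cloud.r-project.org")
```

Once the directory exists, later R sessions should pick it up automatically, and .libPaths() will list it ahead of the main library.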

Bioinformatics packages for R

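Bioinformatics packages for R can be installed from Bioconductor. On R 3.5 and later this is done with the BiocManager package, which replaces the older biocLite.R script:

# Install BiocManager from CRAN if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("adegenet")
library(adegenet)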