
Stata

Stata is a general-purpose statistical software package that provides data management, statistical analysis, graphics, simulations, regression, and custom programming.

Stata is available as a module on Apocrita.

Usage

To run the latest installed version of Stata, simply load the stata module:

$ module load stata
$ stata -h
stata-mp:  usage:  stata-mp [-h -q -s -b] ["stata command"]
        where:
             -h      show this display
             -q      suppress logo, initialization messages
             -s      "batch" mode creating .smcl log
             -b      "batch" mode creating .log file

Core Usage

To ensure that Stata uses the correct number of cores, use the set processors command, for example:

$ stata
. set processors 4
The maximum number of processors or cores being used is changed from 8 to 4.
It can be set to any number between 1 and 8

Licensing

When running Stata, request a licence using the -l stata (Stata 13) or -l stata15 (Stata 15) scheduler complex.

Stata 13

The Stata/MP perpetual licence on Apocrita for Stata 13 allows for up to 8 cores per process and 6 concurrent users.

Stata 15

The Stata/MP perpetual licence on Apocrita for Stata 15 allows for up to 16 cores per process and 11 concurrent users.
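The version-specific values above can be kept in one place so that job scripts request the matching complex and stay within the per-process core limit. The following is a minimal sketch only: the function names stata_complex and stata_max_cores are hypothetical helpers, not part of Stata or the scheduler, and the values are copied from the licence details on this page.

```shell
# Hypothetical helper: map a Stata major version to its scheduler complex.
stata_complex() {
    case "$1" in
        13) echo "stata" ;;
        15) echo "stata15" ;;
        *)  echo "unknown Stata version: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: per-process core limit of the licence for each version.
stata_max_cores() {
    case "$1" in
        13) echo 8 ;;
        15) echo 16 ;;
        *)  echo "unknown Stata version: $1" >&2; return 1 ;;
    esac
}

# Example: print (without running) the interactive request for Stata 15.
echo "qlogin -l $(stata_complex 15)"
```

Printing the command rather than running it keeps the sketch safe to try on a login node.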

Example jobs

Licence complex

You need to add the correct licence complex to all Stata jobs. See the Licensing section above for more information.

Interactive jobs

Interactive mode can be useful for diagnosing issues with your Stata code.

Stata 13 Interactive Example

Run qlogin then load the module and run the binary:

$ qlogin -l stata
$ module load stata/13
$ stata-mp
. set processors 1

Stata 15 Interactive Example

Run qlogin then load the module and run the binary:

$ qlogin -l stata15
$ module load stata/15
$ stata-mp
. set processors 1

Serial job (batch mode)

Batch mode preferred

HPC clusters are designed to run queued jobs from the command line, so batch jobs are preferred over interactive jobs.

Stata 13 Batch Example

Here is a Stata 13 example job running on 4 cores with 4GB of total memory (1GB per core):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_vmem=1G
#$ -l h_rt=1:0:0
#$ -l stata

module load stata/13

# Run example.do in batch mode, passing the number of cores as its first argument
stata-mp -b do example.do ${NSLOTS}

Stata 15 Batch Example

Here is a Stata 15 example job running on 4 cores with 4GB of total memory (1GB per core):

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_vmem=1G
#$ -l h_rt=1:0:0
#$ -l stata15

module load stata/15

# Run example.do in batch mode, passing the number of cores as its first argument
stata-mp -b do example.do ${NSLOTS}
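With the -b option, output goes to a .log file (example.log for example.do) rather than to the terminal, so it can help to check that log once the run finishes. The sketch below assumes that Stata errors appear in the log as a return-code line such as r(198); and that the exit status of the batch run alone may not reveal a failure. The function name stata_log_ok is a hypothetical helper.

```shell
# Hypothetical helper: succeed only if the batch log has no error return code.
# Assumes an error shows up as a line of the form "r(<number>);" in the log.
stata_log_ok() {
    log_file="$1"
    if [ ! -f "$log_file" ]; then
        echo "missing log: $log_file" >&2
        return 2
    fi
    if grep -Eq '^r\([0-9]+\);' "$log_file"; then
        echo "Stata reported an error in $log_file" >&2
        return 1
    fi
    return 0
}

# Possible use at the end of a job script, after the stata-mp line:
#   stata_log_ok example.log || exit 1
```

Exiting non-zero lets the scheduler record the job as failed instead of silently completing.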

Example Input File

A sample example.do file, using data from the Stata 15 release (also works in Stata 13):

args ncores
set processors `ncores'

use http://www.stata-press.com/data/r15/census5
tabulate region
summarize marriage_rate divorce_rate median_age if state!="Nevada"

Using variables in Stata

Note that references to ncores and other Stata local macros are enclosed in a leading backtick ` and a trailing single quote ', as in `ncores'.

Custom Stata commands using "ado" files

Users can extend the Stata programming language by writing custom commands, which are saved as ado files. Stata searches for a matching ado file when a command that is not built in is executed.

Use the sysdir command to list where Stata searches for ado files. On Apocrita, the PERSONAL directory is expected at ~/ado/personal/, which needs to exist. If the directory does not exist, execute:

mkdir -p ~/ado/personal

You can then save your "ado" files in that folder and Stata should be able to find them.
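As a minimal sketch of the steps above, the following creates the PERSONAL directory and writes a one-command ado file into it. The command name hello, its message, and the ADO_DIR variable are illustrative assumptions; the file name must match the command name for Stata to find it.

```shell
# ADO_DIR defaults to the PERSONAL directory named above; it is an
# illustrative variable that can be overridden to try the sketch elsewhere.
ADO_DIR="${ADO_DIR:-$HOME/ado/personal}"
mkdir -p "$ADO_DIR"

# Write a hypothetical command called "hello" to hello.ado.
cat > "$ADO_DIR/hello.ado" <<'EOF'
program define hello
    display "Hello from a personal ado file"
end
EOF

# List the directory to confirm the file is in place.
ls "$ADO_DIR"
```

After this, typing hello at the Stata prompt should run the new command, provided sysdir reports the same PERSONAL directory.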

References