MSMC
MSMC (the multiple sequentially Markovian coalescent) is a method for inferring population size and gene flow from multiple genome sequences.
MSMC is available as a module on Apocrita.
Usage
To run the default installed version of MSMC, simply load the msmc module:
module load msmc
For usage documentation, run msmc -h.
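Example job
MSMC can take several input files, one for each chromosome. Each line of an input file describes one segregating site in four columns: the chromosome, the position of the site, the number of sites that have been called since the last segregating site (counting the current site), and the alleles observed at the site, one character per haplotype. Where the phasing is ambiguous, alternative allele orderings are separated by commas, as in CCCT,CCTC.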
An example input file:
1 58432 63 TCCC
1 58448 16 GAAA
1 68306 15 CTTT
1 68316 10 TCCC
1 69552 8 GCCC
1 69569 17 TCCC
1 801848 9730 CCCA
1 809876 1430 AAAG
1 825207 1971 CCCT,CCTC
1 833223 923 TCCC
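The column layout described above lends itself to a quick sanity check before submitting a job. The following is a minimal sketch, assuming whitespace-separated columns as in the example; check_msmc_input is a hypothetical helper name and is not part of MSMC. It reports lines that do not have four columns, positions that do not increase within a chromosome, and called-site counts that are larger than the gap since the previous site.

```bash
#!/bin/bash
# Hypothetical helper (not part of MSMC): sanity-check one MSMC input file.
# Prints one message per problem and returns non-zero if any were found.
check_msmc_input() {
    awk '
        # Every line should have exactly four columns.
        NF != 4 {
            printf "line %d: expected 4 columns, found %d\n", NR, NF
            bad = 1
            next
        }
        # Positions must increase within a chromosome.
        have_prev && $1 == chrom && $2 <= pos {
            printf "line %d: position %d is not after %d\n", NR, $2, pos
            bad = 1
        }
        # The called-site count cannot exceed the distance to the previous site.
        have_prev && $1 == chrom && $2 > pos && $3 > $2 - pos {
            printf "line %d: %d called sites in a gap of %d bases\n", NR, $3, $2 - pos
            bad = 1
        }
        # The count includes the current site, so it is at least 1.
        $3 < 1 {
            printf "line %d: called-site count must be at least 1\n", NR
            bad = 1
        }
        { chrom = $1; pos = $2; have_prev = 1 }
        END { exit bad }
    ' "$1"
}
```

For example, check_msmc_input example.txt prints nothing and returns 0 for the file shown above.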
Serial job
Here is an example job running on 2 cores with 2GB of memory in total (1GB per core):
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 2
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G
module load msmc
msmc --fixedRecombination \
--outFilePrefix out \
--nrThreads ${NSLOTS} \
example.txt
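Since MSMC accepts one input file per chromosome, the same command can be given several files at once. The following is a rough sketch only: the chr*.txt naming scheme and the run_msmc_all function name are assumptions made for illustration, so adjust them to match your own data. The thread count falls back to 1 when NSLOTS is not set, for example outside a scheduled job.

```bash
#!/bin/bash
# Hypothetical wrapper: run msmc once over all per-chromosome files in a directory.
# The chr*.txt file naming is an assumption for illustration only.
run_msmc_all() {
    local dir=${1:-.}
    local inputs=("$dir"/chr*.txt)

    # If nothing matches, bash leaves the unexpanded pattern in the array.
    if [ ! -e "${inputs[0]}" ]; then
        echo "no chr*.txt files found in $dir" >&2
        return 1
    fi

    msmc --fixedRecombination \
         --outFilePrefix out \
         --nrThreads "${NSLOTS:-1}" \
         "${inputs[@]}"
}
```

Within a job script, call run_msmc_all with the directory holding the input files after loading the msmc module.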