Pandoc

Pandoc is a versatile document conversion tool which can convert between numerous markup and word processing formats, including Markdown, HTML, LaTeX and PDF.

Pandoc is available as a module on Apocrita.

Pandoc is provided as a container

Pandoc depends on a large number of packages, which can only practically be installed inside a container. The pandoc command works seamlessly on the cluster; you can also access the additional related packages installed in the container, as demonstrated in the advanced example.

Usage

To run the default installed version of pandoc, simply load the pandoc module:

$ module load pandoc
$ pandoc --help
usage: pandoc [OPTIONS] [FILES]
  -f FORMAT --from=FORMAT
  -t FORMAT --to=FORMAT
  -o FILENAME --output=FILENAME
  --list-input-formats
  --list-output-formats
...(output has been truncated)

For full usage documentation, run pandoc --help or see the user guide.

To convert a Markdown document to LaTeX with pandoc:

pandoc document.md -f markdown -t latex -s -o document.tex

Standalone option

When the -s/--standalone option is used, pandoc uses a template to add header and footer material that is needed for a valid self-standing document.
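
As a minimal sketch of the difference, the commands below convert the same input with and without -s, then count the lines containing a <head> element in each result. The scratch directory and file names are hypothetical, and the sketch assumes the pandoc module is already loaded; if pandoc is not on the PATH, it only creates the sample file.

```shell
# Hypothetical scratch directory and sample input, for illustration only
workdir="$(mktemp -d)"
printf '# Heading\n\nSome text.\n' > "$workdir/sample.md"

if command -v pandoc >/dev/null 2>&1; then
    # Without -s: an HTML fragment (body content only)
    pandoc "$workdir/sample.md" -f markdown -t html -o "$workdir/fragment.html"

    # With -s: a complete document built from pandoc's default HTML template.
    # A title is supplied because standalone HTML expects one.
    pandoc "$workdir/sample.md" -f markdown -t html -s \
        --metadata title="Sample" -o "$workdir/standalone.html"

    # Count the lines containing <head> in each file
    grep -c '<head>' "$workdir/fragment.html" "$workdir/standalone.html" || true
fi
```

The fragment should report no matches, while the standalone file should contain the header material added by the template.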

Example jobs

Serial jobs

Here are two example jobs each running on 1 core and 1GB memory:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load pandoc

# Convert markdown file to html
pandoc document.md -s -t html -o document.html

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load pandoc

# Convert LaTeX file to PDF
pandoc proposal.tex --from=latex --to=pdf -o proposal.pdf
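
Where a job needs to convert several files, the same pattern can be wrapped in a loop. The following is a sketch only: html_name and convert_all are hypothetical helper names, not part of pandoc or the module, and the sketch assumes it is placed in a job script after module load pandoc.

```shell
# html_name (hypothetical helper): swap a trailing .md for .html
html_name() {
    printf '%s\n' "${1%.md}.html"
}

# convert_all (hypothetical helper): convert each Markdown file given
# as an argument to a standalone HTML file alongside it
convert_all() {
    local src
    for src in "$@"; do
        # Skip arguments that do not exist, e.g. an unmatched *.md glob
        [ -e "$src" ] || continue
        pandoc "$src" -f markdown -t html -s -o "$(html_name "$src")"
    done
}
```

Calling convert_all *.md in the job script would then convert every Markdown file in the current directory within a single 1 core job.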

Advanced example

The Apptainer container also includes the additional LaTeX tools that pandoc requires. To use these directly (for example, pdflatex and pdfjoin), run the commands inside the container with apptainer exec, as below:

#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G

module load pandoc

# Use pdflatex to create a PDF
apptainer exec "$(which pandoc)" \
  pdflatex latex-doc.tex

# Join the output of the previous command to another PDF
apptainer exec "$(which pandoc)" \
  pdfjoin cover.pdf latex-doc.pdf -o final.pdf
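
If several container commands are needed, the repeated apptainer exec call could be wrapped in a small shell function. This is a sketch under the assumption, taken from the example above, that the pandoc command on the PATH is the container itself; in_pandoc_container is a hypothetical name, not something the module provides.

```shell
# in_pandoc_container (hypothetical helper): run the given command
# inside the container that the pandoc command resolves to
in_pandoc_container() {
    local image
    # command -v prints the path to pandoc, and fails if it is not on the PATH
    if ! image="$(command -v pandoc)"; then
        echo "pandoc not found: run 'module load pandoc' first" >&2
        return 1
    fi
    apptainer exec "$image" "$@"
}
```

With this defined, in_pandoc_container pdflatex latex-doc.tex would behave like the first command in the job above, and it fails with a clear message if the module has not been loaded.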
