PDFtoText
PDFtoText is a tool for converting Portable Document Format (PDF) files to plain text.
PDFtoText is available as a module on Apocrita.
Usage
To run the default installed version of PDFtoText, simply load the pdftotext module:
$ module load pdftotext
$ pdftotext --help
pdftotext version X.Y.Z
Usage: pdftotext [options] <PDF-file> [<text-file>]
-f <int> : first page to convert
-l <int> : last page to convert
-r <fp> : resolution, in DPI (default is 72)
...(output has been truncated)
For full usage documentation, run pdftotext --help or see the user guide.
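The -f and -l options listed in the help output can be combined to convert only part of a document. Below is a minimal sketch, assuming the pdftotext module is already loaded; the convert_pages wrapper and its argument order are hypothetical and used only for illustration:

```shell
# Hypothetical wrapper (not part of pdftotext): convert pages FIRST..LAST
# of a PDF file to a text file, using the -f and -l options.
convert_pages() {
    local first=$1 last=$2 pdf=$3 txt=$4
    pdftotext -f "$first" -l "$last" "$pdf" "$txt"
}
```

For example, `convert_pages 2 5 input-file.pdf pages-2-5.txt` would write pages 2 to 5 of input-file.pdf to pages-2-5.txt.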
Example job
Here is an example job running on 1 core with 1GB of memory:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 1
#$ -l h_rt=1:0:0
#$ -l h_vmem=1G
module load pdftotext
# Convert PDF file to text
pdftotext input-file.pdf output-file.txt
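To convert several files in one job, the single pdftotext command in the script above could be replaced by a loop. The following is a sketch only: the function names txt_name and convert_all are hypothetical, and it assumes that every input file name ends in .pdf and that the pdftotext module has already been loaded:

```shell
# Hypothetical helper: derive the output name by swapping .pdf for .txt
txt_name() {
    printf '%s\n' "${1%.pdf}.txt"
}

# Hypothetical helper: convert every PDF in a directory
# (defaults to the current directory)
convert_all() {
    local dir=${1:-.} pdf
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when no PDF files are present
        [ -e "$pdf" ] || continue
        pdftotext "$pdf" "$(txt_name "$pdf")"
    done
}
```

Calling `convert_all` after `module load pdftotext` would then write one .txt file next to each PDF in the working directory. The requested runtime (h_rt) may need increasing for large batches.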