Pigz
pigz is a parallel implementation of gzip for modern multi-processor, multi-core machines.
pigz is available as a module on Apocrita.
Usage
$ module load pigz
$ pigz --help
Usage: pigz [options] [files ...]
will compress files in place, adding the suffix '.gz'. If no files are
specified, stdin will be compressed to stdout. pigz does what gzip does,
but spreads the work over multiple processors and cores when compressing...
Example of compressing data with pigz:
pigz -p ${NSLOTS} data.tar
Core Utilisation
Use the -p ${NSLOTS} flag, as shown above, so that pigz uses exactly the number of cores allocated to your job.
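A common variation is to archive and compress a directory in one step, streaming the tar output through pigz so that no intermediate .tar file is written. The following is a minimal sketch rather than a prescribed workflow: the `results` directory and its contents are hypothetical placeholders, `NSLOTS` is assumed to default to 1 outside a scheduler job, and the gzip fallback is an assumption added so the snippet also runs where pigz is not on the PATH.

```shell
# Hypothetical sample data so the sketch runs as-is; replace with your own directory.
mkdir -p results
printf 'sample output\n' > results/sample.txt

# NSLOTS is set by the scheduler inside a job; assume 1 core elsewhere.
threads="${NSLOTS:-1}"

# Prefer pigz; fall back to gzip (same .gz format, single core) if pigz is not loaded.
if command -v pigz >/dev/null 2>&1; then
    compress() { pigz -p "${threads}" "$@"; }
else
    compress() { gzip "$@"; }
fi

# Stream the archive through the compressor: no intermediate results.tar is written.
tar -cf - results | compress > results.tar.gz

# Check the integrity of the compressed file without extracting it.
compress -t results.tar.gz
```

Unlike compressing in place, this leaves the original directory untouched, so remove it separately once the archive has been verified.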
Example of decompressing data with pigz:
unpigz -p ${NSLOTS} data.tar.gz
Decompression is single-threaded only
Decompression cannot easily be parallelised, so pigz uses a single thread for the decompression itself, plus three other threads for reading, writing and checksum calculation, which can speed up decompression in some circumstances. Any cores allocated beyond those needed by these threads will be wasted.
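A compressed tar archive can also be unpacked without writing the intermediate .tar file, by sending the decompressed stream straight into tar. The sketch below is illustrative only: `dataset` and `extracted` are hypothetical names, the archive is created on the spot so the snippet is self-contained, and the gzip fallback is an assumption for machines where pigz is not on the PATH.

```shell
# Hypothetical archive so the sketch runs as-is; replace with your own .tar.gz file.
mkdir -p dataset
printf 'sample input\n' > dataset/sample.txt
tar -cf - dataset | gzip > dataset.tar.gz

# NSLOTS is set by the scheduler inside a job; assume 1 core elsewhere.
threads="${NSLOTS:-1}"

# Prefer pigz; fall back to gzip if pigz is not loaded.
if command -v pigz >/dev/null 2>&1; then
    decompress() { pigz -dc -p "${threads}" "$@"; }
else
    decompress() { gzip -dc "$@"; }
fi

# Decompress to stdout and unpack in one pass; dataset.tar.gz is left untouched.
mkdir -p extracted
decompress dataset.tar.gz | tar -xf - -C extracted
```

Since the decompression itself runs on a single thread, requesting a large number of cores for a job like this is unlikely to make it finish sooner.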
Example job
Serial job
Here is an example job running on 4 cores:
#!/bin/bash
#$ -cwd
#$ -j y
#$ -pe smp 4
#$ -l h_rt=1:0:0
#$ -l h_vmem=2G

# Load pigz, then compress using the number of cores granted to the job
module load pigz
pigz -p ${NSLOTS} data.tar