Hive - SBCS Archive

[Image: Inside The Bee Hive - Jean Beaufort (CC0 1.0)]

The Hive is an archive storage server, accessible via SFTP, for the long-term storage of files. It has approximately 125TB of ZFS-based storage.

The Hive was purchased by SBCS and is managed by ITS Research. It is designed for longer-term storage of files. It is backed up only to a matching storage server, held in a different location and synchronised periodically. This mirrored server can be accessed read-only.

Data in the Hive is stored in folders labelled by research group, similar to the group shares on Apocrita.

The Hive is accessible via SFTP or SSH, and files can be copied on and off in much the same way as on the cluster. Logging in via SSH allows the use of standard Linux commands for file transfer only, not for further processing.


Access

To gain access to the Hive, you need to send your SSH public key to its-research-support@qmul.ac.uk.
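If you do not already have a key pair, one can be generated with ssh-keygen. A minimal sketch, using the OpenSSH default file locations; send only the public half (the .pub file), never the private key:

```shell
# Generate an ed25519 key pair (any OpenSSH key type works);
# the private key stays on your machine, protected by a passphrase.
ssh-keygen -t ed25519 -f ~/.ssh/id_ed25519

# This is the public key to email to its-research-support:
cat ~/.ssh/id_ed25519.pub
```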

It is possible, by agreement with SBCS, for other Schools and Institutes to gain access to the archive upon request. Please contact us at the above address.

The master server, which users can write to, is called hive-master.hpc.qmul.ac.uk.

The slave is called hive-02.hpc.qmul.ac.uk and can be used for read-only access to files; it usually runs about 5-10 minutes behind the master.
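The short host names used in the transfer examples below can be set up as aliases in ~/.ssh/config. A hypothetical fragment, assuming the placeholder username abc123:

```
# ~/.ssh/config -- hypothetical fragment; replace abc123 with your username
Host hive-master
    HostName hive-master.hpc.qmul.ac.uk
    User abc123
Host hive-02
    HostName hive-02.hpc.qmul.ac.uk
    User abc123
```

With this in place, commands such as "rsync ... hive-master:/..." resolve without typing the full hostname each time.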


Costing

SBCS levy a charge of £60 per TB per annum. Quotas are not currently enforced on the Hive.


Transferring Files

You can transfer files between the Hive and Apocrita using lftp, rsync or scp.

lftp is a command-line file transfer client with many useful features. To connect with lftp, type:

lftp sftp://hive-master.hpc.qmul.ac.uk

or

lftp fish://hive-master.hpc.qmul.ac.uk
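Once connected, lftp behaves like a classic FTP client, and its mirror command can transfer whole directory trees. A sketch of an interactive session; the group and directory names are hypothetical:

```
lftp sftp://hive-master.hpc.qmul.ac.uk
cd SBCS-ClareLab              # your group's directory on the Hive
mirror -R results/ results/   # upload the local results/ tree recursively
bye
```

Without -R, mirror downloads from the Hive instead of uploading to it.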

Files can be synchronised using rsync, where <source> is the file or directory you want to synchronise and <group> is the root of your group's directory on the Hive, e.g. SBCS-ClareLab:

rsync -avv -e ssh --progress <source> hive-master:/<group>/
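To restore files, the same command can be run in the other direction; reading from the read-only slave avoids putting load on the master. The paths here are hypothetical:

```
rsync -avv -e ssh --progress hive-02:/SBCS-ClareLab/results/ ./results/
```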

Similarly, scp can be used to copy a file:

scp file hive-master:/ITSR-General/

You can check how much space your group is using:

df /<group>

Use in jobs

While files stored on the Hive cannot be used directly within a job on the cluster, it is possible to copy them from the Hive onto GPFS or temporary file space, and to copy results back afterwards.
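The stage-in/stage-out pattern above can be sketched as a job script fragment. This is a hypothetical example, assuming $TMPDIR is the job's temporary file space and SBCS-ClareLab is your group directory; if compute nodes cannot reach the Hive directly, run the copies from a login node before and after the job instead:

```
#!/bin/bash
#$ -cwd

# Stage in: copy input data from the Hive to fast local scratch
rsync -a -e ssh hive-master:/SBCS-ClareLab/input/ "$TMPDIR/input/"

# ... run the analysis against $TMPDIR/input, writing to $TMPDIR/output ...

# Stage out: copy results back to the Hive for long-term storage
rsync -a -e ssh "$TMPDIR/output/" hive-master:/SBCS-ClareLab/output/
```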