HPC Cloud Service

IMSS offers a High Performance Computing (HPC) pay-as-you-go service in the cloud. With Caltech's HPC service, faculty and research groups who cannot expend large financial resources on cluster hardware and infrastructure now have the opportunity to run on a locally managed and supported HPC system.

Our Cloud HPC consists of a base of 60 nodes, each containing dual quad-core Xeon processors at 2.66 GHz, 32 GB of memory, and up to 1 TB of local scratch storage. We offer high-performance NVIDIA GPU-enabled nodes as well as high-speed InfiniBand and standard Gigabit Ethernet. Special high-memory nodes can be made available by request. Cloud HPC users can choose between high-speed NFS and the Panasas parallel file system. Upgrades planned for the near future will provide much faster CPU, memory, network, and storage.
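As a rough illustration of the base configuration described above, the aggregate core count and memory can be worked out as follows. This is a minimal sketch that assumes all 60 base nodes are identical and leaves out the GPU and high-memory nodes; the variable names are illustrative only.

```python
# Aggregate capacity of the base cluster, using the figures quoted above.
# Assumption: all 60 base nodes share the same configuration.
NODES = 60
SOCKETS_PER_NODE = 2      # "dual" processors per node
CORES_PER_SOCKET = 4      # quad-core Xeon
MEMORY_GB_PER_NODE = 32

total_cores = NODES * SOCKETS_PER_NODE * CORES_PER_SOCKET
total_memory_gb = NODES * MEMORY_GB_PER_NODE

print(f"Total CPU cores: {total_cores}")
print(f"Total memory:    {total_memory_gb} GB")
```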

The variety of applications available on the Cloud HPC is being extended all the time. Applications range from open-source products such as NAMD and StochKit to closed-source and privately licensed applications such as Lumerical's FDTD and MATLAB (2011b). If you require additional applications to be installed, we would be happy to work with you.

Job output is copied back from the remote HPC cluster to your local IMSS Unix home directory. This directory is conveniently accessible from the IMSS Unix cluster, via scp from another Linux machine, or via Windows file sharing.

Setup requires a PTA and a list of the IMSS accounts that should have access to the cluster. Billing occurs at the end of each month.

Get Started
To start using the Cloud HPC cluster, contact us at http://help.caltech.edu (request type IMSS-->High Performance Computing) with an interdepartmental account number (PTA/poeta) and a list of the IMSS user accounts that should have access. We will enable your access.caltech account to connect to the HPC resources.


* The IMSS HPC Cloud service does not provide backups for data kept on the cluster. It is up to the end user to back up data. We highly recommend copying data from the cluster to another location on a regular basis to avoid the risk of data loss. *
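
One way to follow the backup recommendation above is a small script, run on a regular schedule from the machine that will hold the copies. The sketch below is only an illustration, not an IMSS-provided tool: the host name hpc.example.edu, the remote path, and the function names are hypothetical placeholders, and it assumes scp is installed and that you can already log in to the remote host.

```python
import subprocess
from datetime import date
from pathlib import Path
from typing import List


def build_backup_command(remote: str, local_root: Path, stamp: str) -> List[str]:
    """Build an scp command that copies `remote` into a dated local folder.

    `remote` uses scp's user@host:path form. The host and path passed in
    are placeholders chosen by the caller, not real IMSS names.
    """
    target = local_root / stamp
    # -r copies directories recursively; -p preserves times and modes.
    return ["scp", "-r", "-p", remote, str(target)]


def run_backup(remote: str, local_root: Path) -> None:
    """Copy the remote directory into local_root/<today's date>."""
    local_root.mkdir(parents=True, exist_ok=True)
    command = build_backup_command(remote, local_root, date.today().isoformat())
    # check=True raises CalledProcessError if scp exits with an error.
    subprocess.run(command, check=True)


# Example call (hypothetical host and paths):
# run_backup("username@hpc.example.edu:results", Path.home() / "hpc-backups")
```

Keeping each copy in a folder named after the date makes it easy to see when the last backup ran and to remove old copies by hand.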