Software on XStream

Operating system

XStream is a Linux GPU cluster running Red Hat Enterprise Linux 6.9.

Using Modules

Modules provide a convenient way to dynamically change the user's environment through modulefiles. This includes easily adding directories to, or removing them from, the PATH environment variable.

Lmod is used as a replacement for the original module command. For more information, please take a look at the Lmod user guide.

Listing modules

Check the modules that can be directly loaded using:

$ module avail

On XStream, modules follow a hierarchical module naming scheme, so only packages that can be directly loaded are displayed by module avail.

You can list all available modules using the spider sub-command:

$ module spider

Using the full name will give you details on how to load the module by listing any required dependencies:

$ module spider FFTW/3.3.4

Also, you can use module list to see currently loaded packages.

Loading packages

To load packages:

$ module load package1 package2 ...

Note: order matters when loading modules (e.g. the same package may be available as separate builds for different compilers).
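As a sketch of how the hierarchy behaves, the following assumes FFTW/3.3.4 is provided in the module tree as in the module spider example above; module names are illustrative:

```shell
# FFTW only becomes directly loadable once a compiler toolchain is loaded
module avail FFTW        # may show nothing before a toolchain is loaded
module load foss/2015.05
module load FFTW/3.3.4   # now resolves against the foss (GCC) build
```

If you later swap toolchains (module swap foss intel), Lmod will try to reload dependent modules against the new compiler.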

Again, take a look at the Lmod user guide for more info.


Compiler toolchains

Compiler toolchains are sets of compilers bundled with libraries that provide additional support commonly required to build software. In the HPC world, this usually consists of a library for MPI (inter-process communication over a network), BLAS/LAPACK (linear algebra routines) and FFT (Fast Fourier Transforms).

List of available compiler toolchains on XStream:

  • foss/2015.05: the FOSS toolchain is a GNU Compiler Collection (GCC) based compiler toolchain, including OpenMPI for MPI support, OpenBLAS (BLAS and LAPACK support), FFTW3 and ScaLAPACK. This version is based on GCC 4.9.2.
  • intel/2015.5.223: Intel Cluster Toolkit Compiler Edition provides Intel C/C++ and Fortran compilers, Intel MPI & Intel MKL. Based on Intel Parallel Studio XE 2015 update 5.

To load the FOSS toolchain:

$ module load foss/2015.05

To load the Intel compiler toolchain:

$ module load intel/2015.5.223

Or you can just use the following command to load the default Intel compiler toolchain:

$ module load intel

Note: loading a toolchain makes additional software available through module. Check module avail once your preferred compiler toolchain is loaded.

Intel Parallel Studio XE 2016

The 2016 edition of the Intel compilers is available on XStream as a separate module (for now, it is NOT a full compiler toolchain).

You can load it using:

$ module load intel/2016

Nvidia CUDA

The following versions of CUDA are available on XStream: 6.5.14, 7.0.28, 7.5.18 (default), 8.0.44 and 8.0.61.

For example, to load CUDA 8.0.61, please use:

$ module load CUDA/8.0.61
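After loading a CUDA module, a quick way to confirm which toolkit version is on your PATH (a sketch; no files are assumed):

```shell
# Confirm the CUDA compiler picked up from the loaded module
which nvcc
nvcc --version
```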

The Nvidia driver version is currently v375.66 on all nodes.

Usage examples

This section provides an overview of how to use pre-installed software managed by the Research Computing team in collaboration with XStream users.

OpenMPI with Intel compilers


$ module load intel/2015.5.223 OpenMPI/1.8.8
$ mpicc --showme cpi.c
icc cpi.c -I...
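A minimal build-and-run sketch, assuming cpi.c is an MPI program and you are inside a Slurm job allocation (file names are illustrative):

```shell
# Compile with the Intel-backed MPI wrapper, then run 4 ranks under Slurm
mpicc -O2 -o cpi cpi.c
srun -n 4 ./cpi
```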

OpenMPI with the GNU Compiler Collection

The foss toolchain includes gcc-based OpenMPI.


$ module load foss/2015.05
$ mpicc --showme cpi.c
gcc cpi.c -I...


ATLAS 3.10.2

ATLAS (Automatically Tuned Linear Algebra Software) is an application of the AEOS (Automated Empirical Optimization of Software) paradigm, with the present emphasis on the Basic Linear Algebra Subprograms (BLAS), a widely used, performance-critical linear algebra kernel library.


$ module load foss/2015.05 ATLAS/3.10.2-LAPACK-3.5.0
$ gcc ... -lcblas -latlas
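A minimal compile sketch, assuming a hypothetical source file mydgemm.c that calls CBLAS routines via <cblas.h>:

```shell
# Link against ATLAS's CBLAS interface (the source file name is illustrative)
gcc -O2 mydgemm.c -o mydgemm -lcblas -latlas -lm
```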


cuDNN 5.1

The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks.

Usage example:

$ module load CUDA/7.5.18 cuDNN/5.1-CUDA-7.5.18


GROMACS 5.1

GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.


$ module load intel/2015.5.223 CUDA/7.5.18 GROMACS/5.1-hybrid
$ gmx_mpi -h
                     :-) GROMACS - gmx_mpi, VERSION 5.1 (-:
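A run sketch, assuming a prepared run input file topol.tpr (illustrative name) and a Slurm allocation with one GPU:

```shell
# Short GPU-accelerated simulation run under Slurm
srun -n 1 --gres gpu:1 gmx_mpi mdrun -s topol.tpr -nsteps 1000
```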

NAMD 2.12

NAMD is a parallel molecular dynamics code designed for high-performance simulation of large biomolecular systems.

$ ml intel/2015.5.223 NAMD/2.12-smp-ibverbs-cuda

NVcaffe 0.15.13

NVcaffe is the Nvidia release of Caffe, a fast open framework for deep learning.


$ ml foss/2015.05 NVcaffe/0.15.13-CUDA-7.5.18-Python-2.7.9

OpenMM 6.3.1

OpenMM is a high performance toolkit for molecular simulation. On XStream, it is compiled against the foss toolchain (GCC 4.9.2) and CUDA 7.0.28.


$ ml foss/2015.05 OpenMM/6.3.1

PostgreSQL 9.5.2 with PG-Strom

PostgreSQL is an open source relational database management system (DBMS) developed by a worldwide team of volunteers.

PG-Strom is an extension of PostgreSQL designed to off-load several CPU-intensive workloads to GPU devices, to utilize their massively parallel execution capability.


$ module load foss/2015.05 CUDA/7.5.18 PostgreSQL/9.5.2-Python-2.7.9
$ pg_config --version
PostgreSQL 9.5.2
$ initdb -D $WORK/postgres/data

Edit $WORK/postgres/data/postgresql.conf and add the following line to load the PG-Strom extension:

shared_preload_libraries = '$libdir/pg_strom'

Start the PostgreSQL server:

$ pg_ctl -D $WORK/postgres/data -l logfile start

Connect using your login name and the password postgres and create the pg_strom extension:

$ psql -U $USER postgres
psql (9.5.2)
Type "help" for help.

postgres=# CREATE EXTENSION pg_strom;

Stop the PostgreSQL server:

$ pg_ctl -D $WORK/postgres/data stop

Please always use PostgreSQL within a job, not directly on the login nodes.
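To verify that the extension was created, a quick check (a sketch; assumes the server from the steps above is still running):

```shell
# List installed extensions; pg_strom should appear after CREATE EXTENSION
psql -U "$USER" postgres -c "SELECT extname FROM pg_extension;"
```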


PyTorch 0.2.0

PyTorch is a new deep learning framework with GPU acceleration. The early release 0.2.0 is now available on XStream as a special module that will load all required dependencies (Python 2.7.9, CUDA 8.0.41, cuDNN 5.1 and MAGMA 2.2.0).

Usage (Python 2.7):

$ ml pytorch/0.2.0
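A quick sanity check that the module works and sees a GPU (a sketch; run inside a GPU job, not on a login node):

```shell
# Print the PyTorch version and whether a CUDA device is visible
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```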

R 3.2.4

R is a free software environment for statistical computing and graphics.


$ ml foss/2015.05 R/3.2.4-libX11-1.6.3

RStudio 0.99.893

RStudio IDE is a powerful and productive user interface for R.


$ module load foss/2015.05 git/2.4.1 RStudio/0.99.893


TensorFlow is an open source software library for machine intelligence, originally developed by Google. It is built on XStream with CUDA 8 support.

Usage (Python 2.7):

$ ml tensorflow/1.5.0

Usage (Python 3.6):

$ ml tensorflow/1.5.0-cp36

Note: tensorflow is a special module that will load the foss toolchain automatically.
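A quick sanity check after loading either tensorflow module (a sketch):

```shell
# Print the TensorFlow version to confirm the module loaded correctly
python -c "import tensorflow as tf; print(tf.__version__)"
```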

Newer versions of TensorFlow should be run in Singularity containers.

TeraChem 1.9

TeraChem is general purpose quantum chemistry software designed to run on NVIDIA GPU architectures under a 64-bit Linux operating system.


$ module load terachem/1.9

Theano 0.9

Theano is a Python library that allows you to define, optimize, and evaluate mathematical expressions involving multi-dimensional arrays efficiently. It was built with support for libgpuarray.

Usage example without MPI support:

$ module load foss/2015.05 Theano/0.9.0-Python-2.7.9-noMPI

Usage example with MPI support:

$ module load foss/2015.05 Theano/0.9.0-Python-2.7.9
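Theano selects its compute device via the THEANO_FLAGS environment variable; a sketch for targeting the first GPU through the libgpuarray backend (run inside a GPU job):

```shell
# Select the first GPU via libgpuarray and confirm the configured device
THEANO_FLAGS="device=cuda0,floatX=float32" python -c "import theano; print(theano.config.device)"
```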


Torch

Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It has been compiled against the foss toolchain (GCC based) and cuDNN 4.0 (CUDA 7.5.18).


$ module load torch/20160805-4bfc2da
$ th

Note: torch is a special module that will load CUDA, cuDNN and the foss toolchain automatically.


VMD 1.9.2

VMD is a molecular visualization program for displaying, animating, and analyzing large biomolecular systems using 3-D graphics and built-in scripting.

VMD on XStream is built against CUDA 7.0.28 and Nvidia OptiX 3.8.0, enabling the TachyonL-OptiX GPU-accelerated ray tracing renderer available in VMD 1.9.2. At least the following features should also be available: ACTC library support, collective variables, Python support, Pthreads, NetCDF, ImageMagick, ffmpeg and NetPBM.

Prerequisite: use ssh X11 forwarding by adding -X to your ssh command when connecting to XStream.

For non-computationally expensive tasks with VMD, you may launch VMD from a login node:

$ module load foss/2015.05 VMD/1.9.2-Python-2.7.9
$ vmd

To perform computationally expensive tasks with VMD, please launch VMD using srun with the X11 option as shown here (example with 1 task, 4 CPUs and 4 GPUs):

$ ml foss VMD
$ srun --x11=first -n1 -c4 --gres gpu:4 vmd

Have questions? Feel free to contact Research Computing Support.