Technical Specifications

Cray CS-Storm Node

The Cray CS-Storm is built to meet the most demanding compute requirements at production scale, while also delivering a lower total cost of ownership for accelerator-based research workloads.

The XStream GPU cluster consists of:

  • 65 x Cray CS-Storm 2626X8N compute nodes
  • Mellanox FDR InfiniBand interconnect (56 Gb/s links)
  • Redundant management servers with storage
  • Redundant sub-management servers for diskless deployment
  • Cray Sonexion® scale-out Lustre IB FDR storage system

Each 2626X8N compute node contains:

  • 2 x Intel® Xeon® CPU E5-2680 v2 @ 2.80GHz
  • 256 GB of RAM
  • 8 x NVIDIA Tesla K80 (16 logical GPUs)
  • 3 x SSDs
  • 1 x FDR InfiniBand (56 Gb/s)

GPUs

Overview

The XStream GPU cluster comprises 65 compute nodes, for a total of 520 NVIDIA Tesla K80 cards (1040 logical GPUs).

The table below summarizes the Tesla K80 GPU card specifications.

Feature                        Tesla K80
GPU                            2 x Kepler GK210
Peak double-precision FLOPS    2.91 Tflops (GPU Boost Clocks), 1.87 Tflops (Base Clocks)
Peak single-precision FLOPS    8.74 Tflops (GPU Boost Clocks), 5.6 Tflops (Base Clocks)
Memory bandwidth (ECC off)     480 GB/sec (240 GB/sec per GPU)
Memory size (GDDR5)            24 GB (12 GB per GPU)
CUDA cores                     4992 (2496 per GPU)
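
Because each K80 card exposes two logical GPUs, CUDA enumerates 16 devices per compute node. The following is a minimal CUDA sketch (not cluster-provided tooling) that lists each logical GPU with its memory size and an estimated CUDA core count; the figure of 192 cores per SM is a property of the Kepler GK210 architecture.

    // list_gpus.cu -- enumerate the logical GPUs visible on a compute node.
    // Build with: nvcc list_gpus.cu -o list_gpus
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        cudaGetDeviceCount(&count);              // expected: 16 on a CS-Storm node
        printf("Logical GPUs found: %d\n", count);

        for (int dev = 0; dev < count; ++dev) {
            cudaDeviceProp prop;
            cudaGetDeviceProperties(&prop, dev);
            // Kepler SMX units contain 192 CUDA cores (GK210: 13 SMs x 192 = 2496).
            int cores = prop.multiProcessorCount * 192;
            printf("GPU %2d: %s, %.1f GB, %d SMs, %d CUDA cores\n",
                   dev, prop.name,
                   prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0),
                   prop.multiProcessorCount, cores);
        }
        return 0;
    }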

Compute node GPU architecture

Within each compute node, each CPU socket (PCIe root) is connected through PLX PCIe switches to 4 K80 cards (8 logical GPUs).

This gives two PCIe domains, one per CPU socket. You can run the lstopo command on a compute node (e.g. xs-0001) for full details of the PCI bus.
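
If you prefer to query the topology from code, the CUDA runtime can report each logical GPU's PCI bus ID, which makes the two domains visible (devices attached to the same socket appear under neighbouring bus numbers). A minimal sketch, assuming only the standard CUDA runtime API:

    // gpu_busid.cu -- print the PCI bus ID of every logical GPU.
    // Build with: nvcc gpu_busid.cu -o gpu_busid
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int count = 0;
        cudaGetDeviceCount(&count);
        for (int dev = 0; dev < count; ++dev) {
            char busId[32];
            // Returns a string such as "0000:08:00.0".
            cudaDeviceGetPCIBusId(busId, sizeof(busId), dev);
            printf("GPU %2d -> %s\n", dev, busId);
        }
        return 0;
    }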

If you plan on doing GPU peer-to-peer communication, the nvidia-smi command on a compute node will show you the GPUDirect topology matrix for the system:

$ nvidia-smi topo -m

PIX (a connection through a single PCIe switch) gives the best latency; SOC (a connection crossing the QPI link between CPU sockets) gives the worst.
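
As a hedged illustration (standard CUDA runtime calls, not site-specific tooling), a program can check whether a pair of logical GPUs supports peer-to-peer access before enabling it; on this hardware the check typically succeeds for GPUs under the same PCIe root and fails across the SOC path:

    // p2p_check.cu -- check and enable peer access between two logical GPUs.
    // Build with: nvcc p2p_check.cu -o p2p_check
    #include <cstdio>
    #include <cuda_runtime.h>

    int main() {
        int devA = 0, devB = 1;   // pick a pair based on nvidia-smi topo -m
        int canAccess = 0;
        cudaDeviceCanAccessPeer(&canAccess, devA, devB);
        printf("GPU %d -> GPU %d peer access: %s\n",
               devA, devB, canAccess ? "yes" : "no");

        if (canAccess) {
            cudaSetDevice(devA);
            // Allows kernels running on devA to read/write devB's memory directly.
            cudaDeviceEnablePeerAccess(devB, 0);
        }
        return 0;
    }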

Storage

Lustre

XStream compute nodes are connected to a large private Lustre storage system with fast I/O. Accessible through the $WORK and $GROUP_WORK environment variables, this parallel filesystem is provided by a Cray Sonexion appliance. This power-efficient HPC storage system consists of:

  • 492 x 4 TB SAS hard drives
  • 48 x Lustre Object Storage Targets (OSTs), 32 TB each
  • 6 x embedded Lustre Object Storage Servers (OSSes)

The system provides more than 22 GB/s of sustained Lustre bandwidth over the InfiniBand interconnect and about 1.4 PB of usable space. This storage is neither replicated nor backed up.
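
Rather than hard-coding paths, jobs can resolve the Lustre work directory from the environment. The sketch below is plain host code (compiles with nvcc or g++) and uses a hypothetical results/output.dat file name purely for illustration:

    // work_path.cu -- resolve the Lustre work directory from the environment.
    // Build with: nvcc work_path.cu -o work_path
    #include <cstdio>
    #include <cstdlib>
    #include <string>

    int main() {
        const char* work = std::getenv("WORK");   // $WORK points at the Lustre filesystem
        if (work == nullptr) {
            fprintf(stderr, "WORK is not set\n");
            return 1;
        }
        // Hypothetical output location under the work directory.
        std::string output = std::string(work) + "/results/output.dat";
        printf("Results would be written to %s\n", output.c_str());
        return 0;
    }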


Have questions? Feel free to contact Research Computing Support.