Supermicro doubles GPU capabilities

Tue, 6th Oct 2020

Supermicro is doubling GPU capabilities with a new 4U server supporting eight NVIDIA HGX A100 GPUs.

The company's GPU systems span 1U, 2U, 4U, and 10U GPU servers and SuperBlade servers over a wide range of customisable configurations.

"Supermicro has introduced a new 4U system, implementing an NVIDIA HGX A100 8-GPU baseboard (formerly codenamed Delta), that delivers 6x AI training performance and 7x inference workload capacity when compared to current systems," says Supermicro CEO and president Charles Liang.

"Also, the recently announced NVIDIA HGX A100 4-GPU board (formerly codenamed Redstone) is showing wide market acceptance, and we are excited by the ongoing global customer engagement. These new Supermicro systems significantly boost overall performance for accelerated workloads required for rapidly changing markets, including HPC, data analytics, deep learning training, and inference."

Leveraging Supermicro's advanced thermal design, including custom heatsinks and optional liquid cooling, the latest high-density 2U and 4U servers feature NVIDIA HGX A100 4-GPU and 8-GPU baseboards, along with a new 4U server supporting eight NVIDIA A100 PCI-E GPUs (shipping today).

The Advanced I/O Module (AIOM) form factor further enhances networking communication with high flexibility.

The AIOM can be coupled with the latest high-speed, low-latency PCI-E 4.0 storage and networking devices that support NVIDIA GPUDirect RDMA and GPUDirect Storage with NVMe over Fabrics (NVMe-oF) on NVIDIA Mellanox InfiniBand, feeding the expandable multi-GPU system a continuous stream of data without bottlenecks.

In addition, Supermicro's Titanium Level power supplies keep the system green to realise even greater cost savings with the industry's highest efficiency rating of 96%, while allowing redundant support for the GPUs.
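To put the 96% figure in context, the sketch below estimates how much power is lost in conversion at a given load. The 3,000W load and the 90% comparison supply are hypothetical values chosen for illustration and do not come from Supermicro; the 96% rating is also treated as constant, whereas real efficiency varies with load.

```python
def wall_power_watts(load_watts: float, efficiency: float) -> float:
    """Power drawn from the wall to deliver load_watts at the given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the range (0, 1]")
    return load_watts / efficiency


def conversion_loss_watts(load_watts: float, efficiency: float) -> float:
    """Power dissipated as heat inside the power supply."""
    return wall_power_watts(load_watts, efficiency) - load_watts


LOAD_W = 3000.0          # hypothetical system load, not a Supermicro figure
TITANIUM_RATING = 0.96   # efficiency rating quoted in the article
COMPARISON = 0.90        # hypothetical less efficient supply, for contrast

for label, eff in (("96% supply", TITANIUM_RATING), ("90% supply", COMPARISON)):
    loss = conversion_loss_watts(LOAD_W, eff)
    print(f"{label}: about {loss:.0f} W lost at a {LOAD_W:.0f} W load")
```

Under these assumptions the difference between the two supplies is the gap in conversion loss, which is where the claimed cost savings would come from.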

"Supermicro systems powered by the NVIDIA A100 can quickly scale to thousands of GPUs or, using new multi-instance GPU technology, each A100 GPU can be partitioned into seven isolated GPU instances to run different jobs," says NVIDIA product management and marketing senior director Paresh Kharya.

"NVIDIA A100 Tensor Core GPUs with TensorFloat 32 provides up to 20 times more compute capacity compared to the previous generation without requiring any code changes."
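The partitioning figure Kharya cites translates into a simple upper bound on isolated GPU instances per system. The sketch below applies the seven-instances-per-A100 limit to the GPU counts of the configurations named in this article; the function and variable names are illustrative, and the results are ceilings, not a statement of how any given deployment would be partitioned.

```python
# Maximum isolated instances per A100, per the statement quoted above.
MIG_INSTANCES_PER_A100 = 7


def max_mig_instances(gpu_count: int, per_gpu: int = MIG_INSTANCES_PER_A100) -> int:
    """Upper bound on isolated GPU instances for a system with gpu_count A100s."""
    if gpu_count < 0:
        raise ValueError("gpu_count must be non-negative")
    return gpu_count * per_gpu


# GPU counts taken from the configurations described in this article.
systems = {
    "2U HGX A100 4-GPU": 4,
    "4U HGX A100 8-GPU": 8,
    "8U SuperBlade with 20 A100 PCI-E GPUs": 20,
}

for name, gpus in systems.items():
    print(f"{name}: up to {max_mig_instances(gpus)} isolated GPU instances")
```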

2U design for HGX A100 4-GPU

This 2U system features the NVIDIA HGX A100 4-GPU baseboard with Supermicro's advanced thermal heatsink design to maintain the optimum system temperature under full load, all in a compact form factor.

The system enables high-speed GPU peer-to-peer communication via NVIDIA NVLink and offers up to 8TB of DDR4 3200MHz system memory, five PCI-E 4.0 I/O slots supporting GPUDirect RDMA, and four hot-swappable NVMe drives with GPUDirect Storage capability.

4U design with HGX A100 8-GPU

The new 4U GPU system features the NVIDIA HGX A100 8-GPU baseboard, up to six NVMe U.2 and two NVMe M.2 drives, and ten PCI-E 4.0 x16 slots. Supermicro's AIOM support speeds up 8-GPU communication and data flow between systems through the latest technology stacks, such as GPUDirect RDMA, GPUDirect Storage, and NVMe-oF on InfiniBand.

The system uses NVIDIA NVLink and NVSwitch technology and is ideal for large-scale deep learning training, neural network model applications for research or national laboratories, supercomputing clusters, and HPC cloud services.

8U SuperBlade with 20 A100 PCI-E GPUs

The high-density GPU blade server can support up to 20 nodes and 40 GPUs with two single-width GPUs per node, or one NVIDIA Tensor Core A100 PCI-E GPU per node, in Supermicro's 8U SuperBlade enclosure.

Packing 20 NVIDIA A100 GPUs into 8U raises the density of computing power in a smaller footprint, allowing customers to save on TCO.
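The density claim can be checked with simple arithmetic: GPUs divided by rack units. The sketch below uses only the chassis heights and GPU counts stated in this article; the `Chassis` class and its field names are illustrative, and the comparison ignores differences between GPU models and form factors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chassis:
    name: str
    rack_units: int
    gpus: int

    @property
    def gpus_per_u(self) -> float:
        """GPU density in GPUs per rack unit."""
        return self.gpus / self.rack_units


NODES_PER_SUPERBLADE = 20  # node count stated in the article

configs = [
    Chassis("2U HGX A100 4-GPU", 2, 4),
    Chassis("4U HGX A100 8-GPU", 4, 8),
    Chassis("8U SuperBlade, one A100 per node", 8, NODES_PER_SUPERBLADE * 1),
    Chassis("8U SuperBlade, two single-width GPUs per node", 8, NODES_PER_SUPERBLADE * 2),
]

for c in configs:
    print(f"{c.name}: {c.gpus} GPUs in {c.rack_units}U = {c.gpus_per_u:.1f} GPUs per U")
```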

The SuperBlade provides a 100% non-blocking HDR 200Gb/s InfiniBand networking infrastructure to accelerate deep learning and enable real-time analysis and decision making.

High density, reliability, and upgradeability make the SuperBlade a perfect building block for enterprise applications to deliver AI-powered services.

The 1U GPU systems contain up to four NVIDIA GPUs with NVLink, including the NEBS Level 3 certified, 5G/Edge-ready SYS-1029GQ.

Supermicro's 2U GPU systems, such as the SYS-2029GP-TR, can support up to six NVIDIA V100 GPUs with dual PCI-E root complex capability in one system.

The 10U GPU servers, such as the SYS-9029GP-TNVRT, support 16 V100 SXM3 GPUs with dual Intel Xeon Scalable processors with built-in AI acceleration.

The flexible range of solutions powered by NVIDIA GPUs and GPU software from the NVIDIA NGC ecosystem provides the right building blocks for diverse tasks for organisations addressing various verticals – from AI inferencing on developed models to HPC to high-end training.