NVIDIA DGX H100

The ultimate AI infrastructure system

 

A new era of performance with NVIDIA H100

The fourth-generation DGX AI appliance is built around the new NVIDIA Hopper architecture, providing unprecedented performance in a single system and massive scalability through the DGX POD and SuperPOD enterprise-scale infrastructures. The DGX H100 features eight H100 Tensor Core GPUs, each with 80GB of memory, delivering up to 6x more performance than the previous generation of DGX appliances, and is supported by a wide range of NVIDIA AI software applications and expert support.

  • 8x NVIDIA H100 GPUs WITH 640 GIGABYTES OF TOTAL GPU MEMORY
    18x NVIDIA® NVLink® connections per GPU, 900 gigabytes per second of GPU-to-GPU bidirectional bandwidth

  • 4x NVIDIA NVSWITCHES™
    7.2 terabytes per second of bidirectional GPU-to-GPU bandwidth, 1.5X more than previous generation

  • 8x NVIDIA CONNECTX®-7 AND 2x NVIDIA BLUEFIELD® DPUs, 400 GIGABITS-PER-SECOND NETWORK INTERFACES
    1 terabyte per second of peak bidirectional network bandwidth

  • DUAL x86 CPUs AND 2 TERABYTES OF SYSTEM MEMORY
    Powerful CPUs for the most intensive AI jobs

  • 30 TERABYTES NVME SSD
    High-speed storage for maximum performance

[Chart: AI Training]

[Chart: AI Inference]

The Transformer Engine uses a combination of software and specially designed Hopper hardware to accelerate the training and inference of transformer models, such as the BERT and GPT-3 language models. The Transformer Engine intelligently manages and dynamically switches between FP8 and FP16 calculations, automatically handling re-casting and scaling between the two levels of precision, speeding up large language models compared to the previous-generation Ampere architecture.
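To make the mechanism concrete, below is a minimal sketch of enabling FP8 execution through the Transformer Engine's Python API (transformer_engine.pytorch); the layer size and scaling-recipe settings are illustrative assumptions rather than a tuned configuration:

```python
# A minimal sketch, assuming transformer_engine is installed on a
# Hopper (or newer) system; sizes here are arbitrary illustrations.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Any te module (te.Linear, te.LayerNormMLP, ...) can run its
# matrix maths in FP8 while parameters stay in higher precision.
layer = te.Linear(1024, 1024, bias=True).cuda()

# Delayed scaling: the engine tracks per-tensor amax history and
# re-scales automatically when casting between FP8 and FP16/FP32.
fp8_recipe = recipe.DelayedScaling(
    margin=0,
    fp8_format=recipe.Format.HYBRID,  # E4M3 forward, E5M2 backward
)

x = torch.randn(16, 1024, device="cuda")

# Inside fp8_autocast, supported GEMMs execute in FP8 on the
# Tensor Cores; unsupported ops fall back to higher precision.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)

y.sum().backward()  # the backward pass also uses the FP8 recipe
```

The re-casting and scaling described above happen inside the autocast context; user code keeps working in ordinary PyTorch tensors.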

The H100 Tensor Core GPUs in the DGX H100 feature fourth-generation NVLink, which provides 900GB/s of bidirectional bandwidth between GPUs, over 7x the bandwidth of PCIe 5.0 (a x16 PCIe 5.0 link peaks at around 128GB/s bidirectional).


Building on the capabilities of NVLink and NVSwitch within the DGX H100, the new NVLink Switch System enables scaling of up to 32 DGX H100 appliances in a SuperPOD cluster, with up to 57.6TB/s of aggregate bandwidth.

Previous-generation GPU accelerators did not support confidential computing; data was encrypted only at rest in storage or in transit across the network. Hopper is the first GPU architecture to include support for confidential computing, securing data from unauthorised access while it is being processed within the DGX H100. NVIDIA Confidential Computing provides hardware-based isolation whether for multiple MIG instances sharing an H100 GPU, for a single-user H100 GPU, or between multiple H100 GPUs.


Multi-Instance GPU (MIG) expands the performance and value of each NVIDIA H100 GPU. MIG can partition the H100 GPU into as many as seven instances, each fully isolated with its own high-bandwidth memory, cache, and compute cores. Administrators can now support every workload, from the smallest to the largest, offering a right-sized GPU with guaranteed quality of service (QoS) for every job, optimising utilisation and extending the reach of accelerated computing resources to every user.
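For a sense of how this looks from software, the sketch below uses the nvidia-ml-py (pynvml) NVML bindings to check MIG mode and list the MIG devices carved out of GPU 0; it assumes an administrator has already enabled MIG and created the instances (for example with nvidia-smi):

```python
# Minimal sketch using the nvidia-ml-py (pynvml) NVML bindings.
# Assumes MIG mode is already enabled and instances already exist.
import pynvml

pynvml.nvmlInit()
gpu = pynvml.nvmlDeviceGetHandleByIndex(0)

current, pending = pynvml.nvmlDeviceGetMigMode(gpu)
print("MIG enabled:", current == pynvml.NVML_DEVICE_MIG_ENABLE)

# Walk the (up to seven) MIG devices carved out of this H100.
for i in range(pynvml.nvmlDeviceGetMaxMigDeviceCount(gpu)):
    try:
        mig = pynvml.nvmlDeviceGetMigDeviceHandleByIndex(gpu, i)
    except pynvml.NVMLError:
        continue  # this slot is not populated
    mem = pynvml.nvmlDeviceGetMemoryInfo(mig)
    print(f"MIG device {i}: {pynvml.nvmlDeviceGetName(mig)}, "
          f"{mem.total / 1024**3:.0f} GiB dedicated memory")

pynvml.nvmlShutdown()
```

Each enumerated MIG device behaves like an independent GPU with its own dedicated memory, which is what allows the per-instance QoS guarantees described above.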

Expand GPU access to more users

With MIG, you can gain access to up to 7x more GPU resources from a single H100 GPU, giving researchers and developers more resources and flexibility than ever before.

Optimise GPU utilisation

MIG provides the flexibility to choose from many different instance sizes, allowing a right-sized GPU instance to be provisioned for each workload, ultimately delivering optimal utilisation and maximising data centre investment.

Run simultaneous mixed workloads

MIG enables inference, training, and high-performance computing (HPC) workloads to run at the same time on a single GPU with deterministic latency and throughput.

  • Up to 7 GPU instances in a single H100
    Dedicated SMs, memory, L2 cache and bandwidth for hardware QoS and isolation

  • Simultaneous workload execution with guaranteed quality of service
    All MIG instances run in parallel with predictable throughput and latency

  • Right-sized GPU allocation
    Different sized MIG instances based on target workloads

  • Flexibility
    To run any type of workload on a MIG instance

  • Diverse deployment environments
    Supported on bare metal, Docker, Kubernetes and virtualised environments

  • Confidential computing
    Hardware-based isolation of individual MIG instances


Dynamic programming is a popular programming technique that breaks complex problems down into simpler overlapping subproblems, solving them using recursion and memoization. Traditionally these tasks were run on CPUs or FPGAs, but the Hopper architecture introduces new DPX instructions, enabling the GPU to offload these computationally intensive algorithms and boosting performance by up to 7x.
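To ground the terminology, here is a minimal, purely illustrative Python sketch of the pattern DPX accelerates: the Levenshtein edit distance (a staple of genomics and text alignment) expressed as recursion over subproblems with memoization, so each subproblem is computed only once:

```python
from functools import lru_cache

def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via recursion + memoization (dynamic programming)."""

    @lru_cache(maxsize=None)  # memoization: each (i, j) subproblem solved once
    def d(i: int, j: int) -> int:
        if i == 0:
            return j  # insert the remaining j characters of b
        if j == 0:
            return i  # delete the remaining i characters of a
        cost = 0 if a[i - 1] == b[j - 1] else 1
        return min(
            d(i - 1, j) + 1,         # deletion
            d(i, j - 1) + 1,         # insertion
            d(i - 1, j - 1) + cost,  # substitution (or match)
        )

    return d(len(a), len(b))

print(edit_distance("kitten", "sitting"))  # prints 3
```

On Hopper, DPX instructions accelerate exactly this kind of inner min/add recurrence in hardware rather than in Python.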

GPU Cloud

NVIDIA GPU Cloud (NGC) provides researchers and data scientists with simple access to a comprehensive catalogue of GPU-optimised software tools for deep learning and high-performance computing (HPC) that take full advantage of NVIDIA GPUs. The NGC container registry features NVIDIA-tuned, tested, certified and maintained containers for the top deep learning frameworks. It also offers third-party managed HPC application containers, NVIDIA HPC visualisation containers, and partner applications.

Find out more
DGX H100 POD

As an end-to-end AI solution provider, Scan can deliver complete AI clusters featuring NVIDIA DGX AI appliances, certified storage platforms and networking, in the form of DGX BasePOD and SuperPOD. These NVIDIA reference architectures push performance even further by adding NVIDIA Base Command orchestration software or NVIDIA Unified Fabric Manager respectively.

Find out more
AI Storage

Deep learning appliances such as the DGX H100 only work as intended if the GPU accelerators are fed data consistently and rapidly enough to sustain maximum utilisation. Scan offers a wide range of AI-optimised storage appliances suitable for deployment with the DGX H100.

Find out more
Run:ai

Run:ai Atlas combines GPU resources into a virtual pool and enables workloads to be scheduled by user or project across the available resources. By pooling resources and applying an advanced scheduling mechanism to data science workflows, Run:ai greatly increases effective utilisation of all available GPUs, making a fixed amount of hardware go much further. Data scientists can increase the number of experiments they run, speed time to results and ultimately meet the business goals of their AI initiatives.

Find out more

Protect your Deep Learning Investment

NVIDIA DGX systems are cutting-edge hardware solutions designed to accelerate your deep learning and AI workloads and projects. Keeping your system or systems in optimum condition is key to consistently achieving the rapid results you need. Each DGX appliance is available with a range of comprehensive support contracts covering both software updates and hardware components, coupled with a choice of media retention packages to further protect any sensitive data held in your DGX system's memory or SSDs.

Learn more
NVIDIA DGX H100
GPUs: 8x NVIDIA H100 Tensor Core GPUs
GPU Specifications: 16,896 CUDA cores and 528 fourth-generation Tensor Cores per GPU
GPU Memory: 80GB per GPU, 640GB total
Host CPUs: 2x Intel Xeon Platinum 8480C, total 112 cores / 224 threads
System Memory: 2TB ECC Reg DDR5
System Drives: 2x 1.92TB NVMe SSDs
Storage Drives: 8x 3.84TB NVMe SSDs
Networking: 8x single-port NVIDIA ConnectX-7 400Gb/s InfiniBand/Ethernet; 2x dual-port NVIDIA ConnectX-7 adapters, each with 2x 400Gb/s InfiniBand/Ethernet
Operating System: DGX OS / Ubuntu Linux / Red Hat Enterprise Linux
Power Requirement: 10.2kW
Size: 8U
Weight: 130kg
Operating Temperature Range: 5°C to 30°C (41°F to 86°F)