Accelerating the Most Important Work of Our Time
The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration at every scale for AI, data analytics, and high-performance computing (HPC) to tackle the world’s toughest computing challenges. As the engine of the NVIDIA data center platform, A100 can efficiently scale to thousands of GPUs or, with NVIDIA Multi-Instance GPU (MIG) technology, be partitioned into seven GPU instances to accelerate workloads of all sizes. And third-generation Tensor Cores accelerate every precision for diverse workloads, speeding time to insight and time to market.
The A100 PCIe is a professional graphics card by NVIDIA, launched on June 22nd, 2020. Built on the 7 nm process and based on the GA100 graphics processor, the card does not support DirectX, so it cannot run games that require DirectX 11 or DirectX 12. The GA100 is a large chip with a die area of 826 mm² and 54.2 billion transistors. It features 6,912 shading units, 432 texture mapping units, and 160 ROPs. Also included are 432 tensor cores, which accelerate machine learning applications. NVIDIA has paired 40 GB of HBM2e memory with the A100 PCIe, connected over a 5,120-bit memory interface. The GPU operates at a base frequency of 765 MHz, which can boost up to 1,410 MHz; the memory runs at 1,215 MHz.
Being a dual-slot card, the NVIDIA A100 PCIe draws power from a single 8-pin EPS connector, with power draw rated at 250 W maximum. The device has no display outputs, as it is not designed to drive monitors. The A100 PCIe connects to the rest of the system over a PCI-Express 4.0 x16 interface. The card measures 267 mm in length and uses a dual-slot cooling solution.
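The headline throughput and bandwidth figures follow directly from the unit counts and clocks above. As a sanity check, the sketch below recomputes peak FP32 throughput (one fused multiply-add, i.e. 2 FLOPs, per CUDA core per clock at boost), dense FP16 Tensor Core throughput (assuming 512 FLOPs per tensor core per clock, a figure commonly cited for GA100 but not stated in this document), and memory bandwidth from the 5,120-bit bus at a double-data-rate 1,215 MHz:

```python
# Recompute A100 PCIe peak figures from the unit counts and clocks
# quoted above. The per-clock FLOP rates are assumptions based on
# common GA100 descriptions, not taken from this spec sheet.

cuda_cores = 6912
tensor_cores = 432
boost_clock_hz = 1410e6
mem_clock_hz = 1215e6    # HBM transfers on both clock edges (DDR)
bus_width_bits = 5120

# FP32: one fused multiply-add (2 FLOPs) per CUDA core per clock
peak_fp32_tflops = cuda_cores * 2 * boost_clock_hz / 1e12

# Dense FP16 Tensor Core: assumed 512 FLOPs per tensor core per clock
peak_fp16_tc_tflops = tensor_cores * 512 * boost_clock_hz / 1e12

# Memory bandwidth: bus width in bytes x 2 (DDR) x memory clock
bandwidth_gbs = bus_width_bits / 8 * 2 * mem_clock_hz / 1e9

print(f"Peak FP32:        {peak_fp32_tflops:.1f} TFLOPS")    # ~19.5
print(f"Peak FP16 Tensor: {peak_fp16_tc_tflops:.1f} TFLOPS") # ~312
print(f"Memory bandwidth: {bandwidth_gbs:.0f} GB/s")         # ~1555
```

All three results land on the figures quoted in the spec table below, which suggests those table entries are derived the same way.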
| | NVIDIA A100 for HGX | NVIDIA A100 for PCIe |
|---|---|---|
| Peak FP64 | 9.7 TF | 9.7 TF |
| Peak FP64 Tensor Core | 19.5 TF | 19.5 TF |
| Peak FP32 | 19.5 TF | 19.5 TF |
| Peak TF32 Tensor Core | 156 TF \| 312 TF* | 156 TF \| 312 TF* |
| Peak BFLOAT16 Tensor Core | 312 TF \| 624 TF* | 312 TF \| 624 TF* |
| Peak FP16 Tensor Core | 312 TF \| 624 TF* | 312 TF \| 624 TF* |
| Peak INT8 Tensor Core | 624 TOPS \| 1,248 TOPS* | 624 TOPS \| 1,248 TOPS* |
| Peak INT4 Tensor Core | 1,248 TOPS \| 2,496 TOPS* | 1,248 TOPS \| 2,496 TOPS* |
| GPU Memory | 40 GB | 40 GB |
| GPU Memory Bandwidth | 1,555 GB/s | 1,555 GB/s |
| Interconnect | NVIDIA NVLink 600 GB/s**, PCIe Gen4 64 GB/s | NVIDIA NVLink 600 GB/s**, PCIe Gen4 64 GB/s |
| Multi-Instance GPU | Various instance sizes with up to 7 MIGs @ 5 GB | Various instance sizes with up to 7 MIGs @ 5 GB |
| Form Factor | 4/8 SXM on NVIDIA HGX™ A100 | PCIe |
| Max TDP Power | 400 W | 250 W |
| Delivered Performance of Top Apps | 100% | 90% |
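The starred Tensor Core figures in the table are exactly twice the corresponding dense figures, which matches the 2x speedup A100's structured sparsity feature provides. A quick check over the table's value pairs:

```python
# Dense vs. starred Tensor Core peaks, as listed in the table above
# (TF for floating-point formats, TOPS for integer formats).
tensor_peaks = {
    "TF32":     (156, 312),
    "BFLOAT16": (312, 624),
    "FP16":     (312, 624),
    "INT8":     (624, 1248),
    "INT4":     (1248, 2496),
}

# Every starred (sparse) figure is exactly 2x its dense counterpart.
for name, (dense, sparse) in tensor_peaks.items():
    assert sparse == 2 * dense
    print(f"{name}: {dense} -> {sparse} (2x with sparsity)")
```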