24.6 TFLOPS FP16 or 12.3 TFLOPS FP32 peak GPU compute performance.
With 24.6 TFLOPS FP16 or 12.3 TFLOPS FP32 of peak GPU compute performance on a single board, the Radeon Instinct MI25 server accelerator delivers single-precision performance leadership for compute-intensive machine intelligence and deep learning training applications.1 The MI25 provides a powerful solution for the most parallel HPC workloads, and also delivers 768 GFLOPS of peak double-precision (FP64) performance at 1/16th the FP32 rate.
16GB ultra high-bandwidth HBM2 ECC GPU memory.
With 2X the data rate of previous generations on a 512-bit memory interface, a next-generation High Bandwidth Cache and controller, and ECC memory reliability, the Radeon Instinct MI25’s 16GB of HBM2 GPU memory provides a professional-level accelerator solution capable of handling the most demanding data-intensive machine intelligence and deep learning training applications.3
Unmatched Half- and Single-Precision Floating-Point Performance
Up to 82 GFLOPS/watt FP16 or 41 GFLOPS/watt FP32 peak GPU compute performance.
With up to 82 GFLOPS/watt FP16 or 41 GFLOPS/watt FP32 of peak GPU compute performance, the Radeon Instinct MI25 server accelerator delivers unmatched performance per watt for machine intelligence and deep learning training applications in the datacenter, where performance and efficient power usage are crucial to ROI. The MI25 also provides 2.5 GFLOPS/watt of peak FP64 performance.
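The performance-per-watt figures quoted above follow directly from the board's 300W TDP and the peak-compute numbers quoted elsewhere in this brief; a quick arithmetic check:

```python
# Performance-per-watt check for the MI25: the peak TFLOPS figures and the
# 300 W board TDP are taken from this brief; everything else is arithmetic.
TDP_WATTS = 300
FP16_PEAK_GFLOPS = 24_600   # 24.6 TFLOPS peak half precision
FP32_PEAK_GFLOPS = 12_300   # 12.3 TFLOPS peak single precision
FP64_PEAK_GFLOPS = 768      # FP64 runs at 1/16th of the FP32 rate

print(FP16_PEAK_GFLOPS / TDP_WATTS)   # 82.0 GFLOPS/watt FP16
print(FP32_PEAK_GFLOPS / TDP_WATTS)   # 41.0 GFLOPS/watt FP32
print(FP64_PEAK_GFLOPS / TDP_WATTS)   # 2.56 GFLOPS/watt FP64 (quoted as 2.5)
```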
64 compute units, each with 64 stream processors.
The Radeon Instinct™ MI25 server accelerator has 64 compute units, each consisting of 64 stream processors, for a total of 4,096 stream processors. It is based on the next-generation “Vega” architecture, whose newly designed compute engine is built on flexible next-generation compute units (nCUs) that allow 16-bit, 32-bit, and 64-bit processing at higher frequencies to supercharge today’s emerging dynamic workloads. The Radeon Instinct MI25 provides superior single-precision performance and flexibility for the most demanding compute-intensive parallel machine intelligence and deep learning applications in an efficient package.
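The peak-compute figures follow from this shader configuration. A minimal sketch of the derivation, noting that the ~1.5 GHz peak engine clock and the packed-FP16 doubling are assumptions not stated in this brief:

```python
# Sketch of where the MI25's peak-compute figures come from. The compute
# unit and stream processor counts are from the brief; the ~1.5 GHz peak
# engine clock is an assumption, as is the 2x packed-FP16 rate.
COMPUTE_UNITS = 64
SPS_PER_CU = 64
PEAK_CLOCK_HZ = 1.5e9     # assumed peak engine clock (~1.5 GHz)
FLOPS_PER_CLOCK = 2       # one fused multiply-add = 2 FLOPs per clock

stream_processors = COMPUTE_UNITS * SPS_PER_CU                  # 4,096
fp32_peak = stream_processors * FLOPS_PER_CLOCK * PEAK_CLOCK_HZ
fp16_peak = 2 * fp32_peak   # packed FP16 runs at twice the FP32 rate

print(f"{fp32_peak / 1e12:.1f} TFLOPS FP32")   # 12.3 TFLOPS FP32
print(f"{fp16_peak / 1e12:.1f} TFLOPS FP16")   # 24.6 TFLOPS FP16
```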
Built on AMD’s Next-Generation “Vega” Architecture with the World’s Most Advanced GPU Memory
- Passively cooled GPU server accelerator based on the next-generation “Vega” architecture using a 14nm FinFET process. The Radeon Instinct MI25 server accelerator is a professional-grade accelerator designed for compute density and optimized for datacenter server deployments. It is the ideal solution for single-precision compute-intensive training applications in machine intelligence and deep learning, and for other HPC-class workloads where performance per watt is important.
- 300W TDP board power, full-height, dual-slot, 10.5” PCIe® Gen 3 x16 GPU server card. The Radeon Instinct MI25 server PCIe® Gen 3 x16 GPU card is a full-height, dual-slot card designed to fit in most standard server designs, providing a performance-driven server solution for heterogeneous machine intelligence and deep learning training and HPC-class system deployments.
- Ultra high-bandwidth HBM2 ECC memory with up to 484 GB/s of memory bandwidth. The Radeon Instinct MI25 server accelerator is designed with 16GB of the latest high-bandwidth HBM2 memory to efficiently handle the larger data set requirements of the most demanding machine intelligence and deep learning neural network training systems. The MI25 accelerator’s 16GB of ECC HBM2 memory also makes it an ideal solution for data-intensive HPC-class workloads.
- MxGPU SR-IOV Hardware Virtualization. The Radeon Instinct MI25 server accelerator is designed with support for AMD’s MxGPU SR-IOV hardware virtualization technology to drive greater utilization and capacity in the data center.
- Updated Remote Manageability Capabilities. The Radeon Instinct MI25 accelerator has advanced out-of-band manageability circuitry for simplified GPU monitoring in large-scale systems. The MI25’s manageability capabilities provide accessibility via I2C, regardless of what state the GPU is in, enabling advanced monitoring of a range of static and dynamic GPU information (board part detail, serial numbers, GPU temperature, power, and more) using PMCI-compliant data structures.
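One practical way to read the 484 GB/s memory bandwidth against the 12.3 TFLOPS FP32 peak is as a roofline "ridge point". This ratio is derived arithmetic, not a figure quoted in this brief:

```python
# Rough roofline "ridge point" for the MI25, derived (not quoted) from the
# brief's 12.3 TFLOPS FP32 peak and 484 GB/s peak HBM2 bandwidth. Kernels
# whose arithmetic intensity falls below this ratio are bandwidth-bound;
# above it, they are compute-bound.
FP32_PEAK_FLOPS = 12.3e12   # FLOPs per second
HBM2_BANDWIDTH = 484e9      # bytes per second

ridge_point = FP32_PEAK_FLOPS / HBM2_BANDWIDTH
print(f"{ridge_point:.1f} FLOPs per byte")
```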
Machine Intelligence & Deep Learning Neural Network Training
Training techniques used today on neural networks in machine intelligence and deep learning applications in data centers have become very complex and require the handling of massive amounts of data when training those networks to recognize patterns within that data. This requires large amounts of floating-point computation spread across many cores, and traditional CPUs cannot handle this type of computation as efficiently as GPUs. What can take CPUs weeks to compute can be handled in days with GPUs. The Radeon Instinct MI25, combined with AMD’s new Epyc server processors and our ROCm open software platform, delivers superior performance for machine intelligence and deep learning applications.
The MI25’s 24.6 TFLOPS of native half-precision (FP16) or 12.3 TFLOPS of single-precision (FP32) peak floating-point performance across 4,096 stream processors, combined with its advanced High Bandwidth Cache (HBC) and controller and 16GB of high-bandwidth HBM2 memory, brings customers a new level of computing capable of meeting today’s demanding system requirements for efficiently handling the large data sets used to train complex neural networks in deep learning.1 The MI25 accelerator, based on AMD’s next-generation “Vega” architecture with the world’s most advanced memory architecture, is optimized for handling large sets of data and delivers vast improvements in throughput-per-clock over previous generations: up to 82 GFLOPS per watt FP16 or 41 GFLOPS per watt FP32 of peak GPU compute performance, for outstanding performance per watt in machine intelligence and deep learning training deployments in the data center, where performance and efficiency are mandatory.
Benefits for Machine Intelligence & Deep Learning Neural Network Training:
- Unmatched FP16 and FP32 Floating-Point Performance
- Open Software ROCm Platform for HPC-Class Rack Scale
- Optimized MIOpen Deep Learning Framework Libraries
- Large BAR Support for mGPU peer to peer
- Configuration advantages with Epyc server processors
- Superior compute density and performance per node when combining new AMD Epyc™ processor-based servers and Radeon Instinct “Vega”-based products
- MxGPU SR-IOV hardware virtualization, enabling greater utilization and capacity in the data center
HPC Heterogeneous Compute
The HPC industry is creating immense amounts of unstructured data each year, and a portion of HPC system configurations is being reshaped to enable the community to extract useful information from that data. Traditionally, these systems were predominantly CPU based, but with the explosive growth in the amount and different types of data being created, along with the evolution of more complex codes, these traditional systems don’t meet all the requirements of today’s data-intensive HPC workloads. As these codes have become more complex and parallel, there has been a growing use of heterogeneous computing systems with different mixes of accelerators, including discrete GPUs and FPGAs. The advancements of GPU capabilities over the last decade have allowed them to be used for a growing number of these parallel codes, like the ones being used for training neural networks for deep learning. Scientists and researchers across the globe are now using accelerators to more efficiently process HPC parallel codes across several industries, including life sciences, energy, financial services, automotive and aerospace, academia, government, and defense.
The Radeon Instinct MI25, combined with AMD’s new “Zen”-based Epyc server CPUs and our revolutionary ROCm open software platform, provides a progressive approach to open heterogeneous compute from the metal forward. AMD’s next-generation HPC solutions are designed to deliver maximum compute density and performance per node with the efficiency required to handle today’s massively parallel data-intensive codes, as well as to provide a powerful, flexible solution for general-purpose HPC deployments. The ROCm software platform brings a scalable HPC-class solution that provides fully open-source Linux drivers, HCC compilers, tools, and libraries to give scientists and researchers system control down to the metal. The Radeon Instinct open ecosystem approach supports various architectures including x86, Power8, and ARM, along with industry-standard interconnect technologies, providing customers with the ability to design optimized HPC systems for a new era of heterogeneous compute that embraces the HPC community’s open approach to scientific advancement.
Key Benefits for HPC Heterogeneous Compute:
- Outstanding Compute Density and Performance Per Node
- Open Software ROCm Platform for HPC Class Rack Scale
- Open Source Linux Drivers, HCC Compiler, Tools and Libraries from the Metal Forward
- Open Industry-Standard Support of Multiple Architectures and Industry-Standard Interconnect Technologies