AMD Radeon Instinct MI100 is faster than Nvidia A100 in FP32 computation

Official AMD slides have leaked, revealing that the AMD Radeon Instinct MI100, based on the 7nm CDNA architecture, is faster than Nvidia's top-of-the-range GPU, the A100 (Ampere), in FP32 computation.

The AMD Radeon Instinct MI100 is expected to arrive during the second half of this year with no fewer than 8192 Stream Processors and a TDP of 300W. That is not bad considering that a Radeon RX Vega 64 or a Radeon Instinct MI60 has a TDP of 300W for 4096 Stream Processors. The improvement comes from a new, more efficient graphics architecture together with the 7nm manufacturing process.

The leak revolves around a 1U rack server equipped with 2x AMD EPYC Rome (Zen 2) or Milan (Zen 3) CPUs and 4x AMD Radeon Instinct MI100 GPUs, which together deliver an FP32 (SGEMM) performance of 136 TFLOPs, an average of 34 TFLOPs of FP32 per GPU. The slides also list 128 GB of HBM2E memory (32 GB per GPU) with a bandwidth of 4.9 TB/s (about 1.22 TB/s per GPU).

There is also a 3U rack configuration, which differs in taking up to 8x AMD Radeon Instinct MI100 GPUs. It reaches 272 TFLOPs of FP32 performance with 256 GB of HBM2E memory and a combined bandwidth of 9.8 TB/s, and draws about 3kW of power.
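The per-GPU figures follow from dividing the rack totals by the number of GPUs. A minimal Python sketch of that arithmetic, using only the totals quoted from the slides (the dictionary layout and the `per_gpu` helper are illustrative names, not anything from AMD):

```python
# Rack totals as quoted from the leaked slides.
# The structure and names here are illustrative assumptions.
configs = {
    "1U": {"gpus": 4, "fp32_tflops": 136.0, "hbm2e_gb": 128, "bandwidth_tbs": 4.9},
    "3U": {"gpus": 8, "fp32_tflops": 272.0, "hbm2e_gb": 256, "bandwidth_tbs": 9.8},
}


def per_gpu(cfg):
    """Divide each rack-level total evenly across the GPUs in the rack."""
    n = cfg["gpus"]
    return {
        "fp32_tflops": cfg["fp32_tflops"] / n,
        "hbm2e_gb": cfg["hbm2e_gb"] / n,
        "bandwidth_tbs": cfg["bandwidth_tbs"] / n,
    }


for name, cfg in configs.items():
    r = per_gpu(cfg)
    print(
        f"{name}: {r['fp32_tflops']:.1f} TFLOPs FP32, "
        f"{r['hbm2e_gb']:.0f} GB HBM2E, "
        f"{r['bandwidth_tbs']:.3f} TB/s per GPU"
    )
```

Both configurations work out to the same per-GPU numbers, which suggests the 3U system is simply the 1U system with twice as many of the same accelerator.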

Finally, a comparison slide indicates that the AMD Radeon Instinct MI100 offers 2.4x the FP32 performance of the Nvidia A100 while costing 30 percent less. The Nvidia A100, for its part, is 2.5x faster in FP64 performance at an extra cost of 15 percent.

AMD Radeon Instinct Accelerators 2020

| Accelerator Name | AMD Radeon Instinct MI6 | AMD Radeon Instinct MI8 | AMD Radeon Instinct MI25 | AMD Radeon Instinct MI50 | AMD Radeon Instinct MI60 | AMD Radeon Instinct MI100 |
|---|---|---|---|---|---|---|
| GPU Architecture | Polaris 10 | Fiji XT | Vega 10 | Vega 20 | Vega 20 | Arcturus |
| GPU Process Node | 14nm FinFET | 28nm | 14nm FinFET | 7nm FinFET | 7nm FinFET | 7nm FinFET |
| GPU Cores | 2304 | 4096 | 4096 | 3840 | 4096 | 8192? |
| GPU Clock Speed | 1237 MHz | 1000 MHz | 1500 MHz | 1725 MHz | 1800 MHz | 1334 MHz? |
| FP16 Compute | 5.7 TFLOPs | 8.2 TFLOPs | 24.6 TFLOPs | 26.5 TFLOPs | 29.5 TFLOPs | ~50 TFLOPs |
| FP32 Compute | 5.7 TFLOPs | 8.2 TFLOPs | 12.3 TFLOPs | 13.3 TFLOPs | 14.7 TFLOPs | ~25 TFLOPs |
| FP64 Compute | 384 GFLOPs | 512 GFLOPs | 768 GFLOPs | 6.6 TFLOPs | 7.4 TFLOPs | ~12.5 TFLOPs |
| Memory Clock | 1750 MHz | 500 MHz | 945 MHz | 1000 MHz | 1000 MHz | TBD |
| Memory Bus | 256-bit bus | 4096-bit bus | 2048-bit bus | 4096-bit bus | 4096-bit bus | 4096-bit bus |
| Memory Bandwidth | 224 GB/s | 512 GB/s | 484 GB/s | 1 TB/s | 1 TB/s | TBD |
| Form Factor | Single Slot, Full Length | Dual Slot, Half Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length | Dual Slot, Full Length |
| Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling | Passive Cooling? |
| TDP | | | | | | ~200W (Test Board) |
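The FP32 figures in the table can be sanity-checked against the core counts and clock speeds. The sketch below assumes the usual convention that each Stream Processor completes one fused multiply-add (2 FLOPs) per clock. That convention is an assumption on our part rather than something stated in the leak, and `peak_fp32_tflops` is a hypothetical helper name.

```python
def peak_fp32_tflops(stream_processors, clock_mhz):
    """Theoretical peak FP32 throughput in TFLOPs.

    Assumes 2 FLOPs (one fused multiply-add) per Stream Processor per clock.
    """
    flops = 2 * stream_processors * clock_mhz * 1e6
    return flops / 1e12


# (Stream Processors, clock in MHz) taken from the table above.
# The MI100 values are the rumored ones marked with "?".
cards = {
    "MI6": (2304, 1237),
    "MI8": (4096, 1000),
    "MI25": (4096, 1500),
    "MI50": (3840, 1725),
    "MI60": (4096, 1800),
    "MI100": (8192, 1334),
}

for name, (sps, mhz) in cards.items():
    print(f"{name}: {peak_fp32_tflops(sps, mhz):.2f} TFLOPs")
```

Under this assumption the five existing cards land on their listed FP32 values to within rounding. The rumored MI100 core count and clock give roughly 21.9 TFLOPs, somewhat below the table's ~25 TFLOPs estimate, so at least one of the figures marked "?" is probably still tentative.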

source: AdoredTV

