Nvidia has released performance data for its forthcoming Hopper generation of GPUs, and the initial benchmarks are tremendous.
The metrics are based on MLPerf Inference v2.1, an industry-standard benchmark suite that measures how well systems perform inference, that is, how quickly a trained machine-learning model processes new data.
Nvidia claims its Hopper-based H100 Tensor Core GPUs delivered up to 4.5x greater performance than its previous A100 Ampere GPUs. It’s a remarkable jump in just one generation. For comparison, CPU benchmark scores often improve by only 5% to 10% from one generation to the next.
Nvidia’s performance leap comes with a caveat, however. The 4.5x boost came on just one of the six benchmarks that were run. The other five yielded improvements of two-fold or less. Still, a doubling of performance in one generation is impressive.
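Speedup multiples and percentage gains are easy to conflate, so here is a minimal Python sketch of the arithmetic behind these comparisons. The function names are hypothetical, and the compounding calculation assumes a constant gain every generation, which is a simplification for illustration rather than anything Nvidia or MLPerf reports.

```python
import math


def speedup_to_percent_gain(speedup: float) -> float:
    """Convert a speedup multiple (e.g. 4.5 for "4.5x") into a percentage
    increase over the baseline. A 2x speedup is a 100% increase."""
    return (speedup - 1.0) * 100.0


def generations_to_match(target_speedup: float, gain_per_generation: float) -> int:
    """Number of generations needed to reach target_speedup if every
    generation improves by a constant fraction (e.g. 0.10 for 10%).
    Assumes simple compounding; real product cycles are not this regular."""
    return math.ceil(math.log(target_speedup) / math.log(1.0 + gain_per_generation))


if __name__ == "__main__":
    # Percentage increase implied by the 4.5x and 2x figures in the article.
    print(speedup_to_percent_gain(4.5))
    print(speedup_to_percent_gain(2.0))
    # Generations of steady 10% or 5% gains needed to compound to 4.5x.
    print(generations_to_match(4.5, 0.10))
    print(generations_to_match(4.5, 0.05))
```

Under these assumptions, a 4.5x speedup works out to a 350% increase over the baseline, and steady 10% generational gains would need well over a dozen generations to compound to the same multiple.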
The top gains came on the BERT-Large benchmark, which measures natural-language processing performance using BERT, the AI model developed by Google and used in Google’s search engine, among other things. Nvidia says the BERT performance leap is due to Hopper’s Transformer Engine, which is specifically designed to accelerate the training of transformer models.
Ampere isn’t the only older Nvidia technology getting trounced. The company also benchmarked Jetson AGX Orin, its Ampere-based SoC for robotics and edge systems and a replacement for the Jetson AGX Xavier processor. In those tests, Orin ran up to 5x faster than Xavier while delivering an average of 2x better energy efficiency.
But I’m not writing the Ampere A100 obituary just yet. Nvidia says that, thanks to improvements in its AI software, MLPerf results for the A100 have improved by 6x since the GPU was first benchmarked two years ago.
Orin is available now. Hopper, which was first introduced in March, is due later this year.