Nvidia kicked off its second GTC conference of the year with news that its H100 “Hopper” generation of GPUs is in full production, with global partners planning to roll out products and services in October and wide availability in the first quarter of 2023.
Hopper features a number of innovations over Ampere, its predecessor architecture introduced in 2020. Most significant is the new Transformer Engine. Transformers are widely used deep learning models and the standard choice for natural language processing. Nvidia claims the H100 Transformer Engine can speed up neural networks by as much as six-fold over Ampere without losing accuracy.
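How can a chip run networks faster without losing accuracy? Nvidia has publicly described the Transformer Engine as mixing lower-precision FP8 math with per-tensor scaling so that small values still land inside the narrow representable range. The sketch below is purely illustrative (it is not Nvidia's implementation, and it fakes the low-precision format as "round to a fixed number of levels"), but it shows why rescaling a tensor before quantizing it preserves accuracy:

```python
# Illustrative sketch (NOT Nvidia's implementation): why per-tensor
# scaling matters when casting values into a narrow numeric format.
# The low-precision format is simulated as "clamp to [-max, max] and
# round to a fixed number of levels".

def quantize(values, max_representable=448.0, levels=256):
    """Clamp to the format's range and round to a fixed number of levels."""
    step = 2 * max_representable / (levels - 1)
    return [round(max(-max_representable, min(max_representable, v)) / step) * step
            for v in values]

def quantize_with_scaling(values, max_representable=448.0, levels=256):
    """Rescale so the tensor's own maximum fills the representable range,
    quantize, then rescale back -- the idea behind per-tensor scale factors."""
    scale = max_representable / max(abs(v) for v in values)
    q = quantize([v * scale for v in values], max_representable, levels)
    return [v / scale for v in q]

def max_error(a, b):
    """Largest absolute difference between two equal-length lists."""
    return max(abs(x - y) for x, y in zip(a, b))

# Small-magnitude activations are wiped out without scaling: every value
# rounds to the nearest level, which here is zero.
acts = [0.013, -0.072, 0.004, 0.055, -0.031]
print(max_error(acts, quantize(acts)))               # large: values collapse to 0
print(max_error(acts, quantize_with_scaling(acts)))  # orders of magnitude smaller
```

The hardware tracks these scale factors dynamically per layer; the toy above only demonstrates the numerical intuition behind the "no accuracy loss" claim.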
Hopper also comes with the second generation of Nvidia’s secure Multi-Instance GPU (MIG) technology, which allows a single GPU to be divided into several secure instances that operate independently and in isolation.
Also new is a feature called confidential computing, which protects AI models and customer data while they are being processed, in addition to protecting them at rest and in transit over the network. Lastly, Hopper has the fourth generation of NVLink, Nvidia’s high-speed interconnect technology, which can connect up to 256 H100 GPUs with nine times the bandwidth of the previous generation.
And while GPUs are not known for power efficiency, Nvidia says the H100 lets companies deliver the same AI performance with 3.5x greater energy efficiency and 3x lower total cost of ownership than the prior generation, because enterprises need 5x fewer server nodes.
“Our customers are looking to deploy data centers that are basically AI factories, producing AIs for production use cases. And we’re very excited to see what H100 is going to be doing for those customers, delivering more throughput, more capabilities and [continuing] to democratize AI everywhere,” said Ian Buck, vice president of hyperscale and HPC at Nvidia, on a media call with journalists.
Buck, who created CUDA, the platform used to program Nvidia GPUs for HPC and other workloads, said large language models (LLMs) will be one of the most important AI use cases for the H100.
Language models are tools trained to predict the next word in a sentence, such as autocomplete on a phone or browser. LLMs, as the name implies, can predict entire sentences and do more, such as write essays, create charts, and generate computer code.
“We see large language models being used for things outside of human language like coding, and helping software developers write software faster, more efficiently with fewer errors,” said Buck.
H100-powered systems from hardware makers are expected to ship in the coming weeks, with more than 50 server models in the market by the end of the year and dozens more in the first half of 2023. Partners include Atos, Cisco, Dell, Fujitsu, Gigabyte, HPE, Lenovo and Supermicro.
Additionally, Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure say they will be among the first to deploy H100-based instances in the cloud starting next year.
Those who want to give the H100 a test drive can do so through Nvidia’s LaunchPad, a try-before-you-buy service where users log in and evaluate Nvidia hardware and software.