NVIDIA has introduced the GH200 Grace Hopper Superchip platform, the world's first to feature an HBM3e processor, built for accelerated computing and generative AI.
The platform is designed for the most complex generative AI workloads, including large language models, recommender systems, and vector databases.
The HBM3e processor gives the platform exceptional memory capacity and bandwidth, and it supports connecting multiple GPUs for aggregated performance in an easily scalable server design. HBM3e memory is 50% faster than HBM3 and delivers 10TB/s of combined bandwidth, allowing the platform to run models 3.5 times larger than the previous version while three times faster memory bandwidth improves performance.
The platform comprises a single server with 144 Arm Neoverse cores, eight petaflops of AI performance, and 282GB of the latest HBM3e memory in a dual configuration, a significant increase in memory capacity and bandwidth over current-generation offerings.
Jensen Huang, NVIDIA founder and CEO, says the GH200 Grace Hopper Superchip platform answers the surging demand for generative AI, which calls for accelerated computing platforms with specialized needs. The platform combines exceptional memory technology and bandwidth with the ability to connect GPUs to aggregate performance, in a server design that can be easily deployed throughout data centers.
To handle the massive models used in generative AI, the platform's Grace Hopper Superchip can be connected to additional Superchips via NVIDIA NVLink, allowing the GPUs to work together with full access to CPU memory and providing a combined 1.2TB of fast memory in a dual configuration.
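The dual-configuration figures quoted above can be sanity-checked with simple arithmetic. The per-superchip capacities used here are assumptions not stated in this article (roughly 480GB of LPDDR5X attached to each Grace CPU and roughly 141GB of HBM3e attached to each Hopper GPU), so this is a back-of-the-envelope sketch rather than an official breakdown:

```python
# Back-of-the-envelope check of the dual-configuration memory figures.
# Assumed per-superchip capacities (not stated in the article):
LPDDR5X_PER_CPU_GB = 480  # assumed LPDDR5X per Grace CPU
HBM3E_PER_GPU_GB = 141    # assumed HBM3e per Hopper GPU
SUPERCHIPS = 2            # dual configuration linked over NVLink

# Total GPU memory across the pair
hbm3e_total_gb = HBM3E_PER_GPU_GB * SUPERCHIPS

# "Fast memory" = CPU memory + GPU memory across both superchips,
# since NVLink gives the GPUs full access to CPU memory
fast_memory_total_gb = (LPDDR5X_PER_CPU_GB + HBM3E_PER_GPU_GB) * SUPERCHIPS

print(hbm3e_total_gb)        # 282, matching the article's HBM3e figure
print(fast_memory_total_gb)  # 1242, i.e. roughly 1.2TB
```

Under these assumed capacities, the totals line up with the 282GB and 1.2TB figures cited for the dual configuration.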
NVIDIA plans to offer the next-generation GH200 Grace Hopper Superchip in a wide range of configurations, with system manufacturers expected to deliver systems based on the platform in the second quarter of 2024.