Microsoft Corp. is expanding its Azure cloud platform with a new family of instances designed to run artificial intelligence models.
The instance family known as the ND H100 v5 series made its debut today.
“Delivering on the promise of advanced AI to our customers requires supercomputing infrastructure, services and expertise to accommodate the exponentially increasing size and complexity of the latest models,” Matt Vegas, a senior project manager in Azure’s supercomputing and AI group, wrote in a blog post. “At Microsoft, we’re addressing this challenge by bringing a decade of supercomputing experience and supporting the largest AI training workloads.”
Each ND H100 v5 instance features eight H100 graphics processing units from Nvidia Corp. Unveiled last March, the H100 is Nvidia’s most advanced data center GPU. It can train AI models nine times faster than the company’s previous flagship chip and perform inference up to 30 times faster.
The H100 features 80 billion transistors manufactured using a four-nanometer process. It includes a specialized module known as the Transformer Engine, which is designed to speed up AI models based on the Transformer neural network architecture. The architecture underpins many advanced AI models, including OpenAI LLC’s ChatGPT chatbot.
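For context, the core operation of the Transformer architecture is scaled dot-product attention, which is also what the Transformer Engine is built to accelerate. The minimal PyTorch sketch below illustrates the computation; the batch size, sequence length and model dimension are arbitrary illustrative values, not H100-specific figures.

```python
# Minimal single-head scaled dot-product attention, the building block of
# the Transformer architecture. All sizes here are illustrative.
import torch
import torch.nn.functional as F

batch, seq_len, d_model = 2, 16, 64

# Query, key and value representations of a token sequence.
q = torch.randn(batch, seq_len, d_model)
k = torch.randn(batch, seq_len, d_model)
v = torch.randn(batch, seq_len, d_model)

# Attention scores: similarity of every token to every other token,
# scaled by sqrt(d_model) to keep the softmax well-conditioned.
scores = q @ k.transpose(-2, -1) / d_model**0.5
weights = F.softmax(scores, dim=-1)

# Each output token is a weighted mix of all value vectors.
output = weights @ v
print(output.shape)  # torch.Size([2, 16, 64])
```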
Nvidia has equipped the H100 with other improvements as well. Among other things, the chip offers a built-in confidential computing feature. The feature can isolate an AI model in a way that blocks unauthorized access attempts, including from the operating system and hypervisor on which it runs.
Advanced AI models are typically deployed on multiple graphics cards rather than on one. GPUs used in this way have to regularly exchange data with each other in order to coordinate their work. To speed up the flow of data between their GPUs, companies often connect them using high-speed network connections.
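The most common form of that exchange during training is a gradient all-reduce, in which every GPU contributes its local gradients and receives back the average. The sketch below is a minimal example rather than anything Azure-specific: it uses PyTorch’s NCCL backend and assumes a launcher such as torchrun sets the usual RANK, LOCAL_RANK and WORLD_SIZE environment variables; the tensor contents are illustrative.

```python
# Hedged sketch of inter-GPU data exchange: averaging a tensor across all
# GPUs with a NCCL all-reduce, as done for gradients in data-parallel
# training. Launch with e.g. `torchrun --nproc_per_node=8 this_script.py`.
import os
import torch
import torch.distributed as dist

def main():
    # torchrun populates RANK, LOCAL_RANK and WORLD_SIZE for each worker.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each GPU holds a different tensor, standing in for local gradients.
    grad = torch.full((4,), float(dist.get_rank()), device="cuda")

    # Sum across every GPU (over the high-speed interconnect within the
    # node), then divide by the worker count to get the average.
    dist.all_reduce(grad, op=dist.ReduceOp.SUM)
    grad /= dist.get_world_size()

    print(f"rank {dist.get_rank()}: averaged tensor {grad.tolist()}")
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```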
The eight H100 chips in Microsoft’s new ND H100 v5 instances are linked together using an Nvidia technology called NVLink. According to Nvidia, the technology is seven times faster than PCIe 5.0, a widely used interconnect standard. According to Microsoft, NVLink provides 3.6 terabytes per second of bandwidth between the eight GPUs in its new instances.
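Frameworks can report whether the GPUs in a node reach one another directly over such links rather than through host memory. The short sketch below is generic rather than Azure- or NVLink-specific, since PyTorch’s API reports peer access in general; on Linux hosts, `nvidia-smi topo -m` prints the same topology from the command line.

```python
# Enumerate which GPU pairs in this machine support direct peer-to-peer
# access (over NVLink or PCIe). Prints nothing on a machine without GPUs.
import torch

n = torch.cuda.device_count()
for i in range(n):
    for j in range(n):
        if i != j and torch.cuda.can_device_access_peer(i, j):
            print(f"GPU {i} -> GPU {j}: peer access available")
```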
The instance series also supports another Nvidia networking technology called NVSwitch. While NVLink is designed to connect the GPUs within a single server, NVSwitch links multiple GPU servers to one another. This makes it easier to run complex AI models that have to be distributed across multiple machines in a data center.
Microsoft’s ND H100 v5 instances combine the H100 graphics cards with central processing units from Intel Corp. The CPUs come from Intel’s new 4th Gen Xeon Scalable processor series. Also known as Sapphire Rapids, the chip series made its debut in January.
Sapphire Rapids is based on an advanced version of Intel’s 10-nanometer process. Each CPU in the series includes multiple onboard accelerators, compute modules optimized for specific tasks. Thanks to the built-in accelerators, Intel says Sapphire Rapids offers up to 10x better performance than previous-generation silicon for some AI applications.
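The accelerator most relevant to AI workloads is Advanced Matrix Extensions, or AMX, a set of matrix-multiplication instructions. As a rough, Linux-specific illustration, the sketch below checks whether the host CPU advertises AMX support; the flag names are those reported in /proc/cpuinfo on recent kernels.

```python
# Check /proc/cpuinfo for Sapphire Rapids' AMX instruction-set flags.
# Linux-specific; on other platforms this file does not exist.
def cpu_flags():
    with open("/proc/cpuinfo") as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()

flags = cpu_flags()
for feature in ("amx_tile", "amx_bf16", "amx_int8"):
    status = "present" if feature in flags else "absent"
    print(f"{feature}: {status}")
```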
The ND H100 v5 line of instances is currently in preview.