VMware and NVIDIA today announced the expansion of their strategic partnership to help thousands of companies using VMware cloud infrastructure prepare for the AI era.
VMware Private AI Foundation with NVIDIA will enable enterprises to customize models and run a variety of generative AI applications such as intelligent chatbots, assistants, search and summarization, and more. The platform will be a fully integrated solution using generative AI software and accelerated computing from NVIDIA, built on VMware Cloud Foundation and optimized for AI.
Raghu Raghuram, CEO of VMware, said: “Generative AI and multi-cloud are a perfect match. Customers’ data is everywhere: in their data centers, at the edge, in the cloud, and more. Together with NVIDIA, we will help enterprises run generative AI workloads close to their data with confidence while addressing their concerns around enterprise data privacy, security, and control.”
NVIDIA founder and CEO Jensen Huang said: “Enterprises around the world are racing to integrate generative AI into their businesses. By expanding our collaboration with VMware, we will be able to provide thousands of customers across financial services, healthcare, manufacturing and other fields with the full-stack software and computing they need to realize the full potential of generative AI with applications customized for their own data.”
Full stack computing greatly improves the performance of generative AI
To realize business benefits faster, enterprises want to simplify and increase the efficiency of developing, testing and deploying generative AI applications. According to McKinsey, generative AI could add as much as $4.4 trillion to the global economy annually(1).
VMware Private AI Foundation with NVIDIA will help enterprises take full advantage of this opportunity by customizing large language models, creating more secure private models for internal use, providing generative AI as a service to users, and more securely running inference workloads at scale.
The platform plans to provide a variety of integrated AI tools that will help enterprises cost-effectively run mature models trained on their private data. The platform, built on VMware Cloud Foundation and NVIDIA AI Enterprise software, is expected to deliver the following benefits:
• Privacy: Customers will be able to easily run AI services wherever their data resides with an architecture that protects data privacy and secures access.
• Choice: From NVIDIA NeMo™ to Llama 2 and beyond, enterprises will have a wide range of choices in where to build and run their models, including leading OEM hardware configurations and future public cloud and service provider solutions.
• Performance: Recent industry benchmarks show that certain use cases running on NVIDIA-accelerated infrastructure match or exceed bare metal performance.
• Datacenter Scale: Optimized GPU scaling in virtualized environments enables AI workloads to scale across up to 16 vGPUs/GPUs in a single VM and across multiple nodes, accelerating the fine-tuning and deployment of generative AI models.
• Lower Cost: Utilization of all computing resources across GPUs, DPUs, and CPUs will be maximized to reduce overall costs and create a pooled resource environment that can be shared efficiently across teams.
• Accelerated Storage: VMware vSAN Express Storage Architecture delivers performance-optimized NVMe storage and supports GPUDirect® Storage over RDMA, enabling direct I/O transfers from storage to the GPU without CPU involvement.
• Accelerated Networking: Deep integration between vSphere and NVIDIA NVSwitch™ technology will further ensure that multi-GPU models can be executed without inter-GPU bottlenecks.
• Rapid deployment and time to value: vSphere Deep Learning VM images and libraries will provide stable, turnkey solution images that come pre-installed with various frameworks and performance-optimized libraries for rapid prototyping.
The platform will be powered by NVIDIA NeMo, an end-to-end cloud-native framework included in NVIDIA AI Enterprise, the operating system of the NVIDIA AI platform, that enables enterprises to build, customize and deploy generative AI models virtually anywhere. NeMo combines customization frameworks, guardrail toolkits, data curation tools, and pre-trained models to enable enterprises to adopt generative AI in a simple, affordable, and fast way.
To deploy generative AI into production, NeMo uses TensorRT for Large Language Models (TRT-LLM) to accelerate and optimize the inference performance of the latest LLMs on NVIDIA GPUs. Through NeMo, VMware Private AI Foundation with NVIDIA will enable enterprises to import their own data and build and run custom generative AI models on VMware hybrid cloud infrastructure.
At the VMware Explore 2023 conference, NVIDIA and VMware will highlight how developers within the enterprise can use the new NVIDIA AI Workbench to pull community models (such as Llama 2, available on Hugging Face), customize them remotely, and deploy production-grade generative AI in VMware environments.
Extensive ecosystem support for VMware Private AI Foundation with NVIDIA
VMware Private AI Foundation with NVIDIA will be supported by Dell, HPE and Lenovo. The three companies will be the first to offer systems powered by NVIDIA L40S GPUs, NVIDIA BlueField®-3 DPUs, and NVIDIA ConnectX®-7 SmartNICs that will accelerate enterprise LLM customization and inference workloads.
Compared with the NVIDIA A100 Tensor Core GPU, the NVIDIA L40S GPU delivers 1.2 times the generative AI inference performance and 1.7 times the training performance.
NVIDIA BlueField-3 DPUs accelerate, offload and isolate from the GPU or CPU the massive compute load of virtualization, networking, storage, security, and other cloud-native AI services.
NVIDIA ConnectX-7 SmartNICs provide intelligent, accelerated networking for data center infrastructure to host some of the world’s most demanding AI workloads.
VMware Private AI Foundation with NVIDIA builds on a decade-long collaboration between the two companies. Their joint engineering work has optimized VMware's cloud infrastructure so that it can run NVIDIA AI Enterprise with performance comparable to that of bare metal. The resource and infrastructure management and flexibility provided by VMware Cloud Foundation will further benefit mutual customers.
Availability
VMware plans to release VMware Private AI Foundation with NVIDIA in early 2024.