
Nvidia sends Blackwell samples this week, releases NIM update to support 3D and robotics model creation

2024-07-30


Author: Li Dan

Source: Hard AI

On Monday, July 29, Eastern Time, NVIDIA unveiled new tools at SIGGRAPH 2024, the annual conference and exhibition on computer graphics and interactive technologies, held this year in Denver.

Nvidia CEO Jensen Huang revealed at SIGGRAPH 2024 that this week Nvidia is shipping samples of Blackwell, the new chip architecture it first unveiled this year. At the same time, Nvidia announced a series of software updates, mainly involving NVIDIA Inference Microservices (NIM), its cloud-native microservices for optimized AI inference, aimed at promoting large-scale deployment of AI models in enterprises.

When NVIDIA launched NIM in March this year, it described the offering as optimized inference microservices designed to shorten time to market and simplify the deployment of generative AI models anywhere, whether in the cloud, in data centers, or on GPU-accelerated workstations. NIM supports AI use cases across multiple fields, including large language models (LLMs), visual language models (VLMs), and models for speech, images, video, 3D, drug development, medical imaging, and more.

Developers can test new generative AI models using NVIDIA-hosted cloud APIs, or download NIM to host the models themselves and quickly deploy them on major cloud providers or on premises using Kubernetes, reducing development time, complexity, and cost. NIM microservices simplify AI model deployment by packaging algorithms, system and runtime optimizations, and industry-standard APIs, allowing developers to integrate NIM into their existing applications and infrastructure without extensive customization or specialized expertise.
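For language models, those industry-standard APIs are designed to be OpenAI-compatible, so a self-hosted microservice can typically be queried with the standard `openai` Python client. The sketch below is a minimal illustration only: it assumes a NIM container is already serving on localhost port 8000 and that `meta/llama3-8b-instruct` is the model it hosts, neither of which is stated in the announcement.

```python
# Minimal sketch: querying a locally deployed NIM microservice through an
# OpenAI-compatible endpoint. Assumes a NIM container is already serving on
# localhost:8000 and that "meta/llama3-8b-instruct" is the hosted model.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local NIM endpoint
    api_key="not-needed-for-local",       # local deployments may not require a real key
)

response = client.chat.completions.create(
    model="meta/llama3-8b-instruct",      # assumed model identifier
    messages=[{"role": "user", "content": "Summarize NVIDIA's SIGGRAPH 2024 NIM updates."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```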

The update announced by Nvidia on Monday expands the library of NIM inference microservices to cover physical-world environments, advanced visual modeling, and a range of vertical applications. Nvidia had offered about 100 NIM inference microservices in preview and is now releasing them in full. For example, as part of the new NIM lineup, a 4K image generation API from visual media company Getty Images Holdings and a 3D image generator from Shutterstock Inc., a provider of digital content such as images, film, and music, are coming online soon. Both are built on NVIDIA Edify, Nvidia's multimodal architecture for visual generative AI.

On the same day, NVIDIA announced a partnership with Hugging Face, the natural language processing (NLP) toolset and platform, to launch Inference-as-a-Service, which helps developers quickly prototype open-source AI models hosted on the Hugging Face Hub and deploy them to production. Commentators said the collaboration will simplify AI model deployment for developers.
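The announcement itself contains no code, but a hedged sketch of what the developer workflow might look like with the existing `huggingface_hub` client is shown below; the model ID, the access token, and the assumption that this particular model is served through the new service are illustrative only.

```python
# Minimal sketch: calling a Hub-hosted open model through Hugging Face's
# inference service. The model ID and token are placeholders/assumptions.
from huggingface_hub import InferenceClient

client = InferenceClient(token="hf_xxx")  # hypothetical user access token

completion = client.chat_completion(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed Hub model ID
    messages=[{"role": "user", "content": "What is NVIDIA NIM?"}],
    max_tokens=200,
)
print(completion.choices[0].message.content)
```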


fVDB uses real-world 3D data to create spatial intelligence

Among the announcements, NVIDIA launched fVDB, which uses 3D data from the real world to create spatial intelligence. NVIDIA said that generative physical AI models can understand and perform fine or coarse motor skills in the physical world, which requires spatial intelligence: the ability to understand three-dimensional space and navigate through it. To give this kind of AI a powerful, coherent framework that can handle real-world scale, NVIDIA created fVDB, a deep learning framework designed for sparse, large-scale, high-performance spatial intelligence.

fVDB is built on OpenVDB, an industry-standard library of data structures and tools for simulating and rendering sparse volumetric data such as water, fire, smoke, and clouds. fVDB provides four times the spatial scale and 3.5 times the performance of previous frameworks, along with access to large real-world datasets, and it simplifies workflows by combining functionality that previously required multiple deep learning libraries.
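The article does not show fVDB's own API, so the sketch below is a framework-agnostic PyTorch illustration of the kind of operation such a framework accelerates: quantizing a raw point cloud into the sparse set of occupied voxels that a downstream network consumes. It deliberately does not use the actual fVDB API.

```python
# Conceptual sketch (plain PyTorch, not the fVDB API): voxelize a point cloud
# into a sparse occupancy representation, the sort of structure a sparse
# spatial-intelligence framework operates on at far larger scale.
import torch

def voxelize(points: torch.Tensor, voxel_size: float) -> torch.Tensor:
    """Map (N, 3) world-space points to the unique integer voxel coordinates they occupy."""
    voxel_coords = torch.floor(points / voxel_size).to(torch.int64)
    return torch.unique(voxel_coords, dim=0)

points = torch.rand(100_000, 3) * 10.0        # synthetic point cloud in a 10 m cube
occupied = voxelize(points, voxel_size=0.05)  # sparse 5 cm voxel grid
print(f"{occupied.shape[0]} occupied voxels out of {200**3} possible")
```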


Isaac Lab, an open-source modular framework, provides simulations to accelerate robot learning

NVIDIA also launched Isaac Lab, an open-source modular framework for robot learning that addresses the limitations traditional training methods place on teaching robots new skills.

Isaac Lab provides modular, high-fidelity simulation for different training environments, offering physical AI capabilities and GPU-accelerated simulation of the physical world.

Isaac Lab supports both imitation learning (imitating humans) and reinforcement learning (learning through trial and error), providing a flexible training approach for any robot embodiment. It offers a user-friendly environment for building training scenarios, helping robot manufacturers add or update robot skills as business needs change.
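Isaac Lab environments are typically driven through a Gymnasium-style reset/step loop; the sketch below shows that generic pattern only. The task ID is hypothetical, a real run also requires launching the Isaac simulation app, and the random-action policy stands in for an actual reinforcement learning algorithm.

```python
# Generic Gymnasium-style loop of the kind Isaac Lab workflows build on.
# "Isaac-Cartpole-v0" is a hypothetical task ID used for illustration; a real
# run also needs the simulation app launched and a learned policy in place of
# the random actions sampled here.
import gymnasium as gym

env = gym.make("Isaac-Cartpole-v0")  # assumed task registration
obs, info = env.reset()

for step in range(1_000):
    action = env.action_space.sample()  # placeholder for a trained policy
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

env.close()
```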


Building a VLM-powered visual AI agent with NVIDIA NIM and VIA microservices

NVIDIA has tailored NIM for physical-world AI, with support for speech and translation, vision, and realistic animation and behavior. NVIDIA has also launched VIA microservices, which are now available for download in developer preview.

VIA microservices can be easily integrated with NIM, giving users the flexibility to use any LLM or VLM from NVIDIA's model preview API and downloadable NIM microservices catalog. VIA microservices are an extension of NVIDIA's Metropolis microservices: cloud-native building blocks that accelerate the development of visual AI agents driven by VLMs and NIM, whether deployed at the edge or in the cloud.

With generative AI, NIM microservices, and foundation models, users can now build applications with broad perception and rich contextual understanding using fewer models. VLMs power visual AI agents that can understand natural-language prompts and perform visual question answering. Visual AI agents use computer vision to perceive and interact with the physical world and to carry out reasoning tasks.
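As a rough illustration of visual question answering with a hosted VLM, the sketch below sends an image and a natural-language prompt to an OpenAI-compatible endpoint. The endpoint URL, API key, model name, and image-message format are all assumptions for illustration; the actual microservice's documentation defines the real interface.

```python
# Hedged sketch of visual question answering against a hosted VLM exposing an
# OpenAI-compatible API. Endpoint, key, model name, and message format are
# assumptions, not confirmed details of the VIA/NIM announcement.
import base64
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed preview endpoint
    api_key="nvapi-xxx",                              # hypothetical API key
)

with open("loading_dock.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="example/vision-language-model",  # placeholder model identifier
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "How many pallets are blocking the loading dock?"},
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
    max_tokens=128,
)
print(response.choices[0].message.content)
```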

These agents can unlock applications across a wide range of industries. They can significantly simplify app development workflows and provide transformative new perception capabilities, such as image or video summarization, interactive visual question answering, and visual alerts. These visual AI agents will be deployed in factories, warehouses, retail stores, airports, traffic intersections, and elsewhere, helping operations teams make better decisions by drawing on the richer insights generated from natural interactions.


Omniverse Replicator helps address data shortages that limit model training

NVIDIA demonstrated how to build a custom synthetic data generation (SDG) pipeline for OpenUSD using NIM microservices together with NVIDIA Omniverse Replicator, an SDK built on Universal Scene Description (OpenUSD) and NVIDIA RTX.

Developers can use NIM microservices and Omniverse Replicator, among other tools, to build SDG pipelines that support generative AI and address the shortage of real-world data that often limits model training.
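For context, a Replicator SDG script generally follows the pattern sketched below: create scene assets, randomize them on each generated frame, and attach a writer that emits annotated images for training. The script runs inside an Omniverse Kit or Isaac Sim Python environment, not standalone, and the scene contents, resolution, frame count, and output path here are illustrative assumptions rather than details from the announcement.

```python
# Minimal Replicator-style SDG sketch (runs inside an Omniverse Kit/Isaac Sim
# Python environment). Scene contents, counts, and paths are assumptions.
import omni.replicator.core as rep

with rep.new_layer():
    camera = rep.create.camera(position=(0, 0, 5))
    render_product = rep.create.render_product(camera, (1024, 1024))

    # A batch of labeled props to randomize; "cube" is a placeholder class
    cubes = rep.create.cube(count=10, semantics=[("class", "cube")])

    # Randomize prop poses on every generated frame
    with rep.trigger.on_frame(num_frames=50):
        with cubes:
            rep.modify.pose(
                position=rep.distribution.uniform((-2, -2, 0), (2, 2, 2)),
                rotation=rep.distribution.uniform((0, 0, 0), (0, 0, 360)),
            )

    # Write RGB images plus 2D bounding boxes for downstream model training
    writer = rep.WriterRegistry.get("BasicWriter")
    writer.initialize(output_dir="_sdg_output", rgb=True, bounding_box_2d_tight=True)
    writer.attach([render_product])

rep.orchestrator.run()  # kick off generation of the synthetic frames
```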

Rev Lebaredian, VP of Omniverse and Simulation Technologies at NVIDIA, said:

“We built the world’s first generative AI model that can understand language, geometry, materials, physics, and space based on OpenUSD.”

Lebaredian said that Nvidia has been investing in OpenUSD since 2016 to enable industrial companies and physical AI developers to build high-performance models more easily and faster.

Nvidia is also working with Apple, a co-founder of the Alliance for OpenUSD, to build a hybrid rendering pipeline that streams from the Graphics Delivery Network, Nvidia's network of graphics-optimized data centers, to the Apple Vision Pro.