2024-09-19
the duck knows first when the river water warms in spring. nvidia's current actions are revealing some new trends.
text|zhou luping and zhao yanqiu
edited by niu hui
not long ago, nvidia released generative ai services for 3d modeling, which attracted widespread attention in the industry. where earlier generative ai mostly produced text, images, video and other two-dimensional content, nvidia is now using it to help companies build 3d assets, speed up the development of the digital twin and simulation industries, and bring ai into the physical world faster.
01
"cuda native" targets industry
as a global leader in accelerated computing, nvidia's actions are revealing some new trends.
in two fireside conversations at siggraph 2024, nvidia founder and ceo jensen huang shared his latest thinking on how generative ai and accelerated computing can transform industries such as manufacturing through visualization. nvidia also launched a new set of nim microservices during the conference.
siggraph is a leading venue for the latest innovations in computer graphics. there, nvidia released generative ai models and nim microservices for openusd, geometry, physics and materials. openusd is an open-source framework for describing and exchanging 3d scene data, and it has gradually become a standard in industries such as 3d vision, architecture, design and manufacturing.
with the help of these models and services, developers can accelerate the development of industry applications such as manufacturing, automotive, and robotics.
in two fireside conversations, huang discussed the importance of building digital twins and virtual worlds. he said that the industry can improve efficiency and reduce costs by building large-scale digital twins at the scale of cities. "for example, ai can be trained in this virtual world before being deployed in the next generation of humanoid robots."
why did jensen huang focus on industrial visualization, virtual worlds and digital twins? and why is nvidia launching new nim microservices in the cuda ecosystem now?
as rev lebaredian, vice president of omniverse and simulation technology at nvidia, said, the generative ai wave has already reached heavy industry. digital intelligence frontline has also learned that generative ai is moving from simple scenarios into complex production processes, and the technology ecosystem described above can accelerate that shift.
“until recently, the primary users of the digital world have been creative industries; now, with the enhanced capabilities and accessibility that nvidia nim microservices bring to openusd, industries across the spectrum can create physics-based virtual worlds and digital twins to prepare for this new wave of ai technology,” said rev lebaredian.
in the automotive industry, chinese car companies are all gearing up for digital twins. "tesla is about to release fsd 12.5 and is also actively promoting the rollout of fsd in china," an artificial intelligence expert from a large chinese car company told digital intelligence frontline. "tesla treats simulation as a strategic goal, and we are also working on the metaverse to solve the closed-loop data problem for autonomous driving." previously, it was difficult and costly for car companies to collect "ghost probe" data, meaning cases where a pedestrian or vehicle suddenly emerges from behind an obstruction. now, car companies can train on such long-tail scenarios in a metaverse simulation environment.
in the robotics industry, an electric power inspection robot company is training ai in a simulation environment so that its robots can perceive the complex environment and physical space inside a power plant in real time, plan their routes, and read the thousands of gauges on different equipment along the way.
architectural design is complex and time-consuming, and 3d models are an essential deliverable. for complex geometric shapes and irregular structures, however, reconstructing a 3d model is especially difficult. some design companies are now working with ai companies to generate models from nothing more than pictures, sketches and text, and to assign different materials to a design in order to refine it.
in the steel industry, metallographic analysis uses a microscope to examine the internal defects and structure of material slices in order to understand the overall performance of the base material. the traditional manual approach is inefficient and depends heavily on human experience. a common demand among steel companies now is to use their existing knowledge bases to train specialized ai that can analyze materials comprehensively.
nvidia's new nim microservices allow application companies to call services directly without starting from scratch, and then combine their own data to quickly implement an application. therefore, some companies describe this as "cuda native".
as generative ai moves from peripheral scenarios into core ones, jensen huang said, "everyone will have an ai assistant." at the same time, the integration of ai and graphics technology is deepening. "almost every industry will be affected by this technology, whether it is scientific computing to better predict the weather with less energy, or working with creators to generate images, or creating virtual scenes for industrial visualization," huang said. "generative ai will also completely change the field of robotics and self-driving cars."
02
what possibilities do the new nim microservices open up?
all of the above-mentioned industry applications rely on the application of 3d modeling and simulation technology.
building 3d content and scenes has long been a headache. the chain of steps involved is very complicated, including modeling, shading, animation, lighting and rendering.
for decades, animation, visual effects, and game studios have been working to improve interoperability between the various tools in their pipelines, but with limited success. migrating data from one location to another is tricky, so studios have built complex workflows to manage data interoperability.
moreover, in addition to the fragmentation of systems and tools, the traditional 3d production process is a linear collaboration, involving format conversion and modification by multiple departments and multiple personnel, which is time-consuming and labor-intensive.
openusd is an open-source, universal framework for 3d data exchange. it was originally developed by pixar, and in 2023 pixar, adobe, apple, autodesk and nvidia founded the alliance for openusd to promote and standardize it. it builds virtual worlds through interoperability between software tools and data types, and it resolves many of the workflow and complexity challenges of creating three-dimensional scenes.
openusd is also the foundation of nvidia's omniverse platform. in a conversation with a senior writer for wired magazine, jensen huang once said that openusd is the first format to bring together multimodal representations from almost every tool. ideally, over time, people will be able to bring almost any format into it, allowing everyone to collaborate and keep the content forever. and generative ai will definitely help omniverse produce better simulation results.
the nim microservices nvidia has developed for openusd are also the world's first generative ai models for openusd development. they bring generative ai into the usd workflow in the form of nim microservices, greatly lowering the barrier to using openusd. at the same time, nvidia released a number of new usd connectors for robotics data formats and for streaming to apple vision pro.
currently, three nim microservices have been released. the first is the usd code nim microservice, which can answer general-knowledge openusd questions and automatically generate openusd python code from text prompts.
the second is the usd search nim microservice, which lets developers use natural language or image input to search massive libraries of openusd, 3d and image data, greatly speeding up how enterprises find and process assets.
the third is the usd validate nim microservice, which checks the compatibility of uploaded files against openusd release versions and generates a fully rtx-rendered, path-traced image, powered by the nvidia omniverse cloud apis.
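to make "generating an openusd scene from a short description" more concrete, here is a minimal hand-written sketch. it is not output from the usd code nim microservice and does not call any nvidia api. it also sidesteps the openusd python bindings and simply writes the text-based .usda format with the python standard library, so it runs anywhere. the function name `sphere_scene` and the prim names are assumptions made for illustration.

```python
# illustrative only: hand-written, not produced by the usd code nim microservice.
# it builds a tiny openusd scene as .usda text using only the standard library,
# so it runs without the openusd (pxr) package installed.


def sphere_scene(name: str, radius: float, height: float) -> str:
    """return .usda text for one sphere translated `height` units along y."""
    return f'''#usda 1.0
(
    defaultPrim = "World"
)

def Xform "World"
{{
    def Sphere "{name}"
    {{
        double radius = {radius}
        double3 xformOp:translate = (0, {height}, 0)
        uniform token[] xformOpOrder = ["xformOp:translate"]
    }}
}}
'''


if __name__ == "__main__":
    # hypothetical prompt: "a beach ball floating one unit above the origin"
    print(sphere_scene("BeachBall", 0.5, 1.0))
```

saving the printed text to a file with a .usda extension should give a layer that usd-aware tools can open. a real workflow would more likely use the openusd python api than string templates.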
in addition to the nim microservices nvidia provides itself, ecosystem partners are building popular ai models on top of them and offering those models to users in a form optimized for inference.
shutterstock, a world-renowned creative content platform, has launched a new text-to-3d service built on the latest version of nvidia edify, nvidia's visual generative ai architecture. it can be used to create 3d prototypes or to populate virtual environments.
for example, creating lighting and reflections that look accurate in a virtual scene is a complex task. previously, creators had to operate expensive 360-degree camera equipment, travel to the shooting location in person, build backgrounds from scratch, or search for similar content in a huge library.
but now, through the 3d generation service, users only need to describe the specific environment they need with text or pictures, and they can get high dynamic range panoramic images (360 hdri) with a maximum resolution of 16k. moreover, these scenes and components can be quickly switched, such as making a sports car appear in a desert, a tropical beach or a winding mountain road.
in addition to creating lighting, creators can also quickly add a variety of rendering materials, such as concrete, wood or leather, to build their own 3d assets. moreover, the 3d assets generated with the help of ai can also be edited at any time and provided in a variety of popular file formats.
nvidia's edify ai model is also helping getty images give artists fine control over the composition and style of images, such as adding a red beach ball to a photo of a pristine coral reef. in addition, creators can use corporate data to fine-tune the foundation model so that it generates images matching the creative style of a specific brand.
these model microservices and tools are greatly accelerating brands’ creation of 3d assets and will make the development of digital twins more popular and convenient.
03
first movers have already started experimenting
as 3d content and asset creation becomes more convenient and accurate, sectors such as industrial manufacturing, autonomous driving, engineering and robotics are enjoying the benefits of generative ai. in manufacturing and creative advertising in particular, a group of first-mover companies are actively accelerating the application of digital twins and simulation through the nvidia omniverse platform.
coca-cola is the first brand to use the generative ai provided by omniverse and nim microservices for marketing scenarios. in its demonstration video, the user only needs to type a natural-language prompt into the system: "build me a table with tacos and salsa, bathed in the morning light".
the usd search nim microservice then finds the corresponding 3d assets in a huge 3d asset library and makes them available through apis, while the usd code nim microservice combines these models into a scene. developers can get python code for creating novel 3d worlds simply by entering prompts, which greatly expands their creative capacity. with generative ai, coca-cola can customize imagery for more than 100 markets around the world to achieve localized marketing.
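the "search for assets, then combine them into a scene" step can be pictured with openusd references, where one layer points at other asset files instead of copying them. the sketch below is hand-written and assumes nothing about the actual nim apis: the asset paths are hypothetical placeholders for whatever a search step might return, and `compose_scene` is a name invented here. it emits .usda text using only the python standard library.

```python
# illustrative only: the asset paths are hypothetical stand-ins for search
# results, and nothing here calls an nvidia service.
from typing import Mapping


def compose_scene(assets: Mapping[str, str]) -> str:
    """return .usda text with one child prim per asset, each pulling in its
    asset file through an openusd reference."""
    prims = []
    for prim_name, asset_path in assets.items():
        prims.append(
            f'    def Xform "{prim_name}" (\n'
            f"        prepend references = @{asset_path}@\n"
            f"    )\n"
            f"    {{\n"
            f"    }}\n"
        )
    body = "\n".join(prims)
    return (
        "#usda 1.0\n"
        "(\n"
        '    defaultPrim = "Scene"\n'
        ")\n"
        "\n"
        'def Xform "Scene"\n'
        "{\n"
        f"{body}"
        "}\n"
    )


if __name__ == "__main__":
    # hypothetical results for "a table with tacos and salsa"
    print(compose_scene({
        "Table": "./assets/table.usd",
        "Tacos": "./assets/tacos.usd",
        "Salsa": "./assets/salsa.usd",
    }))
```

because each reference names only a file and no prim path, every referenced asset would need to declare its own defaultPrim for the reference to resolve. lighting such as "morning light" would be added as further prims in the same layer.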
as the advertising service provider behind coca-cola, wpp has specially launched an intelligent marketing operating system. the system uses the omniverse development platform and openusd to create multilingual texts, images and videos in a very streamlined and automated manner, simplifying the content creation process for advertisers and marketers. by serving customers through generative ai, wpp has brought crazy ideas to reality.
as wpp's cto put it, “the beauty of these innovations is that it’s highly compatible with the way we work and takes full advantage of open standards. this not only accelerates future work, but also allows us to continue to build on and expand all of our previous investments in standards like openusd. by using nvidia nim microservices with nvidia omniverse, we’re able to launch innovative new production tools with companies like coca-cola at an unprecedented pace.”
as the world's largest consumer electronics oem, foxconn has built a virtual digital twin factory specifically for a new factory in mexico. engineers can define processes and train robots in a virtual environment, thereby improving the factory's automation level and production efficiency, saving time, cost and energy.
foxconn also uses the omniverse platform to build its digital twin, integrating all 3d cad elements into a single virtual factory. there it trains robots using nvidia isaac sim, a scalable robotics simulation platform built on omniverse and openusd, which gives the digital twin a physically accurate and visually realistic representation.
in addition to foxconn, electronic manufacturers including delta electronics, mediatek, and pegatron are using nvidia ai and omniverse to build factory digital twins.
xpeng motors used the omniverse platform during the design process for its mpv model, the xpeng x9. by moving the model development workflow into the virtual world, xpeng was able to avoid the bottlenecks of traditional workflows when designing new cars.
on the one hand, the omniverse platform has strong interoperability, which means that files and data used for industrial modeling, rendering and 3d special effects no longer require complicated conversions, speeding up communication and collaboration among xpeng's design teams. on the other hand, xpeng uses omniverse's real-time rendering and ray tracing to visualize changes to car color and interior instantly, which makes virtual results more realistic, helps meet user needs, and thus improves the product experience.
in the past two years, the popularity of generative ai has drawn most outside attention to its consumer-facing (to-c) and office collaboration applications. now, the physical world is also set for a new wave of growth and opportunity.