
Using Apple Vision Pro to remotely control robots, Nvidia shows "human-machine integration" is not difficult

2024-07-31


Machine Heart Report

Editors: Du Wei, Chen Chen

Jensen Huang said: "The next wave of AI is robotics, and one of the most exciting developments is humanoid robots." Now, Project GR00T has taken another important step.

Yesterday, Nvidia founder Jensen Huang discussed Project GR00T, the company's general-purpose foundation model for humanoid robots, in his SIGGRAPH 2024 keynote. The model has received a series of functional updates.

Yuke Zhu, an assistant professor at the University of Texas at Austin and senior research scientist at Nvidia, tweeted a video demonstrating how RoboCasa, a large-scale simulation training framework for general-purpose household robots, and MimicGen are now integrated into NVIDIA's Omniverse platform and Isaac robot development platform.



Image source: https://x.com/yukez/status/1818092679936299373

The video covers NVIDIA's three computing platforms: NVIDIA AI, Omniverse, and Jetson Thor, which are used to simplify and accelerate developer workflows. Working together, these platforms are expected to usher in an era of humanoid robots driven by physical AI.



The biggest highlight is that developers can use Apple Vision Pro to remotely control humanoid robots to perform tasks.







Meanwhile, Jim Fan, another NVIDIA senior research scientist, said the Project GR00T update is exciting: NVIDIA is taking a systematic approach to scaling robot data, tackling one of the hardest problems in robotics.

The idea is simple: humans collect demonstration data on real robots, and NVIDIA expands this data a thousandfold or more in simulation. With GPU-accelerated simulation, computing power can now be traded for human data collection that would otherwise be time-consuming, labor-intensive, and costly.

He noted that not long ago he believed teleoperation was fundamentally unscalable, because in the world of atoms we are always limited to 24 hours per robot per day. GR00T's new synthetic data pipeline breaks this limitation in the world of bits.
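To make the compute-for-data trade concrete, here is a minimal Python sketch of the idea: a handful of human demonstrations are expanded a thousandfold by perturbing scene parameters and replaying them in simulation. The data format and function names here are illustrative assumptions, not MimicGen's actual interface.

```python
# A minimal sketch of the "compute for data" idea described above, assuming a
# hypothetical demo format; nothing here reflects NVIDIA's real MimicGen API.
import random

def augment_demo(demo: list[dict], pose_jitter: float) -> list[dict]:
    """Create one synthetic variant of a human demo by perturbing object poses.
    A real pipeline would re-solve the motion in simulation rather than
    copy the recorded actions verbatim."""
    synthetic = []
    for step in demo:
        new_step = dict(step)
        new_step["object_pose"] = [
            p + random.uniform(-pose_jitter, pose_jitter)
            for p in step["object_pose"]
        ]
        synthetic.append(new_step)
    return synthetic

# Ten teleoperated demos, expanded 1,000x in simulation.
human_demos = [[{"object_pose": [0.4, 0.1, 0.02], "action": [0.0] * 7}]] * 10
dataset = [
    augment_demo(demo, pose_jitter=0.05)
    for demo in human_demos
    for _ in range(1000)
]
print(f"{len(human_demos)} human demos -> {len(dataset)} synthetic trajectories")
```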



Image source: https://x.com/DrJimFan/status/1818302152982343983

Commenting on Nvidia's latest progress in humanoid robotics, some netizens said the Apple Vision Pro has found its coolest use case.



Nvidia begins to lead the next wave: physical AI

Nvidia also detailed the technical process for accelerating humanoid robot development in a blog post. The full text follows:

To accelerate the development of humanoid robots worldwide, NVIDIA announced that it will provide a set of services, models and computing platforms to the world's leading robot manufacturers, AI model developers and software manufacturers to develop, train and build the next generation of humanoid robots.



The suite includes new NVIDIA NIM microservices and frameworks for robot simulation and learning, the NVIDIA OSMO orchestration service for running multi-stage robotics workloads, and AI- and simulation-enabled teleoperation workflows that allow developers to train robots using small amounts of human demonstration data.

“The next wave of AI is robotics, and one of the most exciting developments is humanoid robots,” Huang said. “We are advancing the entire NVIDIA robotics stack and making it accessible to humanoid robot developers and companies around the world, giving them access to the platform, acceleration libraries and AI models that best suit their needs.”



Accelerate Development with NVIDIA NIM and OSMO

NIM microservices provide pre-built containers powered by NVIDIA inference software, enabling developers to reduce deployment time from weeks to minutes.

Two new AI microservices allow roboticists to enhance simulation workflows for generative physical AI in NVIDIA Isaac Sim.

The MimicGen NIM microservice generates synthetic motion data from teleoperated recordings captured on spatial computing devices such as the Apple Vision Pro. The Robocasa NIM microservice generates robot tasks and simulation-ready environments in OpenUSD.
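As a rough illustration of how such a microservice might be invoked, here is a hedged Python sketch that posts a teleoperation recording to a NIM-style HTTP endpoint. The URL, payload schema, and response shape are assumptions made for this example; the actual MimicGen NIM interface is defined by NVIDIA's documentation.

```python
# A hedged sketch of calling a NIM-style microservice over HTTP. The endpoint
# path, port, payload fields, and response keys below are all hypothetical.
import requests

MIMICGEN_URL = "http://localhost:8000/v1/generate"  # hypothetical endpoint

payload = {
    # Hypothetical path to a teleop recording captured on an Apple Vision Pro.
    "recordings": ["demos/visionpro_pick_place_01.json"],
    "num_synthetic_trajectories": 1000,
}

response = requests.post(MIMICGEN_URL, json=payload, timeout=300)
response.raise_for_status()
print(f"Generated {len(response.json()['trajectories'])} synthetic trajectories")
```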

NVIDIA OSMO, a cloud-native managed service, is now available; it lets users orchestrate and scale complex robotics development workflows across distributed computing resources, whether on premises or in the cloud. OSMO greatly simplifies robot training and simulation workflows, cutting deployment and development cycles from months to under a week.
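The toy sketch below illustrates the kind of multi-stage, dependency-aware scheduling an orchestration service performs; the workflow schema and scheduler are invented for this example and do not reflect OSMO's real interface.

```python
# A toy illustration of multi-stage workflow orchestration in the spirit of
# OSMO; the spec format and scheduler here are hypothetical.
workflow = {
    "name": "gr00t-data-pipeline",
    "stages": [
        {"name": "simulate", "resource": "cloud-gpu",   "depends_on": []},
        {"name": "train",    "resource": "dgx-cluster", "depends_on": ["simulate"]},
        {"name": "evaluate", "resource": "on-prem-ovx", "depends_on": ["train"]},
    ],
}

def run(workflow: dict) -> None:
    """Dispatch each stage once all of its dependencies have finished."""
    done: set[str] = set()
    pending = list(workflow["stages"])
    while pending:
        for stage in list(pending):
            if set(stage["depends_on"]) <= done:
                print(f"Running {stage['name']} on {stage['resource']}")
                done.add(stage["name"])
                pending.remove(stage)

run(workflow)
```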

Advanced data capture workflow for humanoid robot developers

Training the foundation models behind humanoid robots requires vast amounts of data. One way to obtain human demonstration data is teleoperation, but this process is increasingly expensive and time-consuming.

The NVIDIA AI and Omniverse Teleoperation Reference Workflow, demonstrated at the SIGGRAPH computer graphics conference, enables researchers and AI developers to generate large amounts of synthetic motion and perception data from very small amounts of remotely captured human demonstrations.



First, developers captured a small number of teleoperated demonstrations using the Apple Vision Pro. They then simulated the recordings in NVIDIA Isaac Sim and used the MimicGen NIM microservice to generate a synthetic dataset from them.

Developers trained the Project GR00T humanoid robot foundation model on both the real and synthetic data, saving substantial time and cost. They then used the Robocasa NIM microservice within Isaac Lab, a robot learning framework, to generate experience for retraining the model. Throughout the workflow, NVIDIA OSMO seamlessly assigned computing tasks to the appropriate resources, saving developers weeks of administrative work.
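Put together, the reference workflow has four stages. The Python sketch below traces the data flow end to end; every function is a hypothetical stand-in for the corresponding NVIDIA tool, not its real API.

```python
# A high-level sketch of the four-stage reference workflow described above:
# teleop capture -> synthetic expansion -> training -> simulated retraining.
# All function names are hypothetical stand-ins. In the real workflow, an
# orchestration service (OSMO) would schedule each stage on appropriate
# compute; here the stages simply run in sequence.

def teleop_capture(device: str, num_demos: int) -> list:
    """Stage 1: record a few human demos, e.g. via Apple Vision Pro."""
    return [f"{device}_demo_{i}" for i in range(num_demos)]

def mimicgen_expand(demos: list, factor: int) -> list:
    """Stage 2: expand the demos into a synthetic dataset in simulation."""
    return [f"{d}_synthetic_{j}" for d in demos for j in range(factor)]

def train_gr00t(real: list, synthetic: list) -> str:
    """Stage 3: train the base model on real plus synthetic data."""
    return f"gr00t_checkpoint({len(real) + len(synthetic)} trajectories)"

def robocasa_retrain(model: str, num_episodes: int) -> str:
    """Stage 4: generate simulated experience and retrain the model."""
    return f"{model} + {num_episodes} sim episodes"

demos = teleop_capture("vision_pro", num_demos=10)
synthetic = mimicgen_expand(demos, factor=1000)
model = train_gr00t(demos, synthetic)
model = robocasa_retrain(model, num_episodes=5000)
print(model)
```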

Expanding Access to NVIDIA Humanoid Robotics Developer Technology

NVIDIA offers three computing platforms to simplify humanoid robot development: the NVIDIA AI supercomputer for training models; NVIDIA Isaac Sim, built on Omniverse, where robots can learn and refine skills in simulated worlds; and the NVIDIA Jetson Thor humanoid robot computer for running the models. Developers can adopt all or some of these platforms according to their specific needs.

The new NVIDIA Humanoid Developer Program gives developers early access to new products and the latest versions of NVIDIA Isaac Sim, NVIDIA Isaac Lab, Jetson Thor and the Project GR00T universal humanoid robot base model.

1x, Boston Dynamics, ByteDance, Field AI, Figure, Fourier, Galbot, LimX Dynamics, Mentee, Neura Robotics, RobotEra, and Skild AI are the first companies to join the early access program.

Developers can join the NVIDIA Humanoid Robotics Developer Program today to gain access to NVIDIA OSMO and Isaac Lab, and will soon gain access to NVIDIA NIM microservices.

Blog Link:

https://nvidianews.nvidia.com/news/nvidia-accelerates-worldwide-humanoid-robotics-development