
"Embodied Intelligence" swept the World Robot Conference. Professor Wang Tianmiao of Beihang University: Most of them are actually "embodied skills"

2024-08-26


Humanoid robots still face two major difficulties. The first is on the "soft" side: the general-purpose large models and vertical, domain-specific models suited to robots are still being worked out. The second is the dexterous hand, which still has technical and cost hurdles to overcome.
The cost of humanoid robots does not yet meet customers' requirements, and mass production also involves supply chain issues. At this stage, humanoid robots mainly serve as scientific research platforms on which various applications are built, including open hardware; this is still a long way from what we call software development and application.
When "embodied intelligence" became the focus of discussion at the 2024 World Robot Conference, Wang Tianmiao, professor, doctoral supervisor and honorary director of the Robotics Institute of Beihang University, offered a different perspective. He believes that most of the exhibits at the conference showed embodied skills in vertical fields, and that for embodied intelligent robots or humanoid robots the industry is, at this stage, still largely exploring general-purpose use.
Embodied intelligence emphasizes that robots, through comprehensive perception, reasoning and autonomous decision-making, can handle multiple tasks and human-machine interaction in complex environments and have broad cognitive capabilities. Embodied skills focus more on specialized capabilities in specific scenarios: they aim to complete specific tasks efficiently, are more vertical in application and are easier to commercialize. The industry view is that embodied intelligence pursues "broad and comprehensive" intelligence, while embodied skills pursue "specialized and refined" capabilities.
Wang Tianmiao, currently the honorary director of the Robotics Institute of Beihang University and the dean of Zhongguancun Zhiyou Research Institute, has been deeply involved in the robotics industry for more than 30 years. In 2020, Wang Tianmiao and 15 other scientists jointly established the "Zhiyou Scientist Fund", focusing on the fields of embodied intelligence, robots, and upstream core components of robots.
Wang Tianmiao, Honorary Director of the Robotics Institute of Beihang University and Director of Zhongguancun Zhiyou Research Institute
This conference featured more humanoid robots than any before it. The 27 humanoid robots on display took different forms, from flexible, coordinated operation of "arms" and "fingers", to walking on complex terrain with "feet", to a "brain" built on a large artificial intelligence model. Wang Tianmiao believes that this "hundred flowers blooming" is normal in the early stages of a technology's development. At present, humanoid robots still face two major difficulties. The first is on the "soft" side: the general-purpose large models and vertical, domain-specific models suited to robots are still being worked out. The second is the dexterous hand, which still has technical and cost hurdles to overcome.
A humanoid robot was displayed at the World Robot Conference, attracting the attention of the crowd.
On August 23, Wang Tianmiao was interviewed by media including The Paper (www.thepaper.cn) on the topic of the difficulties and challenges currently facing embodied intelligence.
The following is the content of the interview with Wang Tianmiao by The Paper:
The robot's "soft" side and dexterous hands are the difficulties that need to be tackled
The Paper: During the conference, embodied intelligence became a hot topic in the industry. What do you think of this wave of embodied intelligence craze?
Wang Tianmiao: At present, I think there are three questions we need to consider regarding the development of humanoid robots or embodied intelligent robots:
First, in what scenarios will robots be used? In the next three to five years, robots are likely to enter complex environments, where their safety and functionality will be tested and verified by customers. This is a very important issue at present. In dangerous environments, and in industrial, household and even some broadly commercial scenarios, the technology is constantly iterating.
Second, driven by applications, two things deserve special attention. One is the robot's "soft" side, its software, which is generated and learned from large models and data. The most important thing about this wave of embodied intelligence is empowering robots with large models, thereby achieving human-computer interaction and breaking complex tasks down into subtasks. The other is connecting those subtasks to real physical space, which requires visual models and tactile models. Without vision, there is no spatial reasoning, and without touch, it is difficult to complete fine assembly or even basic manipulation.
Third, in addition to being able to walk steadily and safely, it is also important to have a pair of dexterous hands. Figure AI’s new products also focus on dexterous hands, and Tesla will also involve dexterous hands when updating its applications.
These three questions may be the three major focuses and hot topics of our embodied intelligence research.
At present, on the "soft" side, general-purpose large models and vertical, domain-specific models are still at a critical stage of development. In addition, for general-purpose robots, whether wheeled or legged, the ultimate means of operation and interaction is the hand, so the dexterous hand is a difficulty that humanoid robots currently need to overcome in terms of both technology and cost.
Most of the exhibits at this exhibition are actually about embodied skills in vertical fields. This is my own rough understanding. At this stage, the industry is still largely exploring general-purpose use.
Another humanoid robot demonstrated its dexterous hands
The Paper: From the perspective of humanoid robot development, what technical problems can large models focus on solving?
Wang Tianmiao: The core contribution of large models to humanoid robots is to achieve human-like interaction, reasoning, and environmental adaptability. However, there are still theoretical and technical challenges. Human cognition is layered, including conceptual and logical cognition, perception (vision and touch), and limb coordination, and the relationship between these levels has not yet been fully worked out. The choice of algorithm for large models, whether supervised learning, reinforcement learning, end-to-end learning, or simulation learning, is also still being explored. There are problems, too, in generating data for training large models, especially in acquiring data from actual operation.
Large models are expected to play a role in general robots and specific operations, but in reality many tasks still require specialization and precision. We hope to cultivate "all-round" robots through large models, but this is an ideal and needs further exploration. Ultimately, it involves the combination of scientific research and application scenarios, as well as the balance of function, safety and cost.
At present, the mass production of humanoid robots is mainly aimed at scientific research platforms
The Paper: Why must we make humanoid robots? In industrial scenarios, special-purpose robots can also be used. Do humanoid robots and special-purpose robots substitute for one another?
Wang Tianmiao: From the perspective of the development stage of the technology and the industry, humanoid robots plus large models may form a new category of robots. First, the especially important application scenarios for humanoid robots are complex spaces, small-batch, multi-variety production where large-scale automation is hard to achieve, and particularly dangerous environments. Second, because it is a new species, it will involve many new structures, such as joints that integrate motor drive and perception, sensors, and data generation and services, which may give rise to new application scenarios and industries. Third, using humanoid robots as a starting point may push the theory, technology and products of robotics to a new stage.
There are currently two different views on the proportion of humanoid robots in the field of intelligent robots in the next 20 years. One optimistic view is that the market share of humanoid robots will exceed 50% or 60%, while another group of industry observers believe that humanoid robots may only occupy 20% or 30% of the market share. This is because they only solve part of the demand, while other types of robots, such as arm-type, crawler-type, wheeled, as well as collaborative and parallel robots, will meet diverse needs.
I personally think that the final form of humanoid robots depends first on the level of innovation in the underlying technology and second on the specific application scenarios and customer needs, that is, whether customers are willing to pay for the service costs and product functions. Therefore, we should not assert categorically that humanoid robots will or will not succeed.
The Paper: This year, relatively cheaper humanoid robots priced below 100,000 yuan have appeared. Does this mean the eve of mass production of humanoid robots?
Wang Tianmiao: At this stage, whether the price is 150,000 yuan, 100,000 yuan or less, these robots are mainly for display as scientific research platforms. At present, customers' cost requirements for humanoid robots have not yet formed a closed loop, and mass production still involves supply chain issues. For now, humanoid robots mainly serve as scientific research platforms on which various applications are built, including open hardware; this is still a long way from what we call software development and application.
The Paper: The humanoid robots of each company are different now. For example, some have three fingers, some have five fingers, some have legs, and some may not have legs at all. Will humanoid robots have a unified shape in the future?
Wang Tianmiao: When any disruptive technology emerges, everyone has high hopes for it, so robots appear in all kinds of forms; some can even turn their heads 180 degrees, and their waists and even hands can rotate freely. In the early 1970s, Japan had nearly 200 companies trying various robot applications, which have since developed into today's jointed and parallel structures. I think the current state is very normal, and the technology should continue to develop. But in the future, a number of standardized categories will definitely emerge, because these categories will be the best in terms of efficiency, operating time, cost and so on, and the supply chain will gradually take shape.
However, this will take time, perhaps 10 years from now. I think the development of humanoid robots is the only way for general artificial intelligence to become a reality and make contact with the physical world. The development of any science and technology takes a great deal of time and money. Whether it is automobiles, mobile phones, or robots, each stage of research and development requires 10 to 20 years of iteration.
Many people are too optimistic and eager about the disruptive development in the future, and often exaggerate. But the reality is not like this. In the end, it depends on whether the technology is really needed, whether the function is complete, whether it involves social security, whether the cost is acceptable, and whether the industrialization standards are sound. This is a series of comprehensive considerations.
Many people hope that technology can make breakthroughs quickly, as if disruptive progress will be achieved the next day or the next year, but this is not realistic.
For specialized and innovative small businesses, it is recommended to start with embodied skills
The Paper: You have a background in scientific research and academia, and you also do research on the industry. What problems do you think need to be solved in the development of the robotics industry?
Wang Tianmiao: Generally speaking, scientists should engage more in basic research or general theory, such as focusing on general-purpose large models. As for embodied intelligence or embodied skills, the industry should verify them from the perspectives of application areas, the corresponding supply chains, safety in use, effectiveness, and cost.
But there is now a notable phenomenon in scientific and technological innovation and industrial development: basic research, industrial research and engineering research are becoming ever more closely connected. Not only is the time cycle shortening, but the three also promote and inspire one another, drive each other's application, and have become inseparable. In this process, universities and enterprises have begun conducting basic research jointly, and industry and business have begun conducting applied technology research together with universities. Innovation and industry can no longer develop in separate stages as we used to imagine.
The Paper: For startups, is it better to find relatively practical model methods for robots based on specific scenarios, or should they tackle a general large model with a relatively complete size?
Wang Tianmiao: If you are a small, specialized, innovative company, I suggest you focus on a specific application and start with embodied skills. This is more likely to win recognition from customers, and to attract training data and financial support from large companies. Companies with substantial financing or even industry resources behind them may take a more general-purpose path. Even so, that path must eventually be put into practice in real applications.
For start-ups, the potential application needs of new species forms are worth exploring regardless of scale; secondly, we must pay attention to breakthroughs in upstream core components, including the functions of limb sensors, brain (embodied intelligence) and cerebellum (embodied skills).
In addition, can humanoid robots generalize their skills by combining large models and perception models? For example, could they work without explicit programming in scenarios such as loading and unloading, polishing, carrying and cleaning? In the future, we hope to use large models to decompose complex tasks automatically, so that tasks no longer need to be programmed. This would open up huge room for application scenarios.
The Paper reporter Yu Yan
(This article is from The Paper. For more original information, please download the "The Paper" APP)