
"zhihuijun" has not posted for a year: to win in humanoid robots, you have to run a startup like a big company

2024-09-04


the humanoid robot industry, which has not yet figured out many basic issues, will face fierce competition.

by he qianming
edited by cheng manqi

when we met peng zhihui at 3 pm, he had not yet eaten his first meal of the day. one meal a day is peng zhihui's normal state all year round. he feels that he is "used to it, it does not affect his life, and eating is a waste of time."

in february 2023, peng zhihui co-founded zhiyuan robotics and became its cto, with the goal of making commercial humanoid robots. before that, he was better known as a "genius boy" and one of the "top 100 up hosts" (uploaders) on bilibili.

peng zhihui started posting hardcore diy videos under the nickname "zhihuijun" in 2017. his bilibili account now has more than 2.5 million followers. up hosts of that size usually work with a team, but he made his videos alone, in his spare time. many people think that "full-stack engineer" is not enough to describe his abilities, and call him a "stack overflow engineer."

after zhiyuan robotics was founded, peng zhihui put all his focus on the company and has basically never taken a weekend off. his last self-made video on bilibili was posted in april 2023.

he is extremely devoted because he wants to achieve greater success. "our vision is to create unlimited productivity with intelligent machines," said peng zhihui.

the humanoid robot industry has not yet solved many basic problems, but it is already facing fierce competition. since last year, no fewer than 20 humanoid robot companies have emerged in china. zhiyuan robotics, which wants to send robots into factories, will also have to compete with the previous wave of "ai + robot" companies established around 2016, which have served customers longer and have more experience. in the future, the greater threat will be car companies that are eager to move into embodied intelligence. they can mobilize far more resources than ordinary startups, and their own factories are a ready-made application scenario.

zhiyuan robotics believes that the approach most likely to win is to "aim high": to start a business the way one would run a large company and take on a big market.

while most of its peers focused on a single product, zhiyuan robotics produced two generations of humanoid robots in a year and a half and released five humanoid robots at once in august this year. it also provides data collection solutions to its peers. earlier this year, it released a commercial cleaning robot as well.

zhiyuan robotics released five products on august 18.

most humanoid robot companies will focus on a certain technical direction first, some focus on hardware, some focus on software intelligent systems. in terms of form, some focus on the operation ability of the upper body, and some prioritize the movement ability of the lower body.

zhiyuan robotics wants it all: they make complete humanoid robots and also develop branching forms based on different scenarios; they develop their own core components such as joint motors, "dexterous hands", and large multimodal models, while researching current mainstream technologies of embodied intelligence such as reinforcement learning, imitation learning, and visual models.

in february this year, when most new companies were still relying on contract manufacturers to assemble their robots, zhiyuan robotics, then just a year old, began building the first phase of its factory in lingang, shanghai, and planned to start production at the end of the year.

"i don't quite understand it" and "it's not a regular startup" are the comments of several technology investors. they seldom see startups working on so many products at the same time and betting on so many technological directions. if management and subsequent funding can't keep up, problems can easily arise.

peng zhihui said that they did not do this intentionally, but the characteristics of the industry forced them to do so: "the research and development of humanoid robots involves a lot of issues such as hardware, software, algorithms, supply and manufacturing, and system engineering is extremely important. in terms of technical implementation, some are engineering problems, and some are scientific problems. this industry has a large space, high barriers to entry, and fast changes. talent, capital, and commercial results will eventually be concentrated in the top. if you just do small things, it will be difficult to stay ahead."

zhiyuan robotics has attracted the most capital. in the year and a half since its establishment, the company has raised at least 1.5 billion yuan, at a valuation of 7 billion yuan. its investors include more than 20 institutions such as hillhouse, sequoia, gaorong, bluerun, byd, saic, sanhua intelligent control, and lingang new area. zhiyuan also presented awards to some investment institutions at its shareholders' meeting earlier this year.

when filming peng zhihui, documentary director takeuchi ryo was surprised that he was proficient in so many cross-disciplinary skills and asked, "is there anything you don't know?" peng zhihui said, "i can't have children." this time we asked the same question again, and he said, "i will never stop learning."

most of the books peng zhihui keeps at his workstation are textbooks such as "thomas' calculus" and "principles of compilation". the only exception is a biography of elon musk. musk is one of the people he studies.

"learning in school is more bottom-up: you first lay a solid foundation and then build upper-level applications. after leaving school, it is more top-down and project-oriented: you learn whatever you need," peng zhihui said. asked about time management, he described his method as "preemptive scheduling, as in an operating system": he dynamically adjusts task priorities and allows interruptions, but stays focused on whichever task is current.

he doesn't like the title of "genius boy". he thinks that geniuses are the scientists in textbooks who can change the course of human development.

compared with the small projects he used to complete on his own in a few months, helping to build a company and develop successful products is much more difficult. he has had to move from working alone to leading a team, and the things he makes must not only be cool but also make money. this is a new and more complex learning process.

the following is a conversation between latepost and peng zhihui:

sprint patiently and stay at the table

"latepost": back to the starting point, why did you decide to start a business in 2023?

peng zhihui: humanoid robots are not new. i started working on humanoid robots relatively early. i started my own business and made a prototype in 2016 when i was still in school. at that time, “embodied intelligence” had not yet appeared. i mainly worked on the robot hardware and received 5 million yuan in investment.

the popularity of embodied intelligence and humanoid robots in the past two years is not due to some new hardware technology, but mainly to the emergence of ai and large models, which give robots application value beyond the body itself. i was working on ai computing at huawei ascend, so i could see this trend clearly. this is one of the reasons why i started my own business.

latepost: if you had wanted to do this at huawei, you would presumably have had support. why did you choose to start a business?

peng zhihui: it’s a question of style. huawei’s main focus in technology is infrastructure, such as operating systems, chips, databases and other very basic root technologies.

robotics is a very new track, especially ai robots, which are more suitable for teams with faster iterations. entrepreneurship requires not only a good idea or good technology, but more importantly, doing the right thing at the right time.

latepost: is it the right time to start a business in humanoid robots? the hardware, materials, energy and other technologies required are not yet mature, and some are still in the scientific exploration stage. a senior technology investor said that he is not interested in things that will happen in 5-10 years.

peng zhihui: we never expected to accomplish this in the short term. we ask everyone to "sprint patiently": on the one hand, iteration and innovation should be very fast; on the other hand, we should be patient.

there will be various positive and negative feedbacks in the market in the short term, but for a startup, the most important thing is to stay at the table.

latepost: a very important factor in staying at the table is having enough money. zhiyuan’s financing speed and scale are ahead of its peers. how did you do it?

peng zhihui: i think we can look at it from the investors’ perspective and ask why they chose us?

first, we have a very hardcore founding team. second, we have a full-stack technology layout: we do not only build the robot body, nor only embodied ai; we cover the body, embodied intelligence, and data, so our capabilities may be more comprehensive than those of our competitors. third, we have a very pragmatic understanding of the entire track, we have been building up the system capabilities needed for mass production, and the team's execution is extremely strong.

at last year's press conference (august 2023), our entire humanoid r&d team had fewer than 50 people; a year later, it has more than 300. we dare to grow this fast because we have thought clearly about what to do, the technical path at each stage, and the logic of commercialization, which is also what investors value.

latepost: what kind of company does zhiyuan want to be in the long run? what are your benchmarks?

peng zhihui: for example, bell invented the telephone, which changed the way people communicate. humanoid robots may also change social productivity in the future. our company's vision is to "use intelligent machines to create unlimited productivity."

intelligence ultimately determines what robots can do

"latepost": after more than a year of starting a business, which aspects of humanoid robots are more difficult than you expected?

peng zhihui: it is not too difficult to get from 0 to 40, 50 or even 60 points, but it is very difficult to go higher.

the robot body is not mature enough. for example, most robots still walk with their knees bent, and the flexibility of the arms and the dexterity of the hands are still at an early stage. then there is intelligence. everyone is now trying imitation learning, but can it really achieve good results in complex, open environments? in addition, multimodal perception and manipulation that fuse touch, sound and vision are not yet done very well. some of this is an engineering problem, and some is still a scientific problem. the industry has not seen a very mature solution yet.

we are doing preliminary research while implementing the product. we need to keep a steady pace to demonstrate our competitive advantage. full stack is also one of our advantages, so we may be able to go further in the future.

the dexterous hands developed by zhiyuan robotics are still far behind human hands. human hands have 27 degrees of freedom, are flexible in movement, are covered with highly sensitive tactile nerves, and have strong force control capabilities. to make a dexterous hand, flexible, durable, stable and cost-controlled hardware components and materials are only the basis. human hands also have the ability to coordinate hands and brains. for example, people can catch moving objects because the human brain has cognition of physical laws such as gravity and acceleration, and can therefore predict the trajectory. the body has the ability to move, so it can complete this seemingly simple action.

latepost: we understand that zhiyuan has invested more manpower in the intelligence of robots. do you think intelligence is more important than the body?

peng zhihui: the body is definitely the foundation, but we have always felt that the core of (humanoid robots) is not the body itself, but what practical tasks it can accomplish and what valuable applications it can produce on top of the body.

therefore, the value of the body must be realized through embodied intelligence. you can see this in how we named zhiyuan: "zhi" stands for embodied intelligence, and "yuan" is a pictograph of the two legs of the humanoid robot body.

"latepost": a humanoid robot can stand up, walk, and then climb stairs. is this a problem of intelligence?

peng zhihui: it is actually more of an intelligence problem, not just a hardware problem. high-level athletic ability is determined mainly by the brain. in many sports, there is no essential difference between the bodies of ordinary people and athletes. through long-term training, athletes get their cerebrum and cerebellum to work together, which gives them better perception of, and feedback from, the environment.

latepost: there are also differences in the industry on how to obtain training data. for example, some companies believe that remote-controlled robots cannot collect enough data and should use a large amount of simulation data. why does zhiyuan build hundreds of robots specifically for data collection?

peng zhihui: data sources fall into three categories: internet data, simulation data, and data collected on real machines. if we rely only on simulated and generated data, we may run into problems such as hallucinations, much like when large models are trained on chatgpt output, and the sim2real gap. real machine data is therefore indispensable and the most valuable, but its volume will not be as large as that of simulation data; it may account for about 10%.

we also need to consider the cost of data collection. some companies do not have strong hardware capabilities. we have done a good enough job on the robot body, so we are in a position to have hundreds of robots collecting data in the second half of this year. we will also look for scenarios where people can remotely control robots for tasks the robots cannot yet perform autonomously, so that customers pay for this function and our hardware costs are offset.

"latepost": in your opinion, what will a general humanoid robot eventually look like?

peng zhihui: it’s like the kind in science fiction movies. for example, i, robot. science fiction movies raise everyone’s expectations, but we hope to make it in the end.

latepost: so it is a superhuman state, rather than a human-like one.

peng zhihui: it may be difficult to have an intermediate state. once a humanoid robot reaches the level of a normal human, it will soon far surpass humans.

latepost: how far is that future from now?

peng zhihui: i am quite optimistic. there is hope within the next 10 years.

the first scenario where humanoid robots are put to use is "ppt"

latepost: you said this year is the first year of zhiyuan's commercialization. in fact, many people think that humanoid robots are still in the early stages and have just learned to walk, so there is no need to go to work so soon.

peng zhihui: we are a commercial company. commercialization is the most effective way to test our achievements. only by selling our products and getting positive feedback from customers can we attract more talent. no one will join a team that has no prospect of real-world deployment and little hope.

"latepost": you said that the commercialization path for humanoid robots is first the factory and then the family. now that they are in the factory stage, what scenarios can they be used for?

peng zhihui: the fastest to be implemented at present is "ppt", pick, place and transfer.

when dexterous hands become more mature, they can also perform various assembly tasks, because they have various complex perception capabilities such as touch and force. the volume of this scenario will be larger. we have evaluated that ppt may only account for less than 20% of the entire manufacturing industry scenario.

latepost: you said you will sell 300 robots this year. 200 of them will have legs and 100 will move on wheels. more specifically, what jobs will they do?

peng zhihui: the products we mainly sell are the yuanzheng (expedition) series. the lingxi series is not for sale and is fully open source.

in the yuanzheng (expedition) series, the reliability and freedom of movement of the bipedal humanoid robot still need to be upgraded. for now it is better suited to scenarios that showcase human-robot interaction, mainly in the service industry, such as the welcome guide in the car store shown at our press conference. the wheeled one is the a2-w, which does "ppt" and other jobs in the factory.

zhiyuan’s humanoid robot yuanzheng a2 works as a shopping guide in a car store.

latepost: there are also some robot companies in the industry, such as mecha-mind and siling, which have added vision and force control capabilities to robot arms and use large models to do ppt work in factories. moreover, they have been working on stability and cost for longer. what are the advantages of your new solution?

peng zhihui: they are indeed the companies closest to embodied intelligence among robot arm manufacturers. we have evolved from a single arm to dual arms, and we also have a mobile chassis. in some high-level dual-arm planning, we will have some technical differentiation.

for example, in a typical transfer scenario, it is difficult to move a pallet with a single arm, so it is more reasonable to use two arms to do this task. the transition from a single arm to a dual arm sounds simple, but it actually involves complex force control and dual-arm coordination, environmental perception, dynamic obstacle avoidance, trajectory planning, etc. more importantly, we need to make these capabilities into generalizable standard skills to reduce the deployment cost of the robot.

zhiyuan robot a2-w moves pallets in the factory.

"latepost": can the a2-w sold by zhiyuan solve these problems?

peng zhihui: we can solve part of the problem. the key is to find the matching point between customer needs and our existing technologies. our investors are also quite supportive and have indeed provided some very valuable scenarios that can be implemented in the short term.

latepost: you mentioned earlier that you are very concerned about cost. last year you said that the price of humanoid robots should be reduced to less than 200,000. how did you calculate this number?

peng zhihui: it depends on what the customer expects. a worker costs about 100,000 yuan a year, and the robot can work two shifts. at 200,000 yuan, customers can recoup their investment in about one to one and a half years. we have also broken down the cost of parts and components. as long as the volume is large enough, we can indeed reach this level.
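the payback reasoning in this answer can be written out as a short calculation. this is only a sketch: the function name and its parameters are illustrative, the figures are the round numbers quoted in the interview, and it assumes that one robot working two shifts replaces the annual cost of two workers. running costs such as maintenance and electricity are not itemized in the interview and are left out here.

```python
def payback_years(robot_price, worker_cost_per_year, shifts_covered):
    """Years needed for labor savings to cover the robot's purchase price.

    Assumption (not from the interview): each shift the robot covers
    replaces exactly one worker's annual cost, with no running costs.
    """
    annual_saving = worker_cost_per_year * shifts_covered
    return robot_price / annual_saving


# Figures quoted in the interview: 200,000 yuan robot,
# 100,000 yuan per worker per year, two shifts.
two_shift_payback = payback_years(200_000, 100_000, 2)

# For comparison: the same robot covering only a single shift.
one_shift_payback = payback_years(200_000, 100_000, 1)

print(two_shift_payback, one_shift_payback)
```

under these assumptions the two-shift case pays back in one year, the lower end of the range peng zhihui gives; the single-shift case shows how sensitive the estimate is to utilization.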

"latepost": how far is zhiyuan's product from reaching 200,000?

peng zhihui: not far. that is our target cost for batch delivery; otherwise we may lose money.

"latepost": who will be your competitors?

peng zhihui: this track is big enough, and everyone is still in the stage of making the cake bigger. there are no direct competitors yet.

some potential competitors have not yet made their move. for example, new energy vehicle manufacturers have the motivation to do this. if humanoid robots show their value on the production line, they will definitely invest resources to do it. they also have the ability to do this. if you look at the various core components used in the robot body, motors, electronic controls, batteries and other technologies, these are what new energy vehicles are good at.

latepost: the humanoid robots you envision are very hardcore, but the commercial products now seem boring. do you feel conflicted?

peng zhihui: there is no conflict. we are doing r&d in parallel. we have several lines, one is productization and mass production, and the other is exploration and pre-research. we often say internally that we have to eat what is in the bowl, look at what is in the pot, and think about what is in the field.

run a startup like a big company

latepost: you released five humanoid robot products in august, and you also sell joint motors, fixtures, and provide data collection solutions. this year you also released a commercial cleaning robot. are your product lines too broad?

peng zhihui: this is actually one main line of commercial deployment plus many exploratory branch lines. we think these arrangements are still necessary. we call it saturation investment. on the one hand, we have the means to do it; on the other, the direction of the entire track is not yet clear.

of course, we also control the resources that go into the branch lines. for example, the two products from x-lab that we released this time involved fewer than 5 full-time people. everything from the new joints to the structure, software, hardware and algorithms was made by these few people. in x-lab, i pursue output per person.

the commercial cleaning robot is a relatively mature product that can pave the way for our humanoids in areas such as sales channels, manufacturing, and after-sales service. it can also reuse some of the humanoids' capabilities, for example by adding a robotic arm to the cleaning robot so that it can do more things.

latepost: in terms of technology layout, many humanoid robot companies do not do full stack. for example, in terms of hardware, some companies do not study dexterous hands, but use grippers; some companies think legs are unnecessary and only make upper bodies. in terms of embodied intelligent systems, many companies also focus on reinforcement learning and imitation learning.

peng zhihui: you have to look at what each company is good at or wants to do. if we want to use intelligent machines to create unlimited productivity, we have to focus on the full stack. this is a layout based on the competitiveness of the final product.

"latepost": the development pace of your plan is also very fast: one generation of prototypes every half a year, and one generation of products sold every year.

peng zhihui: all innovations must be sustainable, take into account long-term competitiveness, and plan ahead. the speed of innovation is as important as the innovation itself.

latepost: but it is not clear what humanoid robots can be made into. how does zhiyuan determine the iteration direction of each generation of products?

peng zhihui: just like what i just said, think things through clearly, what specific scenarios do we need to do, how do we do them, and what core components and core technologies are needed?

we have researched the needs of many customers, both in the manufacturing and service industries. our shareholders have also provided a lot of input. starting from different scenarios, we have refined the technical paths and found that some are more traditional automation, some are more future agi, and there are some intermediate states. this is the g1-g5 intelligent evolution route we presented at the press conference.

zhiyuan’s current commercial products are at the g2 stage, and it is developing g3 stage systems.

latepost: is it really possible to do so many things at the same time and do them quickly?

peng zhihui: in fact, we made the first generation of humanoid robots within half a year of our establishment. first of all, it relied on talents, and everyone was very capable. then there was also my personal experience accumulation, as well as the development of the entire industry and the large-scale application of ai technology.

compared with previous years, it is easier to do this now. the technology stack of robots is very similar to that of new energy vehicles, and the new energy vehicle industry has driven the upstream manufacturing industry. there are some additional advantages to making general-purpose robots in china.

zhiyuan's first generation of humanoid robot yuanzheng a1. peng zhihui said that considering the scene of standing in front of the operating table, the lower body adopts "ostrich legs" with joints bent backwards.

"latepost": the team's combat effectiveness is very strong, how do you measure it?

peng zhihui: it’s about work efficiency, innovation ability, and level of commitment.

latepost: many companies are highly invested and competitive. how can you be better than your competitors in this regard?

peng zhihui: first of all, when we recruit, we try to select like-minded people. for example, their values must match the company's. i ask what their interests are, and questions such as "if you were financially independent, would you keep doing what you do in your current position?" in fact, the concentration of up hosts in our company is quite high. they are all young people who are passionate about their work and get actively involved.

then give enough trust and allocate sufficient resources; encourage full play of subjective initiative in the process, but raise the standards when accepting the results; when the results are good, give enough incentives.

we have many ways to motivate people, including giving out equity in addition to year-end bonuses. our future goal is to be like huawei, giving all outstanding colleagues equity.

latepost: you also mentioned at the recent press conference that robots are a very complex software-hardware collaborative system. you need to do many things at the same time, how do you make the teams collaborate efficiently?

peng zhihui: system architecture design is a skill we value very much. we believe that headcount growth must be matched by corresponding organizational capabilities; otherwise it may be a disaster. our organizational structure is very flat, with very few levels. above the front-line r&d staff are the r&d managers of the various technical modules, and they report directly to me. everyone can communicate fully and exchange ideas, and we strongly encourage timely feedback and the exposure of problems.

in addition to the regular project development process, we will also promote breakthroughs in some key technologies in the form of project research groups. for core components, we will set up virtual organizations, where members will have more responsibilities and higher requirements, but the incentives after success will also be greater.

latepost: in addition to recruiting its own developers, zhiyuan has also set up a joint laboratory with peking university, and even acquired a motor company, which is something early-stage companies rarely do.

peng zhihui: we are actually experimenting. for example, the motor team we acquired actually only had two people, who were quite outstanding and had done oem for some big brands, and the feedback was good, so we recruited them.

"latepost": multiple product lines, full-stack technology layout, integrated industrial chain, including your matrix organizational structure, it feels like you are starting a business using the operating methods of large companies?

peng zhihui: you can understand it that way. but we are not trying to act like a big company, but because this approach is the most effective in the current market environment. this is a relatively new but rapidly developing field, and we need to quickly capture the market and demonstrate our technical strength and product vision.

large industries and large tracks will eventually converge on leading companies. if you just play small games, it will be difficult to maintain a leading position in a rapidly changing market.

“i’m not a genius”

latepost: some people call you a "stack overflow engineer" who can solder circuit boards, build robots, write code, train ai models and more. you can also play guitar and edit videos. do you have any special learning methods?

peng zhihui: it may be a bit like the way ai learns. in the early stage, it relies on some overfitting, and then uses more training to increase generalization. refer to other people's existing experience more often, and don't ask why first. even if you have to memorize it by rote, write it down first, and you will gradually understand the principles behind it, and then actively innovate. after mastering multiple skills across fields, you will find that there are correlations between them. things in one field may help you explain another field.

time management must be done very well. everyone says i am a master of time management. i like to use a concept borrowed from operating systems, called preemptive scheduling, to dynamically prioritize the things to be done. i don't simply do one thing after another. if something more important comes up, i do it first and then switch back to the previous thing. do only one thing at a time and stay focused during that time, and you may be able to use your time more efficiently.
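for readers unfamiliar with the operating-system concept he borrows, preemptive priority scheduling can be sketched in a few lines. this is an illustrative simulation only, not a description of how peng zhihui actually plans his day: the function name, the task tuples, and the unit time steps are all assumptions made for the example.

```python
import heapq


def run_preemptive(tasks):
    """Simulate preemptive priority scheduling in unit time steps.

    tasks: list of (name, arrival_time, priority, duration), where a
    lower priority number means a more important task.
    Returns the name of the task worked on at each busy time step.
    """
    pending = sorted(tasks, key=lambda t: t[1])  # ordered by arrival time
    ready = []      # heap of [priority, arrival, name, remaining]
    timeline = []
    clock = 0
    i = 0
    while i < len(pending) or ready:
        # Admit every task that has arrived by the current time.
        while i < len(pending) and pending[i][1] <= clock:
            name, arrival, priority, duration = pending[i]
            heapq.heappush(ready, [priority, arrival, name, duration])
            i += 1
        if not ready:
            clock = pending[i][1]  # nothing to do: skip to next arrival
            continue
        # Always work on the most important ready task. A newly arrived,
        # more important task therefore interrupts (preempts) the old one,
        # which resumes once the important task is finished.
        current = ready[0]
        timeline.append(current[2])
        current[3] -= 1
        if current[3] == 0:
            heapq.heappop(ready)
        clock += 1
    return timeline


schedule = run_preemptive([
    ("report", 0, 2, 3),   # arrives first, lower importance
    ("urgent", 1, 1, 2),   # arrives later, higher importance
])
print(schedule)
# ['report', 'urgent', 'urgent', 'report', 'report']
```

the "report" task is interrupted when the more important "urgent" task arrives, and is resumed afterwards; at every step exactly one task is being worked on, which matches the "only one thing at a time" part of his description.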

latepost: your method is quite systematic. i asked wang xingxing of yushu technology a similar question before, and he said, “i don’t believe i can’t solve it even if i don’t sleep for 24 hours.”

peng zhihui: i also get a lot of inspiration before going to bed. i think about things during the day and think about them at night. i have also thought of solutions to some problems in my dreams, which is quite magical. but some things may be difficult, like mathematics, if you don’t know how to do it, you don’t know how to do it.

latepost: you mentioned time management just now. even a genius only has 24 hours in a day. you are investing so much energy in starting a business now. do you have to sacrifice a lot in other aspects?

peng zhihui: yes. nowadays i rarely have weekends, and i haven’t taken any leave since i started the business last year. i also have some small habits: i rarely take naps, and i eat only one meal a day.

latepost: why only eat one meal a day? can you maintain your condition?

peng zhihui: sometimes i am really busy and don’t have much time. i think eating takes up a lot of time. i can eat snacks when i am hungry, and my body won’t have any negative feedback. i think it is not a bad habit as long as we can achieve a balance.

latepost: after more than a year of running the business, what has left the deepest impression on you?

peng zhihui: too many things. for example, our first press conference in august last year pushed us to the limit in every sense. for more than two weeks before the press conference, our r&d colleagues basically slept on the floor at the venue.

for the press conference, we chose the most challenging live broadcast format. after i had been speaking on stage for 20 minutes, i saw the ok gestures from the audience. we had finally solved the last few bugs.

latepost: why couldn't you postpone it?

peng zhihui: it had already been postponed; that date was what we had after postponing twice. we still had to push ourselves.

latepost: you used to work on projects alone, but now you have to lead a team. how do you learn management?

peng zhihui: on the one hand, i learn from experienced seniors. on the other hand, i learn through actual practice. at the same time, i also have some partners who help me with management work.

latepost: do you still read books? what kind of books do you read?

peng zhihui: they are all on my desk, various papers and books. i think the only novel i have read is the three-body problem. i like hard science fiction and am a big fan of big liu. i have read the three-body problem more than 10 times.

"latepost": when you meet liu cixin, what will you say to him?

peng zhihui: give me an autograph.

latepost: there should be a lot of people asking you for autographs now. what are you thinking when you encounter such a situation?

peng zhihui: if i had known this earlier, i would have practiced calligraphy more.

latepost: you have many labels, such as famous up host on bilibili, genius boy, co-founder of a star company, wild iron man, and stack overflow engineer. which label do you prefer?

peng zhihui: personally, i would pick engineer. i think technology is still very interesting. if i become financially independent in the future, i may not necessarily make videos, but i will definitely continue to work on technology.

"latepost": what do you think of the title of "genius boy"?

peng zhihui: i don’t think i’m a genius. there’s no need to call me that. i’m just interested in it and willing to try to make something. i happened to meet a relatively good era, various platforms, and opportunities, and i made some achievements. it’s mostly hard work and luck.

latepost: what kind of person do you think can be called a genius?

peng zhihui: their existence can have a great impact on the development trajectory of the entire world, such as the scientists we see in textbooks.

"latepost": is it considered a genius if one creates a true humanoid robot?

peng zhihui: i will give this title to the robot we built.

title image source: peng zhihui.