
Dialogue with Ideal Intelligent Driving leaders Lang Xianpeng and Jia Peng: How did a poor student hand in his paper early?

2024-08-02




An expensive competition in intelligent driving has begun. It will determine the rankings and the future.

By Cheng Manqi and Dou Yajuan
Edited by Song Wei

Dr. Gu Junli, who worked at Tesla and Xiaopeng, said that China's smart driving research and development progress lags behind Tesla by at least 1.5-2 years. Lang Xianpeng, vice president of Ideal Intelligent Driving, believes that the gap is not that big, and Ideal lags behind in product experience by at most half a year.

Ideal has emphasized that its advantage in smart driving is that it has a large number of cars and a lot of data. He Xiaopeng, founder of Xpeng, said that if someone claims to "have a lot of cars and a lot of data" and therefore be able to do autonomous driving well, "don't believe it, it's absolutely nonsense."

The price war is still burning, and China's new automotive forces have collectively squeezed into a new battlefield - smart driving, which is full of differences, disputes, and competition.

Not all car companies can afford this ticket. Intelligent driving R&D investment starts at 3 billion yuan and rises year by year. Ideal said that renting GPUs now costs 1 billion yuan a year, rising to 1 billion US dollars in the future.

The new forces are this frantic and this unwilling to fall behind because they have seen the great progress of Tesla FSD V12 (a new version of Full Self-Driving that Tesla began pushing on a large scale in January 2024) and the impact of intelligent driving capabilities on consumer decisions. Last September, Huawei announced that by the end of the year it would launch a map-free solution usable nationwide, and Wenjie (AITO) vigorously promoted intelligent driving at the same time. In just one month, Wenjie's monthly sales, which had been hovering in the thousands, exceeded 10,000, and by the end of the year they had reached the 30,000 mark.

Shortly after Huawei announced its radical intelligent driving plan, Ideal held its 2023 Fall Strategy Meeting, making it clear that intelligent driving is its core strategy and it must not lose. CEO Li Xiang said, "We will become the absolute leader in intelligent driving by 2024."

Since then, Ideal has accelerated its iteration and made progress on two fronts. On one hand, it used NPN (Neural Prior Net, an algorithm that uses prior information about some roads and maps to help identify road features and reduce reliance on high-precision maps) and launched NOA in 100 cities at the end of last year. On the other, it started pre-research on map-free NOA in October last year, started a thousand-person internal test four months later, and launched the full version in July this year.

There is no time to catch a breath this summer. The new forces are already in the next battle: end-to-end, a technical term unfamiliar to most consumers, has become the new battleground.

The significance of end-to-end is that it brings intelligent driving research and development into the AI era: it no longer relies on large amounts of manual programming, and as long as more data is used to train the model, the system keeps getting stronger and may outperform human drivers. Musk believes this brings humanity much closer to fully autonomous driving.

Ideal launched a 1,000-person internal test of its new "end-to-end + VLM (Visual Language Large Model)" architecture this week, calling it a more advanced "one model" design and the world's first dual system. One model means that both the perception and decision-making modules of autonomous driving are handled by a single model, with sensor data as input and the driving trajectory as output.



Autonomous driving has three modules: perception, planning and decision-making, and control. It relies on perception to "see", on decision-making to "think" about how to drive, and on the control module to carry out driving behaviors. End-to-end technology spans perception through decision-making, with the entire process implemented by one large model.

Around the same time, NIO announced mass production of end-to-end AEB (automatic emergency braking) in early July; Xiaopeng reiterated this week that it is one of only two car companies in the world to have achieved large-scale mass production of end-to-end, the other of course being Tesla. If suppliers are included, Huawei and Momenta have also put end-to-end systems into vehicles this year.

Ideal only started to develop its own intelligent driving in 2021, two years later than NIO and Xiaopeng. Ideal's current progress is like a poor student who suddenly knows the answers and hands in the paper ahead of time.

At this moment, we talked to Lang Xianpeng, Vice President of Ideal Intelligent Driving, and Jia Peng, Head of Ideal Intelligent Driving Technology R&D. They explained how all this was achieved.

Lang Xianpeng is an intelligent driving chief who likes to name key projects after Greek mythology. He holds a doctorate in pattern recognition and intelligent systems. The battles he has fought at Ideal include "Acropolis", "Iliad" and "Titan". In 2018, Lang Xianpeng joined Ideal from Baidu as director of autonomous driving and was later promoted to vice president.

Jia Peng is a young technical R&D leader who was one of the first people working on intelligent driving at NVIDIA China. He watched the chip giant become the first to propose end-to-end and large models for autonomous driving, but found that only car companies could actually put them into practice.

The companies that have committed themselves to end-to-end development have different roadmaps and progress, but they share one ambition and one technical direction: to ultimately achieve Level 4 autonomous driving.

Today's enthusiasm for smart driving and end-to-end stems both from belief in the technology and from competition for user mindshare and sales rankings.

This is an expensive competition. The cost is not only the huge expense of hiring people, buying GPUs, and training models. Until L4 is truly realized, people are still sitting in the driver's seat, and safety, reliability, and stability are the standards by which users judge today's intelligent driving.

The poor students hand in their papers

"LatePost": Ideal only started to develop its own intelligent driving in 2021, later than Xiaopeng and NIO, and has been catching up ever since. Then this year it switched directly from NPN to map-free NOA and, this week, started the thousand-person end-to-end internal test. Some people ask: how did the latecomer suddenly hand in its paper ahead of time?

Lang Xianpeng: It might be a loser’s counterattack.

We have developed three generations in the past year, from map-based to “prior information” NPN, to map-free. In June this year, we verified the end-to-end architecture and proposed a fast and slow system architecture. The fast system is end-to-end and is capable of quickly processing information for daily driving; the slow system is VLM (visual language model), which is capable of dealing with complex scenarios.

Moreover, our end-to-end is one model: the input is sensor data, the output is the driving trajectory, and everything is implemented by one model with no rules in between. Except for Tesla, other car manufacturers implement end-to-end only in certain stages.

LatePost: Your first key progress - from NPN solution to no-map solution - started verification in October last year, internal testing in February this year, and full release in July. It only took 4 months to complete the switch, which sounds incredible. How did you do it?

Lang Xianpeng: We are more efficient and faster than others. For example, we skip a lot of decision-making steps. From deciding to do something, to settling the plan, to pulling the team together may take only a week. At a traditional car company, it may take 3 months just to start the project.

"LatePost": What did you give up for this?

Lang Xianpeng: Maybe our personal time off. Everyone knows the company's goals, and we have no way out.

Jia Peng: I'm used to it. I left NVIDIA and joined Ideal in 2020. The environment we have been facing is that we are poor students and are scolded by parents every day.

"LatePost": Is this parent Li Xiang?

Jia Peng: It’s the user.

"LatePost": It seems that the direction of your intelligent driving technology route is very clear - it is to learn from Tesla. So how do you learn it specifically?

Lang Xianpeng: People think that technology development takes time, but often what is needed is not time for development but time for trial and error. Tesla is indeed a good benchmark. If its trial and error shows that a path leads nowhere, we will not go down it.

The evolution and iteration of Tesla's FSD showed us that it is possible to succeed without a map. Should we choose NPN or map-free? Since Tesla had done it, we chose map-free, and so we made the switch in a few months.

But the biggest inspiration Tesla gave us is how to go from 0 to 1, and from 1 to 10, in developing autonomous driving. Tesla also started out using its supplier Mobileye's solution for intelligent driving, but soon found that the supplier could not meet its requirements, so it began its own research in 2016, went through a rough patch along the way, and eventually matched Mobileye's performance. In 2019, it developed its own FSD chip, giving it hardware to support its AI research and development. End-to-end emerged after that, and in essence it applies AI capabilities to intelligent driving.

"LatePost": The core of V12 is end-to-end. In fact, the V11 version that Tesla pushed in early 2023 was already map-free. Why didn't you learn from it directly at that time?

Lang Xianpeng: It’s like everyone thinks advanced mathematics is important, but if you don’t know the four arithmetic operations, how can you learn advanced mathematics well?

I have discussed this with Wu Xinzhou (former head of Xiaopeng Intelligent Driving). We both believe that the whole process can be accelerated, but it cannot be skipped. Everyone is doing end-to-end, but from map-based to NPN to map-free to end-to-end, no step can be skipped. If you skip these steps, you also skip a lot of technical understanding.

If we had not attempted to do the 100-city NOA in the second half of last year, we would not have had such a clear understanding that the NPN was not working. Just from the scale, there are only 300,000 to 400,000 kilometers of highways in China, but there are millions of kilometers in cities. If we want to spread it across the country, this map is simply impossible to complete.

"LatePost": But you said before that the big judgment is not a question of whether you can do it, but a question of whether you dare to do it.

Lang Xianpeng: It’s not impossible to do it. If we really want to do it, we will have to fight for resources. Anyway, we will need thousands of people to expand the scale.

Jia Peng: We joked internally that this path would eventually turn us into a mapping company.

"LatePost": So what did you rely on to accelerate later?

Lang Xianpeng: Organizational efficiency has always been Ideal's advantage. From NPN to map-free, and then to end-to-end, these were all big changes, but we made each switch as soon as we decided on it.

The efficiency of cooperation between R&D and delivery is very important. Technology has to push up the upper limit, and the difficulty there lies in making choices; once the choice is made, delivery is responsible for raising the lower limit. At the company's strategy meeting in the second half of last year, Li Xiang clearly proposed that RD (R&D) and PD (mass production delivery) should be done together. Once the R&D direction was clear, our team always ran two lines, PD and RD. In November and December last year, we worked on the map-free version. By January this year, it was almost ready for delivery. RD immediately switched to PD, delivered version 5.1 in February, and kept delivering. Now it is version 5.2, and then Beta 1, Beta 2 and Beta 3 to polish it.

Jia Peng: I think it is rapid trial and error. Our process is: find a closed area, verify the paradigm in a short period of time, and first reach the upper limit of this paradigm. Once one region is running smoothly, we immediately expand outwards, add a safety fallback strategy at the same time, and then slowly roll it out. We test whether the paradigm works across the country, and if it doesn’t, we quickly add data and change the strategy. When it comes to product acceptance, from "bird egg" users to "early bird" users to the thousand-person internal test, we let users work with us to test and iterate the product.

"LatePost": It sounds very risky. How could you be sure that this process would work?

Lang Xianpeng: The risk is huge, but we have always been doing this.

Our first car, Ideal ONE, used Mobileye's intelligent driving solution. Later, when the Ideal ONE was about to be delivered, Mobileye said it would not cooperate and could not deliver a white-box solution. It was already 2021, and we felt that if we did not master assisted driving technology by then, we would be in trouble. So we made a difficult decision - to do it ourselves. If we couldn't do it, it would be because we were not capable enough. But if we gave up then and kept using suppliers, we might have no future.

We were "forced" to find a very different R&D process. We had to deliver in May and produce a prototype in March. On May 25, 2021, the day before the Ideal ONE launch, we still had a lot of bugs to fix, which were finally fixed on the morning of the launch. This is the prototype of our current process: first verify on a small scale, then improve capabilities, fix bugs, and stabilize quality.

At that time, the team had only 100 people, and 40 of them left in the first month. Someone told me, "How can we deliver something in three months when others take one or two years to deliver? Don't fool yourself."

"LatePost": Both are map-free. Last year, Xiaopeng opened cities more slowly than you have this year, and had more testers. Xiaopeng said that every time it opened a new city, it would conduct at least four rounds of field tests to ensure safety and not hand users a blind box. How do you ensure safety with your method of rapid R&D and delivery followed by internal testing, from "bird eggs" up to a thousand people?

Lang Xianpeng: Autonomous driving systems are evaluated very differently from the way they were in the past. In the past, intelligent driving meant designing functions first, then developing them, and then testing the functions one by one to verify them. But today's data-driven autonomous driving is centered on capabilities rather than functions, and "capabilities" can only be evaluated through "exams".

We use the world model + shadow mode for testing. The world model reconstructs and generates real scenes, and the car runs in them, which is equivalent to a mock exam that evaluates ability during the R&D process. After passing the mock exam, we use early bird and internal test vehicles and shadow mode to do real-car tests. If the system fails the test, we continue to iterate until it passes.
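The interview does not describe how the shadow-mode "exam" is scored. As a rough illustration only, the sketch below assumes that shadow mode means the model plans silently while a human drives, and that a logged frame is flagged when the model's planned steering differs from the driver's by more than a threshold. The names `Frame` and `shadow_mode_report`, the 5-degree threshold, and the 95% pass rate are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """One logged moment: what the human did vs. what the model would have done."""
    timestamp_s: float
    human_steering_deg: float
    model_steering_deg: float


def shadow_mode_report(frames, max_diff_deg=5.0, pass_rate=0.95):
    """Compare silent model output against the human driver (hypothetical criterion).

    Returns (agreement_rate, passed, flagged_frames).
    """
    if not frames:
        raise ValueError("need at least one frame")
    flagged = [
        f for f in frames
        if abs(f.human_steering_deg - f.model_steering_deg) > max_diff_deg
    ]
    agreement = 1.0 - len(flagged) / len(frames)
    return agreement, agreement >= pass_rate, flagged


# Made-up log: the third frame is a sharp disagreement with the driver.
log = [
    Frame(0.0, 0.0, 0.5),
    Frame(0.1, 2.0, 1.0),
    Frame(0.2, 10.0, 1.0),
    Frame(0.3, 3.0, 3.5),
]
agreement, passed, flagged = shadow_mode_report(log)
print(f"agreement={agreement:.2f} passed={passed} "
      f"flagged={[f.timestamp_s for f in flagged]}")
```

In this toy log one frame in four is flagged, so the "exam" is failed and the flagged frame would be the one sent back for analysis and another iteration.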

"Late": If the composition topic has an answer, then ideally one can run faster than others, but the answer will not always be there, and most technical questions may be open-ended questions.

Lang Xianpeng: What you see today is that we are catching up relatively quickly in the so-called essay writing, but after we catch up, we may be even faster because the entire system has been built.

Starting self-development in 2021 does not mean we can deliver products that are worse than our peers'. From the first day of delivery, we have had to compare our grades with the best students in the class. This also means that if I learn using the other party's method, I will definitely not learn better than him. So we have to do things in our own way.

On the brink of no man's land

"LatePost": End-to-end is not a new concept. Nvidia and Waymo both proposed end-to-end a few years ago, but why is it Tesla that has accomplished and promoted this?

Jia Peng: Because it not only proposed the technical idea, but also showed everyone how it performs in use.

Lang Xianpeng: Many people at Tesla see because they believe, but more people believe because they see.

"LatePost": If Tesla hadn't led the way, would Ideal have fallen behind even longer?

Lang Xianpeng: We were late in algorithm development because we did not have enough conditions and resources. But we were not late in data accumulation and R&D system construction, so we were able to catch up.

From the beginning, we knew that Tesla's data-driven philosophy was right, so we built our R&D foundation accordingly. On the first-generation Ideal ONE in 2019, we built a closed-loop data system, Poseidon, a tool chain for collecting, mining, labeling, and training on data. We didn't have the resources for in-house development at the time, but we still put an extra camera next to Mobileye's camera to collect data and analyze problems.

For example, if a problem occurs during a road test, the traditional method is for the on-board personnel to write it down and then drive back to the same scene to reproduce it. When we encounter a problem, the data is synchronized to the back end. Before the test is over, the data has been analyzed and work on the problem may even have begun. What takes traditional companies several days or even a week, we can probably complete in one hour.

In terms of data accumulation, the total mileage of Ideal users using autonomous driving has exceeded 2 billion kilometers, of which nearly 1 billion kilometers were driven with NOA. Tesla started earlier, has more vehicles, and has accumulated more mileage.

"LatePost": Is this more of Li Xiang's persistence or yours?

Lang Xianpeng: We agree on this. When I came to Ideal for an interview in 2018, Li Xiang asked me what the main problem to solve was in order to finally achieve L4. I said it is data - without a closed-loop data system, analysis is inefficient, whether of samples or of problems. People can be recruited and algorithms can be developed, but if the data problem is not solved, it will definitely not be done well.

LatePost: NIO has just started mass-producing end-to-end AEB; Xiaopeng said at a press conference this week that it is one of only two car companies in the world to have self-developed and mass-produced end-to-end technology, the other undoubtedly being Tesla. What are the differences between the end-to-end systems of each company?

Jia Peng: The architecture of Xiaopeng 5.2 is similar to the map-free model that we just launched in July. Perception is a model, decision-making is a model, and they are connected in the middle. They have just completed this. ADS 3.0 that Huawei has released is also segmented end-to-end.

Tesla uses one model from perception to decision-making. Our latest version also integrates perception to decision-making into one model, and has started testing with thousands of people this week.

LatePost: What is the difference between segmented end-to-end and a one model that integrates perception and decision-making? Is one ahead of the other?

Lang Xianpeng: It depends on what the goal is. The segmented type is more suitable for L2+ assisted driving, and the one model can truly achieve L3 and L4 autonomous driving.

Although segmented end-to-end replaces some rules with data-driven decision modules, there are still rules in the entire process. It is essentially similar to the previous intelligent driving architecture and R&D process, and is still divided into modules. However, the one model does not contain any rules. Sensor data comes in and the planned trajectory comes out. It is purely data-driven.
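The structural difference described above can be sketched as two data flows. This is a minimal illustration, not either company's implementation: the function names, the toy stand-ins, and the sensor dictionary are all hypothetical, and the "models" here are plain functions standing in for neural networks.

```python
from typing import Callable, List, Tuple

Trajectory = List[Tuple[float, float]]  # (x, y) waypoints in metres


def segmented_pipeline(sensors: dict, perceive: Callable, rules: Callable,
                       plan: Callable) -> Trajectory:
    """Segmented: two learned models with hand-written rules still in the loop."""
    objects = perceive(sensors)        # learned perception model
    constraints = rules(objects)       # hand-written logic between the modules
    return plan(objects, constraints)  # learned decision/planning model


def one_model_pipeline(sensors: dict, model: Callable) -> Trajectory:
    """One model: sensor data in, planned trajectory out, nothing in between."""
    return model(sensors)


# Toy stand-ins so the sketch runs; real systems would use trained networks.
def toy_perceive(sensors):
    return [d for d in sensors["obstacle_distances_m"] if d < 50.0]


def toy_rules(objects):
    return {"max_speed_mps": 5.0 if objects else 15.0}


def toy_plan(objects, constraints):
    v = constraints["max_speed_mps"]
    return [(v * t, 0.0) for t in (1.0, 2.0, 3.0)]


def toy_one_model(sensors):
    # Placeholder for a single trained network; the behavior is hard-coded here.
    near = min(sensors["obstacle_distances_m"], default=100.0) < 50.0
    v = 5.0 if near else 15.0
    return [(v * t, 0.0) for t in (1.0, 2.0, 3.0)]


sensors = {"obstacle_distances_m": [30.0, 80.0]}
segmented = segmented_pipeline(sensors, toy_perceive, toy_rules, toy_plan)
single = one_model_pipeline(sensors, toy_one_model)
print(segmented)
print(single)
```

The point of the sketch is the interface, not the behavior: in the segmented flow, an intermediate object list and a rules step are exposed between the models, while the one-model flow has no intermediate hand-off that rules could attach to.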

"LatePost": Can you explain in one sentence what the greatest value of end-to-end is?

Jia Peng: From the user's perspective, the driving behavior is more human-like, and the detailed control is smoother. From the R&D perspective, the iteration efficiency is higher.

Lang Xianpeng: End-to-end is the first time that pure data has been used to drive autonomous driving. The research and development method has changed from starting from functions and scenarios to improving system capabilities. We have truly entered the era of artificial intelligence: as long as the system continues to become stronger, it will perform beyond expectations.

"LatePost": How to train a smarter model in a shorter time?

Jia Peng: Data, especially high-quality data, is very important. We selected the best data from 20 billion kilometers of data from 800,000 car owners and have trained with more than 1 million kilometers of it, a figure that will exceed 5 million kilometers by the end of the year.

The second is the training method. On the basis of imitation learning, we added reinforcement learning to let the model know what is wrong.

Lang Xianpeng: Finally, it’s about computing power. Ideal has GPUs with computing power equivalent to 5,000 A100s and A800s. Renting GPUs at that scale costs 1 billion a year, which takes healthy profits to support.

"LatePost": You have repeatedly emphasized that you can catch up because you have data, but this week He Xiaopeng said that if someone claims to "have a lot of cars and a lot of data" and therefore be able to do autonomous driving, "don't believe it, it's absolutely nonsense."

Lang Xianpeng: We also hope that everyone can treat the product objectively. But we are still in the era of Edison and Tesla arguing over whether direct current or alternating current is better. One used alternating current for electrocutions, and the other demonstrated that alternating current could pass through the human body safely.

"LatePost": Tesla has the most data and the largest investment in computing power. Does this mean it cannot be surpassed?

Jia Peng: Tesla’s current limitation is hardware, because the computing power of HW 3.0 (Tesla’s third-generation smart driving hardware) is 144 TOPS, and the model parameters that can be supported are not particularly large. If too much data is added, “catastrophic forgetting” will occur. This is why after the V12.4 update, some scenes have improved, but some have deteriorated, such as random lane changes in open scenes.

"LatePost": But from another perspective, the fact that FSD can run smoothly on HW 3.0, which was installed in cars in 2018, shows that Tesla has a strong ability to combine software and hardware.

Jia Peng: It is indeed very strong. But I think there are challenges for FSD to enter China. First, most roads in the United States are relatively simple; second, Tesla can obtain road topology information in the United States, which is not available in China. So FSD actually still uses light map information, while ours is truly map-free, without any prior map information.

LatePost: In July this year, Dr. Gu Junli, who worked for Tesla and Xiaopeng, said, "Tesla's R&D progress is 1.5-2 years ahead of domestic smart driving." Do you agree?

Lang Xianpeng: I don’t quite agree.

The map-free version represents the upper limit of rules. The end-to-end version represents the upper limit of data-driven. There are no rules in the middle, just a model. However, map-free and end-to-end cannot achieve autonomous driving, because they are still solving long-tail problems and cannot handle situations that have never been encountered. To reach L4, the system must learn to handle unknown scenarios. We believe that this capability must be solved by VLM, not end-to-end.

Therefore, our new architecture is end-to-end + VLM. The former corresponds to the fast-thinking system 1, which handles most driving scenarios that require quick responses; the latter is the slow-thinking and long-term decision-making system 2, which can learn some common sense to deal with unknown situations, such as identifying unconventional traffic lights that have never been seen, various forms of tidal lane signs, features around schools, etc., and tell the car in advance not to enter or to slow down.

Ideal is the first to build this System 1 + System 2 architecture.
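The interview says System 1 handles quick-response driving while System 2 reasons more slowly and tells the car in advance to slow down or not to enter. How the two outputs are actually combined is not stated. The sketch below is one hypothetical arrangement: the fast system runs every tick, the slow system runs only every few ticks, and its latest advice (here, just a speed cap) is reused in between. `DualSystemDriver`, `toy_fast`, `toy_slow`, and all numbers are illustrative assumptions.

```python
class DualSystemDriver:
    """Fast planner every tick; slow advisor only every `slow_every` ticks."""

    def __init__(self, fast_system, slow_system, slow_every=3):
        self.fast_system = fast_system
        self.slow_system = slow_system
        self.slow_every = slow_every
        self.tick = 0
        self.advice = {"speed_cap_mps": None}  # latest advice, reused between slow runs

    def step(self, observation):
        # The slow system is too slow to run on every frame, so it refreshes
        # its advice only periodically.
        if self.tick % self.slow_every == 0:
            self.advice = self.slow_system(observation)
        self.tick += 1
        speed = self.fast_system(observation)
        cap = self.advice["speed_cap_mps"]
        return speed if cap is None else min(speed, cap)


def toy_fast(observation):
    # Stand-in for the end-to-end model: just follow the lane speed.
    return observation["lane_speed_mps"]


def toy_slow(observation):
    # Stand-in for the VLM: recognise a school zone and advise slowing down.
    cap = 8.0 if observation.get("scene") == "school_zone" else None
    return {"speed_cap_mps": cap}


driver = DualSystemDriver(toy_fast, toy_slow, slow_every=3)
observations = [
    {"lane_speed_mps": 15.0, "scene": "school_zone"},
    {"lane_speed_mps": 15.0, "scene": "school_zone"},
    {"lane_speed_mps": 15.0, "scene": "open_road"},
    {"lane_speed_mps": 15.0, "scene": "open_road"},
]
speeds = [driver.step(o) for o in observations]
print(speeds)
```

Note that the cap persists for one tick after the scene changes, because the slow system has not yet run again. That lag is the cost of a low update rate and is why the slow system's frequency matters.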

Jia Peng: Judging from public information, Tesla’s current technical architecture does not have VLM.

"LatePost": Lingo-2, released in April this year by Wayve, a British autonomous driving company invested by Nvidia and SoftBank, also added a large language model on the vehicle side. Were you inspired by Wayve?

Lang Xianpeng: It does not have System 1. Wayve's Lingo-2 and its cloud model are both multimodal large language models, similar to VLM. Its idea is to use one model to cover System 1 plus System 2. But when it comes to mass production, you find that Orin's computing power is limited and cannot support System 2's large model. Wayve can do it because its car is not mass-produced and carries a server in the back to run Lingo-2.

Jia Peng: We were first inspired by Google's robot systems RT-1 and RT-2, which are VLA (Vision-Language-Action) models in which the final behavior is also output by the model. That may be the endgame: if my hardware is good enough, I can theoretically run VLA in real time.

"LatePost": So the inspiration does not come from the automotive industry, but from robots?

Lang Xianpeng: Because we regard autonomous driving as a typical application of artificial intelligence. This dual-system solution actually proposes a universal embodied intelligence architecture, which is autonomous driving on the car and intelligent robot on the robot.

LatePost: The "end-to-end + VLM" architecture you proposed was inspired by Tesla and Google RT, and the VLM paper was co-authored with Tsinghua University. Does this mean that you are more accustomed to combination innovation at this stage?

Lang Xianpeng: Working with Professor Zhao Xing of Tsinghua University was a collision of ideas; it was not a case of him proposing ideas and us implementing them.

LatePost: You regard autonomous driving as part of general embodied intelligence. Does it also have scaling laws, and do you believe in scaling laws?

Lang Xianpeng: End-to-end scaling laws will not be particularly pronounced because the parameter count is limited. The model may be saturated by tens of millions of data samples, and adding more data beyond that causes it to start forgetting. We can already see this phenomenon in Tesla FSD V12.4.

But VLM's Scaling Laws definitely exist, and it can scale to tens or even hundreds of billions of parameters. As long as there is enough data and the parameters are large enough, the performance will increase. This path is very attractive to us.

"LatePost": If VLM can run fast enough in the car and the latency is low enough, then is System 1 no longer needed?

Jia Peng: In theory, yes. Now our VLM can achieve 3.4 Hz on the car (Note: Hz is the number of periodic events per unit time. The larger the value, the smaller the delay). It is a 2.2B (2.2 billion) parameter model, but to be able to replace end-to-end, it needs to run at more than ten Hz, corresponding to a delay of 100-200 milliseconds, which is the reaction speed of humans. Some scenarios have even higher requirements for delay, such as AEB (emergency braking).
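The relationship between update rate and delay quoted above is simply period = 1 / frequency. The short sketch below works through the numbers, under the simplifying assumption that the output period is the only source of delay (sensor and actuation latency are ignored); `period_ms` is an illustrative helper name.

```python
def period_ms(rate_hz: float) -> float:
    """Time between consecutive outputs, in milliseconds, for a given update rate."""
    if rate_hz <= 0:
        raise ValueError("rate must be positive")
    return 1000.0 / rate_hz


for rate in (3.4, 5.0, 10.0):
    print(f"{rate:>4} Hz -> {period_ms(rate):.0f} ms per output")
```

At 3.4 Hz the model produces an output roughly every 294 ms. A 100-200 ms delay corresponds to 10 Hz down to 5 Hz, so the quoted VLM rate would need to rise by roughly 1.5x to 3x to fall inside that window.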

LatePost: How unique is this architecture? Huawei is also talking about System 1 and 2; Xiaopeng's "Big Language Model XBrain" also handles unknown scenarios. Is it similar to the System 2 you mentioned?

Lang Xianpeng: We are the first in the industry to propose a dual system; and our VLM is deployed on the mass-produced vehicle-side chip Orin X. Other companies' previous similar attempts were on industrial computers.

Both the end-to-end one model and the VLM in this architecture have been delivered and are in the thousand-person internal test.

"LatePost": You also mentioned that you are working on a world model in the cloud. What role does this play in the entire architecture?

Jia Peng: This is our System 3. The cloud world model does two things. First, the VLM can be distilled from it: you first train a very large model in the cloud, such as Meta's recently released 400B-parameter Llama 3.1, and then distill an 8B model from it, which works better than training an 8B model from scratch.
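The interview does not say which distillation method is used. For readers unfamiliar with the idea, the sketch below shows the standard soft-label formulation: the small "student" model is trained to match the large "teacher" model's temperature-softened output distribution, measured by KL divergence. The logits and the temperature of 2.0 are made-up values, and the function names are illustrative.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return kl * temperature ** 2


teacher = [4.0, 1.0, 0.0]
near_student = [3.5, 1.2, 0.1]  # already close to the teacher
far_student = [0.0, 1.0, 4.0]   # ranks the classes the wrong way round

loss_near = distillation_loss(teacher, near_student)
loss_far = distillation_loss(teacher, far_student)
print(f"near={loss_near:.4f} far={loss_far:.4f}")
```

Minimizing this loss pulls the student toward the teacher's full output distribution rather than only its top answer, which is one common explanation for why a distilled small model can outperform the same model trained from scratch.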

Second, the world model can test the capabilities of System 1 and System 2. While working on end-to-end and map-free, we found that nationwide verification is very difficult. There are 10 million kilometers of roads, and previously all we could do was deploy people to test them.

LatePost: Tesla is also researching world models. But does the industry need so many world models? After all, we only have one world.

Lang Xianpeng: In the process from 0 to 1, there will be many routes and attempts, just like we don’t need so many electric car brands, but there are hundreds of them at the peak.

LatePost: Previously, the industry believed that the ranking of China's intelligent driving was Huawei, Momenta, Xiaopeng, and Ideal. When will this ranking be rewritten? And what will be the next turning point to change the ranking of intelligent driving?

Lang Xianpeng: It has been rewritten. In the future, each team will reach the no-man's land: map-free will make it possible to drive across the country, end-to-end will make it possible to drive across the country well, and the next step will be L4.

How to mass produce L4? At the beginning, there will definitely be a hundred flowers blooming, and then convergence. But everyone will not return to the same starting line, because the gap in data and computing power will only get bigger and bigger.

Review of Ideal Intelligent Driving’s six key battles

"LatePost": I heard that you are good at naming battles.

Lang Xianpeng: We take naming seriously.

The intelligent driving team has fought six key battles. The first was the Acropolis Project, followed by the Iliad Project and the Odyssey Project, named after the first and second of Homer's epics, and then the Titan Project and the Golden Apple Project. In the Titan War, the new gods defeated the old gods. Last is the Damocles Project, which is the end-to-end project. It is challenging and dangerous: if it is not done well, the sword of Damocles will fall.

"LatePost": What are the biggest challenges and gains encountered in each battle?

Lang Xianpeng:

  • The Acropolis Project is our first self-developed project - delivering basic functions such as AEB, ACC adaptive cruise control, and lane keeping on the Ideal ONE released in May 2021. These technologies are very mature, but we were only given 90 days, so we had to fight for strong execution. From that day on, we thought about how to catch up with others quickly.
  • In 2022, we started the Iliad Project - delivering the Orin X project on the L9 model. The algorithm previously used on the Horizon J3 is no longer applicable, and we have to redevelop the system on Orin. The epidemic also hit, and the chip supply was cut off. Bosch could not provide enough angular millimeter-wave radar chips. We had to make a choice to remove the angular millimeter-wave radar and use a pure vision solution for blind spot detection, obstacle avoidance and other functions. In the end, it took three months to deliver the solution, which was several months earlier than the time when our competitors delivered Orin.
  • At the same time as Iliad, Jia Peng was responsible for developing the Pro platform based on Horizon J5, which is the Odyssey plan. The biggest challenge was the small number of people. At that time, the entire team had only 500 people. In 2021, Xiaopeng and Weilai both had thousands of people, and Huawei claimed to have more than 2,000 people at the time.
  • By 2023, our Orin platform was relatively stable and we had caught up on hardware. We believed the next battle would be urban NOA, and that only those who won it would qualify for the first echelon. This was the Titan Project.
  • The Golden Apple Project is the hundred-city NOA plan we announced at the 2023 Shanghai Auto Show. The name also comes from Greek mythology: Hercules went in search of the golden apple, but it was guarded by a hundred giant dragons. To get the golden apple, we had to chop off the dragons' heads one by one and conquer the hundred cities one by one.
  • The Damocles Project is the end-to-end project that began development this year. The name means that if it is not done well, the sword of Damocles will fall.

"LatePost": Other companies have not yet removed the four corner millimeter-wave radars. Did you consider the impact on system safety of removing them?

Lang Xianpeng: We removed the corner millimeter-wave radars for two reasons. The first was to ensure delivery. At that time, Bosch's corner radar chips were out of supply, so we had to choose: either replace the radars with vision, or fail to deliver. The second was a technical choice. At that time, Tesla was moving toward a pure vision solution, and vision is closer to how humans perceive their surroundings. If a car carries both corner millimeter-wave radars and visual sensors, then whenever the two disagree, hand-written rules have to arbitrate between them, and errors are inevitable.

An additional benefit is the reduction in technology costs, saving approximately RMB 500 million.

However, replacing the corner millimeter-wave radars with several cameras is technically difficult and risky. We did a lot of testing, and in the end the accuracy and success rate were slightly higher than with corner radar.

"LatePost": You mentioned the problem of insufficient resources earlier. Has it been solved now?

Lang Xianpeng: We proposed the "three major strategies" at the autumn strategy meeting last September, and the first of them is the intelligent driving strategy. So we started hiring heavily in the second half of the year. The company's requirements and expectations have also risen. Whether it is the hundred-city rollout or anything else, we need to catch up with the top echelon.

"LatePost": So intelligent driving was not a core strategy for Ideal before?

Lang Xianpeng: This time it has been officially clarified.

"LatePost": Is this because you realized that intelligent driving's impact on sales was widening the gap with Huawei?

Jia Peng: Yes. That is why the autumn 2023 strategy set the goal that Ideal become the absolute leader in intelligent driving this year: we judged that car buyers across the industry would come to put intelligent driving first.

"LatePost": After six battles, what have you accumulated?

Lang Xianpeng: If you want to win, you have to think in terms of how to win. That is, start from the end goal, identify what is truly necessary, and think clearly about what must be done to solve a problem. Removing the corner radars and dropping NPN are examples.

"LatePost": Isn't Ideal driven by competition? Take last year's hundred-city race, for example.

Lang Xianpeng: Last year, after Huawei announced that it would launch a nationwide version of ADS (Huawei's NOA solution), we overemphasized competition and benchmarking against Huawei's indicators, such as takeover rate, and neglected user experience. This was also a point of criticism at this year's spring strategy meeting.

Later we reflected that all product acceptance and delivery should be based on user evaluation.

"LatePost": How do you design intelligent driving R&D and product organization to cope with today's high-intensity competition?

Lang Xianpeng: Our intelligent driving organization has both a horizontal and a vertical dimension. I am responsible for the vertical business department, which does R&D and delivery. But the organization, execution, and operation of the final product, including external competitive benchmarking and R&D resource investment, are all managed by the intelligent driving PDT (Product Development Team, a cross-functional product development team).

I will participate in the formulation of some talent strategies and plans, and once the plans are finalized, we will implement them firmly.

"LatePost": Last fall, Ideal hired a large number of people, and the intelligent driving team expanded from more than 700 to more than 1,000. In May this year, it laid off two to three hundred people, and in June it recalled some key employees. What does it mean that it went from hiring to laying off and then to recalling in such a short period?

Lang Xianpeng: The essence is the iteration of technology. In the past, the intelligent driving system contained a lot of rules, which required manual programming, progress management, and testing. But the end-to-end system is more of an AI model, so those positions have been greatly reduced. Later, a few people were recalled; most of the adjustments were based on business needs. In fact, Tesla's intelligent driving team has always been 200 to 300 people, and it has delivered the world's largest autonomous driving fleet.

"LatePost": Tesla's end-to-end concept was first proposed by Dhaval Shroff, an Indian engineer, and was adopted from the bottom up. Does Ideal's R&D organization have the soil for bottom-up innovation?

Lang Xianpeng: Actually, the idea for VLM came from our pre-research team. We had not planned such a dual system early on.

"LatePost": How do you evaluate your talent pool? Xpeng had Wu Xinzhou, and NIO had Ren Shaoqing. Some people think that the Ideal intelligent driving team has always lacked a technical expert of that kind.

Lang Xianpeng: At this level, both technical ability and the ability to produce results are important. Many of our technical leaders, including me, Jia Peng, and Wang Jiajia, have been working on autonomous driving since 2014 and 2015. Our new hires are also quite strong. Most of the more than 200 graduates who joined this year are from the top 50 of the QS 100 (the UK-based QS World University Rankings). In addition, we have computing power and data reserves, which are the soil in which talent grows.

"LatePost": Although you entered the field of intelligent driving very early, you initially worked on map-related algorithms at Baidu, not intelligent driving itself.

Lang Xianpeng: The experience at Baidu was very important. It left me unafraid of anything in management. I believe that if I find the right method, I can achieve better results in a shorter time.

My first project at Baidu was similar to Ideal's first self-developed project: both had very tight deadlines. I joined Baidu at the end of April 2013, and four months later we had to launch the Street View project at the Baidu Conference. The team had only four people at the beginning, and we finally completed the launch at midnight the day before the conference.

There were two key points. One was using new technology. When producing street view imagery, license plates and faces need to be blurred. The conventional method at that time was to do it manually, but we used vision algorithms, which were faster, more accurate, and saved a lot of manpower. The other was data. We originally wanted to cooperate on this algorithm with the team of Yu Kai (who later founded Horizon Robotics) and Ni Kai (who later founded HoloMatic) from Baidu IDL, but their product had an accuracy of only 86% in this scenario. Later, we achieved 99% for license plates and 97% for faces. The key was that we labeled tens of thousands of data samples.

We were definitely not as good as them at algorithms; they were among the best algorithm people in the world. But that gap was only the difference between scoring 80 and scoring 90, while in scene data we had an order of magnitude more. So later, in my interview, when Li Xiang asked me what the most important problem in solving autonomous driving is, I said it is data.

"LatePost": In the past few years, many people chose to resign because they could not withstand the pressure or did not believe that Ideal could succeed. Why did you stay in the end?

Lang Xianpeng: Our group of people just want to make L4 a success, and I think this can only be achieved at Ideal.

Jia Peng: I worked at NVIDIA for five years before I came to Ideal. NVIDIA was the first to work on end-to-end and large models, but they were not deployed at that time. After I came to a car company, I finally had the opportunity to close the loop on autonomous driving, which was very exciting.

Source of title image: "Bad Genius"