
After the success of Luobo Kuaipao, the AI smart-driving era holds opportunities worth hundreds of billions for startups

2024-08-09


Source: Visual China

Special author: Xu Zhen, an investor of Xiangfeng Evergreen

Editor: Xue Fang

Produced by: Deepnet·Tencent News Xiaoman Studio

It seemed like overnight, driverless cars suddenly became popular.

In July, Baidu's "Luobo Kuaipao" (Apollo Go) driverless taxis were deployed at scale in Wuhan, making news almost every day.

According to reports, at peak a single Luobo Kuaipao vehicle now completes more than 20 rides a day, close to the average daily volume of a taxi, while offering a better, more comfortable passenger experience. According to insurance data, Luobo Kuaipao's accident rate is only 1/14 that of human drivers, and the fleet has safely driven more than 100 million kilometers without a single major casualty.

On the other side of the ocean, Musk also announced that Tesla would unveil its first driverless taxi, the Robotaxi (an event later postponed to October), saying that owners could put their cars into rental service in their spare time and recover the purchase cost within two years.

Coincidentally, on July 23, Google also announced a new round of investment of US$5 billion in Waymo to "maintain Waymo's position as the world's leading autonomous driving company." It should be noted that Microsoft only spent US$1 billion to invest in OpenAI that year, and US$5 billion is close to the total amount of Waymo's previous financing.

In fact, the explosion of driverless cars today is no accident. From the perspective of the industry, the past three years have also been the fastest three years for the development of autonomous driving technology, which can almost be said to be "advancing by leaps and bounds".

Judging from the current progress of China and the United States, if we put aside the constraints of legislative standards and time, driverless cars will most likely be put into full commercial use within the next 1-2 years, and reach an end-to-end "complete form" in 5-10 years, realizing a driverless form similar to that in science fiction movies.

From this perspective, although it may be too early to talk about the "iPhone 4 moment" of driverless cars, the singularity of the era of driverless cars has indeed arrived.

At the same time, this is also a race against time: China and the United States are the two countries with the largest number of autonomous driving companies in the world. Whoever can be the first to run through and establish the relevant industrial chain will have the ability to define the track and export technology products to other countries.

Against the backdrop of Sino-US technological competition, this is a "battle that cannot be lost" for either side.

In this article, we will start with the history of autonomous driving and try to speculate on the future development trend of the industry. First, we will share a few preliminary conclusions:

1. From high-precision maps and lidar to BEV: all paths in the evolution of autonomous driving are aimed at "making cars behave more like humans."

2. The past three years have also been the three years with the fastest development of autonomous driving technology. A large number of players at home and abroad have achieved end-to-end to varying degrees. Looking to the future, it is only a matter of time before the ultimate form of overall end-to-end is achieved.

3. Whether on the pure-vision route or the lidar route, millimeter-wave radar is the best means of covering each route's technical shortcomings. As millimeter-wave radar evolves from 3D to 4D and on to imaging radar, its steadily improving accuracy gives this track the chance to produce high-value companies.

4. The implementation and commercialization of autonomous driving may just be the beginning. In the future, more tracks and products (such as robots for different scenarios) may replicate similar stories after accumulating enough data.

5. Throughout history, every time a technology route is iterated, there is an opportunity for a group of startups to rise, and I believe this time is no exception.

Why is it said that the past three years have been the fastest three years for the development of autonomous driving?

The generally recognized origin of autonomous driving is 2004, when the United States was mired in the wars in Afghanistan and Iraq and urgently needed a batch of military unmanned vehicles to reduce casualties among U.S. troops, but the research and development progress was always unsatisfactory.

So Tony Tether, then director of the Defense Advanced Research Projects Agency (DARPA), had a sudden idea and set up a challenge, announcing that whoever could drive an unmanned vehicle from Los Angeles to Las Vegas within 10 hours would win a $1 million prize.

This was, in effect, an open call from the US military to the public. The organizers expected few entrants, but more than 100 teams signed up. Unfortunately, in the end no one took home the prize money: the farthest any car traveled was about 12 kilometers, roughly 5% of the course.


(Photo: In addition to cars, there were also self-driving motorcycles participating in the competition...)

But DARPA did not give up. It went on to hold further challenges in 2005 and 2007, attracting research teams from countless universities and companies. This was also what led Larry Page, one of Google's founders, to see the potential of autonomous driving.

In 2009, with Page's backing, Google's self-driving project "Chauffeur" was officially launched. The two core engineers it recruited, Anthony Levandowski and Sebastian Thrun, were both DARPA challenge veterans; they later became the founding leaders of Google's self-driving department.

In 2014, Google unveiled Firefly, the world's first fully autonomous car with no steering wheel or accelerator pedal. It not only shocked the automotive industry but also made the world see, for the first time, that autonomous driving was possible.


(Picture: Firefly, the first generation of self-driving car developed by Google)

Soon after, venture capital began pouring into the driverless car market: Uber, Nvidia, Amazon, Baidu, Didi, and Huawei, along with Mercedes-Benz, BMW, General Motors, Honda, and China's new carmakers, all began investing in autonomous driving R&D. Most of the autonomous driving companies we know today were founded in that period.

Time quickly came to two years later. 2016 was an extremely important year for the autonomous driving industry - because this year, Tesla officially joined the battle.

It is no exaggeration to say that, with 2016 as the dividing line, the second half of the autonomous driving story is one of Tesla fighting its way up from industry follower to leader. So personally, I think no amount of emphasis on this moment is excessive.

Before 2016, almost all autonomous driving companies chose Google's technical solutions:

1) Positioning: assisted by high-precision maps;

2) Perception: Use lidar + visual camera to provide perception information for the vehicle;

3) Planning and control: make decisions and control the vehicle using rule-based algorithms.

It is not hard to see that the underlying logic of Google's solution is "stacking armor": something is better than nothing, and more is better than less.

This school of thought holds that, at the current level of technology, no single sensor can handle every function autonomous driving requires or cover every corner case. The system should therefore draw on all sensors to make its decisions, assigning each a different weight. Since lidar carries the highest weight, this school's approach is also called the "lidar solution"; in essence, it is a hardware-centric technical route.
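As a rough illustration of this weighting philosophy, here is a minimal sketch, with the sensor weights and positions invented for illustration (this is not Google's actual code): detections of the same object are fused as a weighted average, with lidar dominating.

```python
import numpy as np

# Hypothetical per-sensor confidence weights; lidar carries the most.
SENSOR_WEIGHTS = {"lidar": 0.6, "camera": 0.3, "radar": 0.1}

def fuse_position(detections: dict) -> np.ndarray:
    """Weighted average of one object's position as seen by each sensor.

    detections maps sensor name -> estimated (x, y, z) position;
    a sensor that missed the object is simply absent from the dict.
    """
    total_w = sum(SENSOR_WEIGHTS[s] for s in detections)
    fused = sum(SENSOR_WEIGHTS[s] * pos for s, pos in detections.items())
    return fused / total_w

# Example: lidar and camera both see an obstacle, radar misses it.
obstacle = fuse_position({
    "lidar": np.array([12.1, -0.4, 0.8]),
    "camera": np.array([12.6, -0.3, 0.9]),
})
print(obstacle)  # pulled strongly toward the lidar estimate
```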

As a technology company with AI in its DNA, Tesla paid more attention to software. Musk felt the lidar of the time was far too expensive, at about $80,000 a unit, so he put his energy into powerful vision algorithms and dedicated AI chips, trying to model surrounding objects from camera input and feed the data into neural networks for computation. This gradually developed into the "pure vision solution," a software-centric technical path.

In Musk’s own words, “Humans and animals have eyes, not radars,” and the gears of fate began to turn.

At the time, the two routes each had their advantages. Lidar has high hardware costs, but it is inherently accurate and directly captures depth (to this day, many engineers have a soft spot for it). Cameras are cheap, but their resolving power is limited: they only see a 2D projection of one face of an object, and depth recovered through calculation and conversion is never as accurate or direct as a first-hand measurement.

This is also why Tesla built its own labeling team of thousands of people, and why many questioned Musk at the time: your hardware may be cheaper, but once labeling is counted, the overall cost is no better than lidar's.

In fact, this shows that the AI content of autonomous driving back then was almost nil (apart from some AI algorithms around lidar). It was a typical case of "as much intelligence as there is manual labor." Had AI development stopped there, Tesla's pure vision solution would most likely have hit a bottleneck.

But in the end, it was Google that saved Tesla:

In 2017, Google published the famous Transformer paper (which also became the basis for today's large models).

Transformer is a neural network architecture built on the attention mechanism, and Tesla's engineers soon discovered that it can process not only language but also images.

In short, Transformer can integrate the information collected by Tesla's eight cameras into a single shared coordinate system. This is equivalent to giving the car a "God's-eye view" (i.e. BEV, "bird's eye view") that sees the 3D structure of the surrounding environment.


(Photo: BEV technology gives Tesla a "God's perspective")
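To make the BEV mechanism concrete, here is a toy sketch, with tensor shapes, grid size, and a generic cross-attention layer all invented for illustration; it shows the idea of BEV queries pulling multi-camera features into one top-down grid, not Tesla's implementation:

```python
import torch
import torch.nn as nn

B, N_CAM, N_TOK, D = 1, 8, 196, 256   # batch, cameras, tokens per camera, channels
BEV_H = BEV_W = 50                    # 50x50 bird's-eye-view grid (assumed)

class ToyBEVFusion(nn.Module):
    def __init__(self):
        super().__init__()
        # One learnable query per BEV grid cell.
        self.bev_queries = nn.Parameter(torch.randn(BEV_H * BEV_W, D))
        self.cross_attn = nn.MultiheadAttention(D, num_heads=8, batch_first=True)

    def forward(self, cam_feats: torch.Tensor) -> torch.Tensor:
        # cam_feats: (B, N_CAM, N_TOK, D) image features from all 8 cameras.
        kv = cam_feats.flatten(1, 2)                    # (B, 8*196, D)
        q = self.bev_queries.expand(cam_feats.shape[0], -1, -1)
        bev, _ = self.cross_attn(q, kv, kv)             # queries attend over every camera
        return bev.reshape(-1, BEV_H, BEV_W, D)         # one unified top-down map

bev_map = ToyBEVFusion()(torch.randn(B, N_CAM, N_TOK, D))
print(bev_map.shape)  # torch.Size([1, 50, 50, 256])
```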

At AI Day 2021, Tesla released and successfully popularized BEV. From the second half of 2022 onward, domestic players announced they would follow suit and gradually began shipping BEV in production vehicles.

(So although many of these technologies did not originate at Tesla, its ability to engineer frontier theory into practical, shipping products is absolutely in a league of its own.)

At the same time, Tesla has also begun trying to automate the labeling process as much as possible to solve the problem of high costs.

The principle is a bit like a guessing game: when an onboard camera sees an object (say, a tree), it uploads the information to a large model in the cloud, and the model "guesses" what the object is. If the guess agrees with what the eight cameras captured, the tree gets labeled automatically.
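A hypothetical sketch of that guessing-game loop; the Detection class, IoU matching, and 0.7 threshold are all invented here for illustration, since Tesla has not published its pipeline in this form:

```python
from dataclasses import dataclass

@dataclass
class Detection:
    category: str
    box: tuple  # (x, y, w, h) in a shared frame

def iou(a: tuple, b: tuple) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def auto_label(cloud_guesses, camera_detections, thresh=0.7):
    """Keep a cloud model's guess only when the onboard cameras agree."""
    auto, review = [], []
    for g in cloud_guesses:
        agreed = any(d.category == g.category and iou(d.box, g.box) >= thresh
                     for d in camera_detections)
        (auto if agreed else review).append(g)   # disagreements go to humans
    return auto, review

# Example: the cloud model guesses "tree"; a camera detection confirms it.
guess = Detection("tree", (10, 10, 4, 8))
cams = [Detection("tree", (10.5, 10, 4, 8))]
print(auto_label([guess], cams))  # the guess lands in the auto-labeled set
```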

In this way, Tesla freed itself from the limits of human labor: as long as its vehicles keep sending road data back, it can keep training its own algorithms indefinitely.

So far, Tesla's performance was impressive enough, but just one year later, Musk unveiled two more big moves at AI Day: the first was the introduction of temporal (time-series) modeling, and the second was the occupancy network.

In simple terms, the former gives the car a memory over time, while the latter achieves an effect similar to a "pseudo lidar": the car can compute an object's position in space and decide whether to avoid it without ever identifying what the object is. This solved the earlier problem of "crashing into a big white truck."

At this point, judged purely on results, the vision solution achieved essentially the same effect as the radar solution. BEV+Transformer brought the perception technology routes of autonomous driving to convergence and formed the basic framework of today's visual perception algorithms.


(Figure: Occupancy network achieves a "pseudo lidar"-like effect)
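A toy sketch of the occupancy idea, with the voxel size, grid shape, and 0.5 probability threshold all assumed for illustration: the planner only asks "is this cell occupied?", never "what is it?".

```python
import numpy as np

VOXEL = 0.5  # meters per voxel (assumed resolution)

def needs_avoidance(occupancy: np.ndarray, path_m: list) -> bool:
    """occupancy: (X, Y, Z) grid of occupancy probabilities from the network.
    path_m: future (x, y, z) waypoints in meters, in the ego frame.
    Returns True if any waypoint falls inside an occupied voxel."""
    for x, y, z in path_m:
        i, j, k = int(x / VOXEL), int(y / VOXEL), int(z / VOXEL)
        if occupancy[i, j, k] > 0.5:   # occupied, whatever the object is
            return True
    return False

grid = np.zeros((200, 200, 16))
grid[60, 100, 2] = 0.9                  # something 30 m ahead, never classified
path = [(30.0, 50.0, 1.0)]              # a waypoint passing through that voxel
print(needs_avoidance(grid, path))      # True: avoid it without knowing "what"
```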

So in the big picture, the evolutionary path of autonomous driving is actually very clear: everything is aimed at making the car behave more like a human, letting the AI predict what will happen next from what happened before.

From this perspective, the past three years have actually been the three years with the fastest development of autonomous driving technology, and AI has gradually demonstrated powerful upgrade and iteration capabilities in autonomous driving; but most ordinary people at the time did not have a deep understanding of this and thought that there had been no progress in autonomous driving.

What really made the public realize the revolutionary impact of AI on the autonomous driving industry were several things that happened in 2023:

Musk mentioned for the first time on social media that "v12 is reserved for when FSD is end-to-end AI", which made end-to-end AI appear in the public eye (NVIDIA had proposed it in 2016 but it had little influence), and countless people began to look forward to the release of v12;

For the first time in nearly a decade, CVPR awarded the best paper to a Chinese team, commending its contribution to achieving end-to-end autonomous driving;

Musk did a 45-minute live broadcast in a Model S equipped with the v12 test version, and only intervened once during the whole process. The effect can be said to be very good.

That said, it should be made clear that there is currently no direct evidence that the v12 Tesla demonstrated is fully end-to-end. It is simply that the demonstrated results are genuinely impressive, with a level of intelligence roughly comparable to an experienced human driver.

In fact, if we divide an autonomous driving system in the traditional way into perception, planning, and control, what major OEMs currently showcase is mainly modular end-to-end: partial AI plus basic rule-based constraints.

Since the interfaces between modules must be defined by hand, some information is lost at each handoff; the more modules, the more is lost. Going forward, manufacturers will have to keep working to unify all the modules into a single model.
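The contrast below is purely illustrative (not any OEM's real code): in the modular pipeline, perception compresses the scene into a few hand-picked fields and everything those fields cannot express is lost at the interface, while the end-to-end model carries raw input straight through to control outputs.

```python
import torch
import torch.nn as nn

# Modular route: a hand-defined interface between perception and planning.
def perception(frame: torch.Tensor) -> dict:
    # In reality computed from the frame; hard-coded here for illustration.
    # Whatever these two fields cannot express is discarded at this handoff.
    return {"lead_distance_m": 23.5, "lead_speed_mps": 8.2}

def planning(obstacles: dict) -> str:
    return "brake" if obstacles["lead_distance_m"] < 30 else "cruise"

# End-to-end route: one network from pixels to control, no lossy interface.
end_to_end = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 64 * 64, 128), nn.ReLU(),
    nn.Linear(128, 2),            # (steering, braking), learned jointly
)

frame = torch.randn(1, 3, 64, 64)
print(planning(perception(frame)))   # decision made through the lossy interface
print(end_to_end(frame))             # raw control signals, nothing hand-defined
```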

However, from the information we have gathered, and at the current pace of technology and engineering, fully end-to-end autonomous driving is only a matter of time. My personal, relatively conservative estimate is that it will take roughly another 3-5 years of accumulation.

By then, the vehicle will be able to make driving decisions in a "black box" state based on the road information collected in real time, and directly output control signals such as braking and steering, realizing the ultimate form of unmanned driving similar to that in science fiction movies.


(Figure: Five stages of development of autonomous driving; source: Xiangfeng Evergreen)

Venture capital opportunities brought by driverless iteration

Why is driverless technology taking off now? In essence, the "leapfrog" development of driverless cars we have witnessed is just one concrete manifestation of the paradigm shift from model-based to learning-based approaches amid this wave of AI enthusiasm. But why did it break through first?

The reasons are very complicated, and here I will only discuss three key factors that I think are important:

First, the data is relatively abundant.

As we all know, "travel" is a high-frequency demand. The domestic autonomous driving industry began to rise around 2015, almost 10 years ago. Whatever the quality of the data OEMs and intelligent driving companies have collected over that period, it is at least abundant in total volume, which created the basic conditions for later closing the data loop.

Second, the functional definition is relatively clear and unambiguous.

Frankly speaking, although the concept of this wave of AI is very popular, the development direction of many products is actually unclear.

For example, in the case of humanoid robots, many companies have only made a sample that can be demonstrated at an exhibition. However, when put into actual industrial scenarios, the upstream technology end is actually unclear about what problems these robots can solve and to what extent.

Downstream factories do not understand AI and often do not know what functions these robots can deliver or how they might combine with other technologies. It takes both sides a long time just to align their understanding.

However, this problem does not exist for driverless cars. The vehicle only needs to move forward, backward, turn, accelerate, and brake to cover the basic behaviors of all driving scenarios. It is enough for AI to do these things well. The requirements are simple and clear.

Therefore, in order to transform advanced technology into easy-to-use products, a clear functional definition and standard division is also an essential factor.

Third, the hardware foundation is relatively mature.

Whether it is sensing hardware such as lidar, cameras, and millimeter-wave radar, or the various chips that handle signal transmission and processing, after 10 years of fierce "involution" they have basically reached the stage of high cost-performance plus stable supply.

Therefore, in the spiral upward process of data, hardware, and technology, autonomous driving was the first to reach the balance point between price and experience, and quickly formed new productivity.

Therefore, if we follow this logic one step further, autonomous driving is likely just the beginning, and similar stories may be repeated in more tracks in the future.

Of course, the prerequisite is that some companies will emerge in this field that can continuously collect large amounts of data at low cost (similar to companies like Didi, Baidu, and Tesla in autonomous driving), rather than relying solely on some scattered small data. This may be the basis for us to judge whether a similar turning point has occurred in a certain industry.

Put another way, with the rise of this wave of large models, many sub-sectors that AI can automate but that have not yet produced large companies (such as the low-altitude economy and industrial manufacturing) may hold systematic investment opportunities. They could produce listed technology companies worth tens or even hundreds of billions, which makes them worthy of investors' attention.

What else is there to invest in within autonomous driving?

Having digressed a little, let us pull our thoughts back to autonomous driving itself. Throughout its history, every iteration of the autonomous driving technology route has given a group of startups the opportunity to rise.

For example, the mainstream solution of the Google era was lidar + camera + high-precision map, yet today high-precision maps are used less and less in passenger cars and are barely mentioned anymore (remember, they were still standard just two years ago).

That is because, at the time, a single map-collection vehicle cost over a million yuan, and guaranteeing anything close to real-time freshness would have required at least several hundred vehicles on the road every day to collect and update maps nationwide, an expense no car company or map company could sustain.

As a result, the industry opportunity finally fell on LiDAR:

First came the rise of hardware makers such as Hesai and RoboSense (Sagitar). Later, a group of companies building algorithms around lidar emerged, such as WeRide, Pony.ai, and Vimo. Soon after, entrepreneurs began trying small closed-loop L4 applications based on radar plus closed scenes: driverless mining trucks, driverless port vehicles, hotel delivery robots, and so on appeared, and the whole autonomous driving ecosystem slowly began to flourish.

However, Tesla once again upended the industry with BEV+Transformer, proving that 3D spatial judgment can be achieved even without radar. Lidar thus became like the high-precision map before it: no longer a must-have.

Of course, today the price of lidar has dropped to the thousand-yuan level, and it is likely to continue to fall. Therefore, in the short term, some car companies will still adopt multi-sensor fusion solutions.

However, based on the logic of cost reduction, at a time when OEMs are trying to squeeze every penny, LiDAR will inevitably be replaced by pure vision solutions in the future, and the entire industry ecosystem will inevitably change accordingly.

For example, traditional millimeter-wave radar captures only planar information, while the direction of next-generation 4D millimeter-wave radar, which also measures height, has already emerged. Hints we have seen include, but are not limited to: at the product level, chip companies' multi-chip cascading and SoC solutions paving the way for high-precision, low-cost products; at the demand level, OEMs eager for and already trialing domestically produced, independently controllable solutions; and the rising safety and redundancy requirements of high-end intelligent driving products.

In short, driven by various demands, it is likely that some new players will emerge in this vertical category.
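One small illustration of what the extra dimension buys (the fields and thresholds below are invented): a traditional 3D return carries only range, azimuth, and velocity, so an overhead gantry and a stalled truck can look alike, while the elevation angle a 4D radar adds lets the car tell them apart.

```python
import math
from dataclasses import dataclass

@dataclass
class Radar3D:
    range_m: float
    azimuth_deg: float
    velocity_mps: float   # no height: a sign gantry and a stalled car can alias

@dataclass
class Radar4D(Radar3D):
    elevation_deg: float  # the extra dimension 4D radar adds

def is_overhead(ret: Radar4D, sensor_height_m: float = 1.5) -> bool:
    """Crude check: does the return sit well above the roadway?"""
    height = sensor_height_m + ret.range_m * math.sin(math.radians(ret.elevation_deg))
    return height > 4.0  # e.g. a gantry; safe to drive under (assumed cutoff)

print(is_overhead(Radar4D(60.0, 0.0, -15.0, 3.0)))  # about 4.6 m up -> True
```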

For example, as cameras collect ever more and ever higher-quality information, demand grows for supercomputing centers and high-compute edge processing chips. That includes next-generation edge chips, which are not just about stacking compute but about co-designing the chip with the algorithm architecture so the two fit each other better. These are the new changes on the hardware side.

On the softer side, we will focus on low-cost, high-quality and sustainable data acquisition/production methods.

On the end-to-end route, it is already an established fact that learning-based algorithms run on data as fuel. Tesla's v12, for example, was disclosed to have "used 10,000 H100s and trained on about 10 million video clips."

"These videos come from 160 billion frames of video collected every day from 2 million real cars around the world that can collect data, and less than 1% of them are usable, such as some strange and unusually busy intersection data."

As the first mover, Musk has already given the industry a good enough answer. So whether to obtain high-quality data through shadow mode, simulation engines, or world models looks like the next question on which the industry must build consensus.

However, I do not think this can be decided by any single company at one link of the chain: the technology route certainly matters, but industry positioning and business model may be the forward planning Chinese companies must do in the current environment.

In short, as high-end intelligent driving is gradually upgraded and implemented, new industry opportunities will definitely emerge in perception, transmission, decision-making, execution, and interaction.

For example, in recent conversations with founders in the automotive industry, I found that beyond their main businesses, everyone is also watching cross-domain technologies and products. Many unexpected, cross-industry advanced technologies end up becoming their potential allies and partners.

This shows that once involution within an industry reaches a certain point, when simply scaling up and squeezing the supply chain can no longer deliver adequate margins, companies must look to new technologies for a breakthrough. Ending involution will ultimately depend on technological progress to break the siege.

Due to limited space, a simple table is compiled at the end of the article to try to break down the entrepreneurial opportunities in the era of AI intelligent driving. I hope it can give you a different perspective.


(Figure: Industry changes and opportunities brought by different stages of autonomous driving; data source: Xiangfeng Evergreen)

(The author of this article, Xu Zhen, is an investor of Xiangfeng Evergreen. He focuses on the new energy vehicle industry chain, including core components, semiconductors, materials, etc. He has represented Xiangfeng in investing in Huashen Ruili. He graduated from Zhejiang University Zhu Kezhen College and Warwick Business School.)