2024-08-15
Compared with Robotaxi, ensuring the stability of the human-vehicle co-driving system is a more important issue today.
Text | Wang Hailu
The autonomous driving industry has long been split between two routes. Autonomous driving companies, represented by Google's Waymo, and car companies, represented by Tesla, are climbing the same mountain from its southern and northern slopes. The summit is driverless operation, and the most attractive commercial scenario is the Robotaxi (self-driving taxi).
Although both sales and stock prices show that Tesla is climbing the mountain faster, when founder Musk proposed to launch Tesla Robotaxi in October this year, it still caused a lot of controversy.
Hou Cong, co-founder and president of Qingzhou Zhihang, is one of the skeptics. "I have never understood why people are still willing to believe Musk, when the Robotaxi story has been told for so many years and has never been realized," Hou Cong said.
Although Tesla's FSD V12 intelligent driving system based on end-to-end technology performs well, Hou Cong believes that it is still far from being a real Robotaxi. He casually cited a situation that the system cannot handle: "What if the car is stuck in the middle of the road? Do you ask the owner to take a taxi to rescue the car?"
Urban roads are complex, with pedestrians and vehicles mixed together, and they are full of negotiation scenarios, where road users contest right of way, that today's artificial intelligence cannot cope with. Even Waymo's heavily equipped Robotaxi fleet has not made it out of its demonstration zones. Hou Cong believes that Tesla can never meet the requirements of a Robotaxi by relying on cameras alone.
The four co-founders of Qingzhou Zhihang all came from Waymo. Hou Cong joined Google right after graduation, first working on compilers, and later transferred to the self-driving car project team of the Google X lab to work on perception systems.
Yu Qian, co-founder and CEO of Qingzhou Zhihang, was also in the perception group at the time, working in the same department as Hou Cong. The two are also both Tsinghua University alumni.
Hou Cong majored in automation in his undergraduate studies and switched to computer science in his graduate studies. But before he got his degree, he went to Georgia Institute of Technology in the United States to study for a doctorate. He described himself as not very interested in pure academic research, but rather in technical engineering practice, so he went to Google after graduating with his doctorate.
In 2016, Google spun off its driverless car project and established Waymo, which is directly controlled by Google's parent company Alphabet. Hou Cong and Yu Qian became the first batch of Waymo engineers.
Hou Cong was responsible for performance optimization, which brought him into contact with many departments. That is how he met Da Fang and Wang Kun, who were working on planning and simulation at Waymo at the time. In 2019, the four Chinese engineers co-founded Qingzhou Zhihang.
By then the autonomous driving industry was no longer at its hottest. China's Pony.ai had been established for 3 years, and TuSimple for 5. Companies that had chosen Robotaxi or driverless heavy trucks as their deployment scenarios had already absorbed most of the available funding. Qingzhou chose a path with fewer competitors and lower capital requirements: driverless minibuses.
Hou Cong said he did not build a Robotaxi at the time because he thought Robotaxi would take too long to commercialize. Waymo started operating a Robotaxi fleet in Phoenix, Arizona in 2016, and to this day it still seems to be stuck on countless corner cases.
Low labor costs and more complicated roads make Robotaxi even harder to operate in China. Hou Cong tried Baidu's Carrot Run (Luobo Kuaipao, marketed in English as Apollo Go) in June this year and found it average. He said bluntly: "Waymo would never go into operation in the state Carrot Run is in now."
Qingzhou was developing driverless minibuses, and initially thought that this scenario would be easier to implement than Robotaxi. However, after operating pilot projects in 10 cities including Suzhou, Beijing, and Wuhan, they found that they had oversimplified the problem. It was difficult to make money from buses, and regulations were slow to be implemented.
In its early days, Qingzhou drew on the founding team's background and technical track record to win investment from top-tier investors such as IDG, Lenovo Capital, Meituan Dragon Ball Capital, and Yunfeng Fund. In 2021, however, investors began to focus on the ability to generate revenue, and financing became increasingly difficult for L4 autonomous driving companies.
In sharp contrast, Tesla's stock price soared again in 2021. The penetration rate of new energy vehicles in China rose rapidly, and smart driving functions became standard on electric vehicles. In the middle of that year, Qingzhou pivoted from L4 to L2+, providing smart driving solutions for electric vehicles.
In August of that year, Tesla introduced BEV (Bird's Eye View) perception and the Transformer model for the first time at AI Day. The approach fuses perception information from different camera perspectives into a single bird's-eye-view representation, making it easier for the system to understand and predict road conditions. Domestic smart electric vehicle companies followed suit and rewrote their perception algorithms. Qingzhou made its pivot at exactly this moment and became one of the first domestic smart driving suppliers to deliver a BEV-based solution.
The system runs on the Horizon J5 chip. Horizon CTO Huang Chang, Hou Cong, and Yu Qian are all Tsinghua alumni, and Huang Chang and Yu Qian both earned their doctorates at the University of Southern California. Because of this connection, and because they shared views on technology and direction, Horizon and Qingzhou hit it off when Horizon went looking for ecosystem partners in 2022.
At the end of 2022, Qingzhou launched a highway and urban NOA (navigation-assisted driving) test solution based on the Horizon J5 chip. After that, Horizon recommended Qingzhou to Ideal Auto (Li Auto).
In 2021, Ideal began using Horizon's chips to develop its own intelligent driving system. It later planned two system solutions, AD Max and AD Pro, based on the NVIDIA Orin-X and Horizon J5 chips respectively.
In the second half of 2023, Ideal decided to concentrate its internal R&D resources on AD Max and hand over the AD Pro system to a supplier for maintenance and updates. Qingzhou seized this opportunity.
After Qingzhou took over, it worked with the Ideal team to optimize the system and pushed a system version based on the Qingzhou algorithm architecture to users in May this year.
Hou Cong finally got to do what he had wanted to do all along: practical engineering, delivering products he had developed to users. He also felt the weight of the responsibility.
As of May this year, Ideal AD Pro has 400,000 owners. The more people use a system and the more often they use it, the more problems it exposes. But heavy, frequent use is also what makes a product better.
Compared with realizing driverless driving and making Robotaxi earlier, maintaining the stability of the assisted driving system for human-vehicle co-driving and ensuring the driving safety of hundreds of thousands of users are the challenges that Qingzhou is more willing to face today.
The following is an edited conversation between Yunjian Insight and Qingzhou Zhihang CTO Hou Cong:
Four Waymo engineers start a business
Yunjian Insight: The four of you co-founders were once engineers at Waymo. Why did you get together to start a business?
Hou Cong: At the beginning, Yu Qian (co-founder and CEO of Qingzhou) approached me, and we thought this was something that could be done. At that time, there were not that many Chinese people at Waymo, so we looked for people we thought had experience and strong capabilities.
At that time, I was working on performance optimization and architecture design at Waymo. I was in the same group as Yu Qian, and I had business collaboration with all three of them.
Yunjian Insight: How do you divide the work and decide who does what?
Hou Cong: The four of us happen to cover complementary directions. Yu Qian and I both worked on perception, but I worked on systems and he worked on algorithms. Da Fang (Chief Scientist) worked on planning, and Wang Kun (COO) worked on simulation. Those are four different directions.
Yu Qian and Da Fang are more like scientists. My interest is in engineering, which is to put technology into practice, and I am more interested in practice and products. So I went to the industry right after graduation.
Yunjian Insight: Could you briefly describe your early experience?
Hou Cong: I majored in Automation at Tsinghua University, then moved to the Computer Science Department for graduate studies. I worked on computer vision for more than a year but left before finishing, and went to the United States for a doctorate. After graduating in 2013, I joined Google.
When I first joined Google, I worked on compilers. After a year, I was loaned to the infrastructure group to work on GPUs. It was probably because I made some optimizations on the CPU at the time, which was recognized by Jeff Dean (Google's chief scientist), and later I was recommended to the GPU group.
That group later did very well, building compilers for the TPU and developing underlying libraries for Google Brain (Google's deep learning research team). XLA (Google's deep learning compiler) was developed by them.
Later, as the GPU work wound down, Google X had an opening to optimize the perception system of its driverless cars. Google had an arrangement known as "20% time", which meant spending 20% of your working time helping out on another project.
It was Zhu Jiajun who called me over at that time. He later founded Nuro (an autonomous driving company). I worked in the perception group for more than three years.
Yunjian Insight: What is it like to work at Waymo?
Hou Cong:It is similar to Google. It belonged to Google X before it became independent in 2017. Google mainly makes software, while Google X makes hardware. There are all kinds of strange things in the office, such as robotic arms, lathes, and many other equipment.
Working there is similar to Google. Everyone emphasizes self-motivation and cooperation, and is driven by OKR. It creates a good cultural environment. Everyone wants to do better. Someone will design the performance system so that people with strong abilities can emerge from the competition. The threshold is indeed high, and many people from Google may not be able to get in.
Yunjian Insight: Are Waymo’s engineers competitive?
Hou Cong:The more time went by, the more competitive it became. The busiest period was probably the two years after we left. They were under a lot of pressure and had to achieve certain things in San Francisco. When we were there, no one pushed you, but it was common for me to work overtime at night and at noon. Sometimes people would send emails in the middle of the night to ask questions.
Yunjian Insight: What time do you get off work?
Hou Cong:No regulations.
Yunjian Insight: Could you leave work at 3:30?
Hou Cong:Yes. I am the type of person who comes late and leaves late. I usually go there after 10 o'clock, have dinner and work for a while before leaving. Basically, I work from 10 am to 9 pm.
Yunjian Insight: Are Chinese engineers the most hardworking group of people at Waymo?
Hou Cong:Overall, they are definitely more hardworking, but Americans are also hardworking. They have created a cultural environment where everyone feels comfortable doing things. They don't need to worry about many complicated things, but just focus on doing their job well.
I think the work efficiency is quite high. Some teams will have some lazy people after a long time, but Waymo was developing rapidly at that time, so I didn't see this situation.
Yunjian Insight: It was OpenAI, not Google, that made big models popular. What do you think is the reason?
Hou Cong:There are some policies within large companies. Many times, they are cautious when making decisions and sometimes they don’t dare to take risks. I think Google is particularly worried about the risk of public opinion when doing anything, so it is sometimes more conservative.
Yunjian Insight: You had a very comfortable working environment at Waymo. What was your core driving force for leaving Waymo to start your own business in 2019?
Hou Cong:Sometimes it is not good to be too comfortable, because you will think about many life problems. You can see what the next 10 or 20 years will be like. In Silicon Valley, everyone is definitely not short of money, and you can buy a good house. So what are you pursuing? There is indeed a ceiling in your career. Because of the cultural and language environment, it is difficult for you to participate in the decision-making level of the company, which is definitely not the case in China. If you want to do big things, you are more of a participant in the United States, not a decision maker.
The car is stuck at the intersection. Who will rescue it?
Yunjian Insight: When you were at Waymo, Waymo had already started the trial operation of Robotaxi. What got stuck at that time?
Hou Cong:It was stuck on some corner cases. It went to Phoenix in 2016, and I was there at the time. There were few people and cars there, the roads were wide, the weather was good, and it didn't rain heavily often. But the demand was also low. Phoenix didn't make sense, it was a test operation site, not a commercial site. So Waymo fully shifted to San Francisco in 2018.
A system that worked well in Phoenix at the time was completely broken in San Francisco. The takeover rate was extremely high and it simply couldn't handle it. At the time, there was still a ceiling in technology, which of course had something to do with data distribution. We didn't collect a lot of data in San Francisco. San Francisco has very steep slopes, often 30 or 40 degrees. In addition, its traffic rules (are different), and there are frequent interactions between people and cars, so Waymo's problem at the time was to handle some corner cases.
Waymo is very cautious and will only open a service up once it is safe. Given the state Carrot Run is in now, and I have ridden in their cars, Waymo would never operate in that state.
Yunjian Insight: What problems did you find when you tried Baidu's Carrot Run?
Hou Cong: I went there in June. First, the ride feel was a problem: it was jerky. Waymo really is better than a human driver. Braking, starting, and turning are all very comfortable, like riding with a professional chauffeur.
Second, the car was extremely conservative. For example, when making a U-turn it was very hesitant and kept being disturbed by the cars around it. In another spot it got stuck behind a work vehicle and just kept following it slowly instead of going around.
Judging from the basic sensor configuration, it is not a serious L4 solution. Waymo is heavily equipped, with 5 lidars, 30 cameras, and 6 millimeter-wave radars. Carrot Run certainly is not. This is also related to the situation in China: labor costs are so low that commercial returns are still a long way off.
Domestic companies may be forced to adopt low-cost solutions. However, low cost will lead to a lower technical ceiling than others, making it more difficult to solve corner cases.
China is facing an L4 dilemma. Labor is cheap and the environment is complex. Traffic participants are hard to predict, and there is a lot of negotiation over right of way. Road design is not that standardized either.
Construction work in China is not that standardized. When a Meituan vehicle fell into a pit, was it Meituan's fault? The construction company had not put out cones. A person might have seen the pit ahead, but Meituan's vehicle did not expect one to be there. In the United States, if someone falls into a pit, they sue the government or the organization responsible. China does not have that mechanism. I think China will be a long way behind the United States in developing L4.
Yunjian Insight: What do you think of Tesla's Robotaxi?
Hou Cong:I don't agree. I think it is possible that it will launch a model, for example, without a steering wheel, or with some additional configurations compared to the current user version of the car. But its current technology stack is not oriented to L4. Once it starts operating Robotaxi, many problems will arise.
L4 is naturally related to operations, and it can also solve some technical problems through operations. Tesla does not have such an operations system.
Yunjian Insight: The scenario Tesla envisions is that when an owner's car is idle, it can go out and pick up rides, like an Uber.
Hou Cong:What should we do if a car is stuck in the middle of the road? Who will rescue the car? Should the owner take a taxi to rescue the car? These are all very practical questions.
Cars will definitely get stuck. Run enough trips and there will be scenarios where the car is too cautious, or does not know how to interact with people, or where cars block each other.
For example, if two Tesla Robotaxis block each other, the problem is hard to solve. At an intersection with no traffic lights, all the cars push into the intersection, and in the end they block one another and nobody can move at all.
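The gridlock Hou Cong describes is a circular wait: each car is waiting for another car that is itself waiting. As a minimal sketch, assuming each blocked vehicle can be described as waiting for exactly one other vehicle, the situation can be modeled as a wait-for graph and checked for a cycle. The vehicle names and the `find_deadlock` helper are hypothetical and only illustrate the idea; they do not describe how Tesla or Waymo handle such cases.

```python
from typing import Dict, List


def find_deadlock(waits_for: Dict[str, str]) -> List[str]:
    """Return the vehicles forming a circular wait, or [] if there is none.

    `waits_for` maps a blocked vehicle to the single vehicle it is waiting on.
    """
    for start in waits_for:
        seen: List[str] = []
        current = start
        # Follow the chain of "who is waiting for whom" until it ends or repeats.
        while current in waits_for and current not in seen:
            seen.append(current)
            current = waits_for[current]
        if current in seen:
            # The chain looped back: everything from `current` onward is stuck.
            return seen[seen.index(current):]
    return []


# Four cars enter an unsignalized intersection, each blocked by the next one.
gridlock = {"north": "east", "east": "south", "south": "west", "west": "north"}
print(find_deadlock(gridlock))

# A simple queue is not a deadlock: the car at the front can still move.
queue = {"car_a": "car_b", "car_b": "car_c"}
print(find_deadlock(queue))
```

In the first case every car is part of the cycle, so none of them can resolve the situation alone. That is why an operator, or some other outside intervention, is needed.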
I have never really understood it. The Robotaxi story has been told for so many years, but it has never been realized. Why do people still want to believe Musk? I admire Musk very much. I think he has done very well in other things. But he was a bit boastful about the Robotaxi thing and underestimated the difficulty of the matter.
Yunjian Insight: How far do you think the performance of Tesla FSD V12 is from the L4 level Robotaxi?
Hou Cong:There is a big difference. Let me give you an example in reverse. Why did Waymo design the car like that? It has several types of cameras. One is a normal camera; one is for night use, with strong night vision; there are cameras that look at LED lights; there are thermal imaging cameras; and there are blind spot cameras with an infrared flash. It has five types of cameras.
Tesla only has one type of camera. Do you think Waymo is stupid? It is not stupid. After seeing too many corner cases, it has no choice but to solve the problem from the hardware. If you insist on making breakthroughs in software, it will not be easy to use. Especially in the United States, it is very dark at night and there are no street lights in many places.
If Tesla goes on the road with its current configuration, and the same applies to Carrot Run, a professional accident scammer would have many ways to exploit it, because the hardware has too many shortcomings.
Yunjian Insight: Have you experienced FSD V12 in the United States?
Hou Cong: I went there to try it in March this year, and went again in May. I drove it every day.
To be honest, I don't find the V12 amazing, I just think it's more practical. In the city, especially at intersections, the handling is significantly different from the V11. Most of it is good, but not perfect.
Tesla V11 has done very well on the highway. It is very conservative at intersections in the city, not like a human, which makes it very awkward. There are a lot of cars behind it, so it moves forward slowly. It looks for a long time and moves very slowly because it has a blind spot, which is not like a human. Also, when turning, it will suddenly turn the steering wheel, which will make you panic.
Yunjian Insight: Does Tesla’s current end-to-end technical solution also need to continuously solve corner cases?
Hou Cong: Corner cases still have to be solved. End-to-end means integrating the information from the several modules of the traditional algorithm stack and training on that information to get a better result.
The upper limit of end-to-end is very high, and the lower limit is relatively low at the beginning because of its poor controllability. In the past, the system was divided into modules, and each module defined an interface. If we follow the standards, there will be no strange things. Basically, the rules can solve a batch of problems at one time.
There is no such concept in end-to-end. We can only use more diverse data to solve the lower limit problem. After seeing enough such situations, we will eventually know how to solve them.
Yunjian Insight: The end-to-end model is equivalent to a black box. If a problem occurs, how can we discover and solve it?
Hou Cong:It is definitely not a so-called black box. If you work backwards from its output, the results of perception and planning are all there, so it must have made full use of previous experience to form a new model. This model has a certain degree of controllability, for example, this part is perception, that part is prediction, and that part is planning, and different stages must be distinguished.
Yunjian Insight: When do you estimate FSD will be launched in China?
Hou Cong:Next year. It should have been almost a year. I don't know the details, but it has started to recruit algorithm engineers, which means it has started. Didn't it say it would spend $10 billion in China? To build various facilities.
Yunjian Insight: So what needs to be done for FSD to be implemented in China?
Hou Cong:Build a data center and adapt its algorithm in China. After all, the domestic scenarios are complicated. We especially hope to see FSD come to China because it is a benchmark. If it invests more, it should be the fastest.
Yunjian Insight: Do domestic car companies have any response strategies?
Hou Cong: They can only chase after it. That said, domestic players do have some advantage in having lidar installed. Adding lidar definitely makes the problem simpler.
Yunjian Insight: Is it possible that Tesla's Robotaxi will not use the current sensors, but will be equipped with lidar or higher-definition cameras?
Hou Cong:Yes. That would certainly reduce the difficulty, but it might not be consistent with its concept.
Tesla can only hope that AI (artificial intelligence) keeps developing. I think the possibility exists, but human driving solves several problems. First, you are interacting with the real world, and that tests instincts rather than driving skill.
Second, when driving, people will check the car's environment and judge the possibility through their understanding of the surrounding environment. For example, if you see a dog and it disappears in front of you, you may get out of the car to check if the dog is in front of you. In a scene like this, people's perception is expanded in an instant. If the car is not designed in this way, it is actually quite dangerous.
Yunjian Insight: Do you think Tesla can make Robotaxi if it relies solely on software capabilities?
Hou Cong:This is very challenging, and the requirements for AI are very high. I think it will definitely be achieved in the long run, but the process is not as imagined. AI has existed for decades and has been shown in a lot of science fiction movies, but it has not been truly solved until now. Sometimes it is easy to oversimplify it.
Yunjian Insight: Tesla plans to expand the Dojo computing cluster to 100 EFLOPS (exaFLOPS, a unit of computing power) in October this year. That puts other car companies an order of magnitude behind Tesla. What do you think of the gap?
Hou Cong:This is a heavy investment. Tesla's profit margin was very high before. It sold so many cars around the world that it could afford the investment, and its stock price also had a relatively large support. Domestic car manufacturers are so competitive that it is impossible to invest so much resources. This is like why the big model was made in the United States first, but not in China? It requires a lot of investment and it takes many years to burn money. The investment logic behind it is different.
Yunjian Insight: Are you currently reserving computing power for end-to-end?
Hou Cong: We are also doing this, but we are taking a different path. If we followed Tesla's approach and bought tens of thousands of A100s, we obviously could not afford it. Even a few thousand would be a huge burden. We will impose some constraints based on our products, so that the investment is much smaller.
Yunjian Insight: If investors give you enough money, which path will you choose?
Hou Cong:If we have enough money, we will go in this direction. But there is no if. Everyone is clear about the current situation. This is how the market works.
From driverless minibuses to assisted driving
Yunjian Insight: When you started, autonomous driving was not that popular in the capital market. Did you encounter any challenges in your financing?
Hou Cong: It was definitely not as easy as in the early days. Early rounds in this industry raised several hundred million yuan, and by the time it was our turn, it was much harder. But I think that turned out to be a good thing for our move to L2. If you raise a lot of money early on, it is hard to pivot to L2, because your valuation is already set at that level.
Yunjian Insight: At that time, many people had already started businesses in autonomous driving. Why did you think there was still an opportunity?
Hou Cong: Technically, we believed we were ahead to a certain extent, and that we understood L4 better than others. Second, L4 covers many scenarios, not just Robotaxi and heavy trucks. At the time, if you wanted to raise a lot of money, you built either heavy trucks or a Robotaxi. But we saw opportunities in other directions: medium- and low-speed vehicles such as minibuses, logistics delivery vehicles, work vehicles, sanitation vehicles, and vending carts, as well as closed-site applications such as mines, port terminals, and some factories. These markets are relatively small, but we believed they would commercialize faster than Robotaxi, with lower technical requirements.
Yunjian Insight: Why did you start a business by making driverless minibuses?
Hou Cong:Because it is easy. The scenario envisioned at that time was a micro-circulation, or medium and low-speed vehicles in a relatively fixed scene, not just minibuses, but also logistics and work vehicles.
Yunjian Insight: Autonomous minibuses have not been commercialized yet. What do you think is the main reason?
Hou Cong:It’s still a question of business model and who pays for it.
Yunjian Insight: In 2021, you transformed into assisted driving for passenger cars. What prompted you to make this decision?
Hou Cong:I still think that L4 commercialization is slow, with large investment, slow path and long cycle. If we continue on this path, we may encounter some difficulties in funding. We think we should find some solutions that can be commercialized quickly so that we can keep moving forward. L2++ is suitable for this market.
In 2021, new energy has developed very rapidly. Tesla's stock price has soared again. The penetration rate of new energy vehicles in China has continued to increase, and the demand for intelligent driving has continued to increase. It just so happens that our technology is relatively easy to use in assisted driving functions, and we can commercialize it in the short term.
Yunjian Insight: Why didn’t you think about doing L2 on the first day of starting your business?
Hou Cong:I didn’t expect L2 to be so fast. Tesla hadn’t launched a strong solution at that time. Hardware 3.0 had just been launched. In 2019, Wang Kun, a member of our founding team, bought a Tesla. I was shocked (after experiencing it). To be honest, I had some prejudice against visual solutions after coming out of Waymo. I thought they were unreliable because Waymo mainly used lidar.
In fact, the technology we use to make minibuses is completely based on Robotaxi. I know that many of our competitors make minibuses or low-speed logistics, which are completely different from Robotaxi. So we quickly transformed into L2. Some companies do not have this capability, but insist on going in this direction, which has big problems in the entire architecture.
Qingzhou Zhihang's unmanned minibus
Yunjian Insight: When you turned to assisted driving for passenger cars, what was the landscape of the industry like?
Hou Cong:We transformed after Tesla's first AI Day, so we definitely wanted to go in the BEV direction. We set a few rules for ourselves. First, we would not do low-end L2, which is a red ocean with fierce competition. In the end, the competition is still about cost, which is completely incompatible with our technology stack.
When we were transforming, we also discussed whether we should redo the planner. The conclusion was to maintain our advantages and not redo it, because looking to the future, we will definitely be working on mid- to high-end solutions, and our computing power is guaranteed to a certain extent.
Yunjian Insight: What adjustments did you make to your algorithms when moving from L4 to L2?
Hou Cong: The biggest change was in perception, from lidar point-cloud-based to vision-based solutions. Fortunately, we adopted BEV as early as possible, so we were the first company in the country to run BEV on the J5.
We developed both highway and urban NOA. In our view the perception system is the same for both. Highway and urban are integrated, and only the configuration differs. For example, the perception range and computing power are scaled down, but the architecture and algorithms are the same.
Yunjian Insight: How did you come to collaborate with Horizon Robotics?
Hou Cong: We have some connections. Yu Qian, their CTO Huang Chang, and I all graduated from Tsinghua University. I worked in the vision lab at Tsinghua for a year and a half, in the same lab as Huang Chang. Yu Qian and Huang Chang both earned their PhDs at the University of Southern California, so they are fellow alumni there as well.
At the beginning of 2022, Horizon hoped to cultivate some ecosystem partners, so we decided to work together.
Yunjian Insight: I heard that Yu Kai (Horizon's founder and CEO) recommended you to Ideal?
Hou Cong:Yes, Brother Kai is very supportive of us, and we are also very supportive of him, helping them to establish the benchmark.
Yunjian Insight: Horizon J5 has the problem of insufficient CPU computing power when used for BEV Transformer. How did you use J5 to achieve this?
Hou Cong: We don't use a Transformer. Tesla made the BEV Transformer famous, but academia offers many ways to map image features into BEV space, and the Transformer is just one of them. We use a method that suits the J5 better, and it also works very well.
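Hou Cong does not say which mapping method Qingzhou uses. Purely as an illustration of a non-Transformer way to get from image space to BEV space, the sketch below uses classic inverse perspective mapping: it assumes a flat road and a level, forward-facing pinhole camera, and projects a pixel onto the ground plane before binning it into a BEV grid. It works on raw pixel coordinates, not the learned image features a production system would use, and all camera parameters, grid sizes, and function names are hypothetical.

```python
import math
from typing import Optional, Tuple


def pixel_to_ground(u: float, v: float, focal_px: float, cx: float, cy: float,
                    cam_height_m: float) -> Optional[Tuple[float, float]]:
    """Project pixel (u, v) onto a flat road; returns (lateral_m, forward_m).

    Assumes a level pinhole camera looking straight ahead. Pixels on or above
    the horizon row `cy` never hit the ground, so they return None.
    """
    rows_below_horizon = v - cy
    if rows_below_horizon <= 0:
        return None
    forward_m = focal_px * cam_height_m / rows_below_horizon
    lateral_m = (u - cx) * forward_m / focal_px
    return lateral_m, forward_m


def ground_to_bev_cell(lateral_m: float, forward_m: float, cell_m: float = 0.5,
                       width_cells: int = 40, depth_cells: int = 100
                       ) -> Optional[Tuple[int, int]]:
    """Map a ground point to a (row, col) BEV cell, with the ego car at the
    bottom center of the grid. Returns None if the point is off the grid."""
    col = math.floor(lateral_m / cell_m) + width_cells // 2
    row = math.floor(forward_m / cell_m)
    if 0 <= col < width_cells and 0 <= row < depth_cells:
        return row, col
    return None


# Hypothetical camera: 1000 px focal length, 1280x720 image, mounted 1.5 m high.
point = pixel_to_ground(u=740, v=460, focal_px=1000, cx=640, cy=360, cam_height_m=1.5)
print(point)                       # ground position in meters
print(ground_to_bev_cell(*point))  # BEV grid cell
```

The flat-ground assumption breaks down on slopes and for anything that stands above the road surface, which is one reason learned BEV mappings replaced this kind of geometric projection.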
Yunjian Insight: Will the next generation Horizon J6 switch to the BEV Transformer solution?
Hou Cong: It comes down to cost performance. If a Transformer consumes a lot of computing power but delivers little benefit, we will weigh it carefully. But we will definitely use Transformers for some things, including end-to-end models, map-free driving, and some map processing. Transformers are very valuable in those areas.
The more people use it, the greater the challenge
Yunjian Insight: You are currently working on the Ideal AD Pro system. Are there any challenges?
Hou Cong: The biggest difficulty is that it has a large user base and high requirements. For any given failure probability, the more people use a system, the more problems it exposes. With only a few thousand or tens of thousands of users, a problem might show up once a month, or once every few months. With hundreds of thousands of users, it might show up once a week or every few days. It also depends on how broad your user coverage is. The better the product, the more people use it, and the more problems are exposed. Ensuring that the system has no stability or safety problems is definitely challenging.
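The scaling Hou Cong describes is simple arithmetic: if each user triggers a rare problem at some fixed average rate, the gap between fleet-wide occurrences shrinks in proportion to the number of users. The sketch below assumes independent users and a made-up rate of one incident per 3,000,000 user-days. That figure is chosen only to reproduce the rough orders of magnitude in the interview and is not a number from Qingzhou or Ideal.

```python
def expected_days_between_incidents(users: int, user_days_per_incident: int) -> float:
    """Average gap, in days, between incidents seen anywhere in the fleet.

    Assumes every user triggers the problem independently, at the same average
    rate of one incident per `user_days_per_incident` days of use.
    """
    if users <= 0:
        raise ValueError("users must be positive")
    return user_days_per_incident / users


# Hypothetical rate (not from the interview): one incident per 3,000,000 user-days.
USER_DAYS_PER_INCIDENT = 3_000_000

for fleet_size in (3_000, 30_000, 400_000):
    gap = expected_days_between_incidents(fleet_size, USER_DAYS_PER_INCIDENT)
    print(f"{fleet_size:>7} users: one incident about every {gap:g} days")
```

Under that assumed rate, 30,000 users would see the problem roughly every 100 days, while 400,000 users would see it about every 7.5 days. The bug that was once a curiosity becomes a weekly event.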
Yunjian Insight: What is a more difficult problem that you have solved recently?
Hou Cong: Identifying traffic lights is actually a difficult problem. There are so many intersections across the country, and the appearance of the lights, the rules, and the relationship between lights and lanes all differ. Some are simple red, yellow, and green lights, some come in two rows, some are long bars with arrows inside, some include signals for bicycles and buses, some have odd shapes, and some are temporary.
We also need to consider background lights. At night, for example, sensor limitations can make it hard to tell arrows from circles when the image is overexposed. When there are many lights in the background, the impact on the system is even greater.
In addition, LED lights flicker constantly, which interferes with recognition. Cameras now have anti-flicker functions, but even so, the frequencies of some lights may not match.
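To make the flicker problem concrete, here is a minimal sketch under simplifying assumptions that are not from the interview: the LED is modeled as an ideal on/off pulse train with a fixed frequency and duty cycle, and the camera exposure starts at a random phase. The function name and the example numbers are hypothetical.

```python
def dark_frame_fraction(led_hz: float, duty: float, exposure_s: float) -> float:
    """Fraction of exposures that fall entirely inside the LED's off phase.

    Simplified model (an assumption, not a camera spec): the LED is fully on
    for `duty` of each period and fully off for the rest, and the exposure
    window starts at a uniformly random point in the period.
    """
    period = 1.0 / led_hz
    off_time = (1.0 - duty) * period
    # The window sees no light only if it starts and ends within the off phase.
    usable_start_range = max(0.0, off_time - exposure_s)
    return usable_start_range / period


if __name__ == "__main__":
    # Short exposure against a hypothetical 100 Hz, 50% duty-cycle LED.
    print(dark_frame_fraction(led_hz=100.0, duty=0.5, exposure_s=0.001))
    # Exposure as long as one full period always catches some light.
    print(dark_frame_fraction(led_hz=100.0, duty=0.5, exposure_s=0.010))
```

In this model, lengthening the exposure to cover the off phase removes dark frames, but the required exposure depends on each light's frequency and duty cycle, which is one way to read the remark that some lights still "may not match".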
Yunjian Insight: Other systems may also encounter the problem of identifying traffic lights.
Hou Cong: They do. The principles are similar, but because the hardware differs, the methods differ too. The stronger your hardware, the more methods you can use. The more important point is how many of the different situations across the country your data covers.
Yunjian Insight: What is the capability limit of the AD Pro version based on Horizon J5?
Hou Cong: It does best on the highway.
Yunjian Insight: Yu Kai mentioned before that the CPU computing power of the J5 chip is insufficient because of insufficient resource investment at that time. But NVIDIA always invests ahead of time.
Hou Cong: Everyone thinks differently. NVIDIA is moving toward the high end, but its costs are what they are and cannot come down. It suits in-house development better: the ecosystem is relatively complete and development costs are relatively low. But if you want to build a cost-effective car model, you still have to choose a more cost-effective chip.
Yunjian Insight: Do you only use Horizon’s chips?
Hou Cong: Our system is designed to adapt to multiple chips. To achieve that, we have made some sacrifices. We may not squeeze the full potential out of each chip, but the result is good enough. Further investment might improve it, but the marginal benefit would be very low. Horizon is our most important platform and partner right now, and we will definitely build advantages on it. But our system is not designed only for Horizon.
We have always used NVIDIA for our minibuses. When we were working on L4 in the early days, we used NVIDIA Xavier for sensor access and also ran some algorithms on it. In 2022, NVIDIA Orin came out, and we were one of the earliest users. At that time, we worked with a manufacturer and migrated the system from industrial computers to NVIDIA Orin within a month. This partner is an internationally renowned company, and it reported this to NVIDIA, and NVIDIA highlighted it at the GTC conference that year. We worked with two car companies (Xpeng and Zeekr) and were the first to use NVIDIA's dual Orin chips.
Yunjian Insight: What do you think of NVIDIA’s transition from making chips to developing autonomous driving solutions?
Hou Cong: It has to build a reference project to help it see the future of its chips more accurately. It has to know how much computing power is needed and what functions the system needs.
In fact, it published an end-to-end paper six or seven years ago, but it has never been implemented. Wu Xinzhou did it in the past, but not in a comprehensive way. I think the aim is still to create a reference project. If it does well in this area and can turn it into a standardized solution, it can move one step further.
Building the reference project also helps it sell chips.
Yunjian Insight: Does it take a lot of manpower to deliver each model? Or is it a universal solution that can be used for all cars?
Hou Cong: It really depends on how the system is defined. Huawei, for example, certainly hopes to be a super supplier and keep delivery costs down. Huawei defines all of its sensors itself, whereas car manufacturers usually swap sensors to fit their own supply chains. To begin with, Huawei does not change the camera, and even the mounting position is defined in a fairly standard way: on its models, the side-front and side-rear cameras are packaged together in one module. That keeps the mounting position relatively fixed, so later adaptation costs are low. The approximate positions on SUVs and sedans are the same, and the sensor selection is the same, so the delivery cost is bound to be low.
I don't want to be a Tier 1 anymore
Yunjian Insight: You just said that everything Tesla did was right, except Robotaxi, including making its own chips?
Hou Cong: Yes. It makes two chips, one for in-vehicle use and one for offline use. In 2019, there were no in-vehicle chips with such high computing power. It was because of this chip that Tesla was able to do BEV so early.
NVIDIA's Orin chip was not released until 2022, but Tesla had already produced BEV in 2021, leading the market by more than a year. Tesla first used Mobileye's chips, and later NVIDIA's Drive PX. NVIDIA's prices did not come down, so Tesla could only make its own chips to reduce costs.
Yunjian Insight: So is it reasonable for car companies to follow Tesla’s example and develop their own chips?
Hou Cong: Tesla has the scale to do this, but it may not be right for other car manufacturers to follow suit.
Yunjian Insight: Car companies, chip companies, and autonomous driving solution providers are now reinventing the wheel. How do you think the division of labor in the industry will evolve in the future?
Hou Cong: Let the market prove everything. The motives behind this may not be so simple. There may be other purposes, such as building the image of a technology brand, managing market value, and attracting talent.
In theory, developing their own chips gives car companies certain advantages in supply chain security and cost, but the overall cost must be considered. The biggest trouble with making chips is that you have just finished one when a new algorithm appears that runs slowly on your hardware. Everyone rushes to adopt that algorithm, and your chip can't be used.
Yunjian Insight: Are you talking about Horizon J5?
Hou Cong: J5 is a case in point, but it is not just J5. Any chip of that generation that is not general-purpose has this problem.
Yunjian Insight: Is the technical architecture stable now?
Hou Cong: No. Transformer is also being challenged now, and there are newer, more efficient techniques. There will still be a process of convergence.
Yunjian Insight: Is it reasonable for chip manufacturers to develop intelligent driving solutions?
Hou Cong: I don't think the advantages are obvious. When car manufacturers work with chip manufacturers, they may want some help with low-level development, whether that is drivers, underlying software, sensor signal processing, or neural network inference. But providing such services is quite costly for chip manufacturers, and they cannot serve every company.
The exception is if you can make the supply chain very mature, but that takes time. It is difficult for chip startups to reach that maturity, so they must show restraint. Otherwise, a few ecosystem partners will take over the entire market. However, car manufacturers also want to do their own development and do not want to rely on ecosystem partners. It also takes time for ecosystem partners to grow, and they cannot capture the market quickly.
At that point, chip manufacturers will inevitably think that they can solve these problems themselves.
Yunjian Insight: How much manpower do you need to complete a project?
Hou Cong: Generally speaking, the first project consumes the most manpower, and the investment decreases as time goes on.
Yunjian Insight: How many projects can you work on at the same time now?
Hou Cong: Two or three projects. Our current investment per project is not yet at the optimal level. After the fifth project, each project will basically need only a few dozen people, so doing five or six projects will be no problem.
Yunjian Insight: Should autonomous driving suppliers hire more people to take on more projects, or control the size of their staff and only take on limited projects?
Hou Cong: It depends on the strategic goal. If you want to capture the market, expand in scale, keep raising funds, and go public, you can take on more projects in that high-profile way. The other approach is to first build a benchmark project, develop capabilities, standardize the product, and then replicate it quickly. At the beginning there are only one or two projects, but because the product is standardized, later rollouts will not need much customization.
Yunjian Insight: Why did you choose the second route?
Hou Cong: The more projects we take on, the more money we lose, and the pressure on business operations is huge. We believe only a few suppliers will remain in the market, so product standardization must be done well.
Yunjian Insight: As this industry evolves, what role will you play in the chain in the future?
Hou Cong: We can cover one or two chips in at least one category of mid- to high-end products and provide a standardized solution, and eventually we may become a Tier 2. We build the algorithms and deliver the software, but we don't do the whole package. A system supplier integrates these things.
Yunjian Insight: Why don’t you become Tier 1?
Hou Cong: We don’t do system Tier 1, but we do software Tier 1 (system Tier 2). System Tier 1 involves a lot of hardware, which requires heavy investment and has relatively thin margins. We will also consider this direction, but at present, Tier 2 is more feasible.
Yunjian Insight: Is Tier 1 only responsible for being the domain controller?
Hou Cong: It also has to handle delivery. That requires a lot of manpower, which is not our strength.
Yunjian Insight: What are your advantages?
Hou Cong: We still focus on algorithms and products. I don’t think Qingzhou has the DNA to do very well as a hardware Tier 1. That is another type of company, such as Desay SV and Joyson Electronics.
Yunjian Insight: But in your cooperation with Li Auto, you play the role of a Tier 1 for Horizon.
Hou Cong: We are a software Tier 1. Software Tier 1 will still exist for a while in the short term, but it will eventually be standardized, and then we will either make our own hardware or become a Tier 2.
Yunjian Insight: So in your opinion, in the future, the intelligent driving solution providers will be car companies, system Tier 1, algorithm Tier 2 and chip Tier 2?
Hou Cong: Right.
Yunjian Insight: Can’t a chip Tier 2 do what an algorithm Tier 2 does?
Hou Cong: It depends on whether it is better at this than we are.
Yunjian Insight: If a standardized algorithm is provided, then chip Tier 2 should also want to do it.
Hou Cong: Why should it do it itself? Just cooperate with us.