
Dialogue with Li Yan: Su Hua, Matrix Partners, and Redpoint back the first generative recommendation startup

2024-07-18




Humanity is experiencing an explosive growth in the field of artificial intelligence, and almost every step in the expansion of technology into the unknown has attracted an astonishing amount of attention.

As the boundaries of artificial intelligence expand, innovation and disagreement over the technical routes of the most important tracks coexist. The judgments and choices of technology pioneers shape the path of the many who follow.

Over the past year, Synced has exclusively introduced outstanding companies such as Moonshot AI, Shengshu Technology, Aishi Technology, and Infinigence AI, giving each of them its first "10,000-word interview transcript" on the Internet. At a stage when the technology route has not yet converged, we have seen the leadership of AI entrepreneurs who truly have faith, courage, and systematic understanding.

Therefore, we launched the "AI Pioneers" column, hoping to continue to find and record entrepreneurs with leadership qualities in various segments of artificial intelligence in the AGI era, introduce the most outstanding and high-potential startups in the AI track, and share their most cutting-edge and distinctive insights in the field of AI.

Author: Jiang Jingling

Synced Report

After leaving Kuaishou to start his own business, "Li Yan" quietly received US$32 million in seed round financing from Kuaishou co-founder Su Hua, Redpoint Ventures and Matrix Partners.

As a core figure in Kuaishou's initial AI system, Li Yan built the first deep learning department within Kuaishou, and later helped Kuaishou build the Multi-Media Understanding technology system.

One of his investors concluded that among the three types of AGI startups, namely professors and scholars, mobile Internet practitioners, and academic geniuses, Yuanshi Technology is the only team capable of integrating the three core technology stacks of "multimodality, search, and recommendation."

However, since Li Yan officially confirmed in early 2023 that he was starting a company, he has almost disappeared from public view for more than a year.

Over the past year, we sent multiple interview invitations to Li Yan's team, hoping to talk to him about his entrepreneurial ideas, but they were all declined on the grounds that "the product is not yet ready (for public)".

Not long ago, Yuanshi Technology's product "Ask Xiaobai" was officially launched and entered a cold-start internal test. This was also the first time that Li Yan's team disclosed its progress. So we approached Li Yan again, hoping to talk to him about his plans.

Surprisingly, in this exclusive interview we learned that Li Yan did not choose to build a pure model company, nor did he even approach the problem from a multimodal perspective.

In the product "Ask Xiaobai", users see AIGC content personalized by AI to their interests in the "feed" as soon as the app opens, and can use the "chat" function to interact with the AI about that content at any time.



In other words, this is a generative content community product built on a self-developed LLM. Compared with previous content community products, what sets Li Yan's approach apart is "generative recommendation".

This is a cutting-edge research field, and only Meta and CMU have achieved some results so far. He told me that, unlike previous recommendation algorithms, the generative recommendation algorithm will no longer be based on collaborative filtering, and recommendation will become more intelligent, moving from the current "ten faces for a thousand people" to a true "a thousand faces for a thousand people".

By exploring deeper user needs, recommendation efficiency is further improved, and users can get information that is more suitable for them. Moreover, the infusion of a large amount of high-quality corpus into the large model gives the generative recommendation algorithm "values". It is no longer just about "pleasing" users, but guiding users to pay attention to the high-quality information that they really need to pay attention to.

Currently in China, Li Yan’s team is the first startup to make an LLM-driven generative recommendation algorithm its product core and development direction.

An investor in Yuanshi believes that the cost and efficiency optimization this new technology engine brings to the content industry is basically consistent with Toutiao's path to success. On the road to developing generative recommendation products, "the only team we see with backgrounds in multimodality, search, and recommendation is Li Yan's."

Vision:

Make a higher-dimensional recommendation algorithm

Synced: Could you first introduce what Yuanshi Technology hopes to do?

Li Yan: We hope to help users enter the flow state and fight against mental fatigue through technological innovation and pooled intelligence. (The term comes from Mihaly Csikszentmihalyi's "Flow" theory.)

Synced: This is a bit abstract, could you explain it a little more?

Li Yan: We live in an era of information explosion. There are many channels for receiving information, but few that actually deliver the information we care about.

Take the recent WAIC: you may see a lot of coverage, but each report is only a few words long, and you cannot get the information you really care about. That leaves you in a state of anxiety.

We understand this as a kind of "psychic entropy". This concept was proposed by the psychologist Mihaly Csikszentmihalyi, and it matches what we want to do very precisely. We want to help people feel happier and more fulfilled after consuming information. This is different from the "more anxious, more tired, less happy" feeling we get after heavy use of some information products.

Synced: What kind of information will make people feel happier and more fulfilled, rather than more anxious and tired?

Li Yan: The concept of "flow" applies here: people enter the flow state and feel happy only when they see the information they really want to see, rather than many things that are irrelevant or uninteresting to them.

This is also a finding of psychological research. For example, when parents make their children do homework, the homework does get done, but the child is passive and finds it painful. He feels happy only when he is doing what he wants to do. So we hope to help users enter the state of flow to fight against psychic entropy.



Synced: Don't the underlying recommendation systems of most social communities already aim for essentially this goal (recommending what users really want to see)?

Li Yan: There is a difference. If we look back from 2034, ten years from now, at today's recommendation systems, including the products and the technology behind them, we will find them very backward. Current products are still far from a perfect state.

Synced: How do you understand the current level and the “better” level?

Li Yan: I can make an analogy. Current information distribution is more like the instinctive reactions of primitive humans. At a stage when inner life is not yet very rich, people’s instincts may be "I want to eat", "I want to cry", "I want to laugh", which is very direct.

In a recommendation system this shows up as follows: if you like handsome guys, it will always recommend handsome guys to you - the recommendation system does not think any more deeply than that. What our product hopes to do is not to cater to the user's instinctive reactions, but to recommend with higher intelligence, care and love.

Synced: This sounds like a higher aesthetic dimension, with a hint of wanting to “educate users.”

Li Yan: To be precise, it is not education. Many things cannot be seen clearly over a short period of time. But if we look at the entire history of human development, we find that every advance of human civilization has been accompanied by criticism, reflection, even overthrow and reconstruction. Some things may look good at the moment, but prove limiting in the future. The same is true in the online world. We hope to bring more of the civilized elements and advanced ideas accumulated by mankind into content distribution.



Technical implementation path:

Choose higher quality data to train the model and make the model more valuable

Synced: You just said that you hope to create a content product that helps users better achieve flow. Why did you start with making a better LLM?

Li Yan: We believe that LLM is a very important node on the road to AGI. Large language models can better understand users and content, and know what users care about, like, and dislike. All of a user's personal interests and hobbies can be tokenized, and large models can understand them very well.

Previous recommendation systems were unable to achieve this level of understanding. They could only label the user with many discrete tags in an attempt to characterize and understand the user. Now, large models can not only better understand the user's existing interests, but also enhance the mining of user interests and infer the user's implicit interests and hobbies.

With a large model, we can compress the highest-quality corpus on the entire Internet, and the human civilization contained in that text, and then apply what the model has absorbed to generative recommendation. The model will have its own values and worldview, and therefore a higher-dimensional recommendation value system.

The big model actually serves as a bridge, linking these most advanced cognitions with your information consumption, and then further improving your content consumption level.

Synced: Do these “advanced” contents refer to papers? Do they include both social sciences and natural sciences, or are they biased towards one area or the other?

Li Yan: The big model will read all the advanced civilization and information accumulated by mankind on the entire Internet, and it can be advanced in all aspects.

Synced: How does the big model determine what constitutes an “advanced civilization”?

Li Yan: In fact, it is we humans, not big models, who have already made these judgments. For example, authoritative journals and books written by well-known scholars are not defined by big models; they are high-quality information that humans themselves have established over a long period of time.

Synced: Yeah, so what exactly is this high-quality data? Where does it come from?

Li Yan: We place great weight on building model capabilities through data. In our model, we use algorithms to increase the amount of high-quality data available by more than an order of magnitude. In addition, in terms of data selection, we use more classic books, theories, and papers to train our large models, so that our models can understand users more deeply. More specifically, when it comes to content recommendations, we will not blindly keep users in short-term pleasure. Instead, we also want them to have the long-term happiness of accumulating high-quality information.



Synced: You just mentioned that generative recommendation algorithms can improve the understanding of users. Is there any quantitative standard to compare the understanding of users by different recommendation algorithms?

Li Yan: Different companies pursue different goals, so their optimization targets also differ. Generally speaking, these may be time spent, click-through rate, and retention. Since our technical principles and business direction are both new territory, we currently use a very complex internal system of metrics to evaluate this.

Synced: What are Yuanshi’s current technical advantages in LLM?

Li Yan: From the first day of the company's establishment, around April 2023, the very first version of our large model was based on the MoE architecture. That choice of technical route was very forward-looking for the market. From April 2023 to now, more than a year has passed, and our model has gone through four versions. On many public test sets, our results are better than those of many other models.

In addition, our high-quality corpus ensures that the quality of answers is very high, and the model has the ability to think deeply. Third, the speed of our large model is also very competitive, with extremely low latency. We have made extreme optimizations in model training and inference, which has greatly reduced the cost of our large models. The product is now free, and you don't need to pay for use even during peak hours.

Synced: Why do you think MoE is the superior route?

Li Yan: We believe that to make our own products, we need the ability to work across the whole stack, down to the underlying model. In the era of large models, a better model usually means a larger number of parameters. However, for a consumer-facing product, if the cost of model inference is very high, it will not work commercially. Therefore, we need both a large number of parameters and a low inference cost as a prerequisite for commercial feasibility. In the end, MoE was the only choice. We thought this problem through from the first day, and the first line of code we wrote was MoE.
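To make the trade-off between parameter count and inference cost concrete, here is a minimal sketch of a generic top-k mixture-of-experts layer. It is not Yuanshi's model: the class name `ToyMoELayer`, the sizes (8 experts, 2 active per token, dimension 4), and the random weights are all hypothetical and chosen only for illustration. The point it shows is that a router picks a few experts per input, so the total number of parameters can be large while only a fraction is used for each token.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMoELayer:
    """Illustrative top-k MoE layer (hypothetical sizes, random weights)."""

    def __init__(self, n_experts, dim, top_k, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.top_k = top_k
        # Router: one scoring vector per expert.
        self.router = [[rng.uniform(-1, 1) for _ in range(dim)]
                       for _ in range(n_experts)]
        # Each expert is a dim x dim linear map.
        self.experts = [[[rng.uniform(-1, 1) for _ in range(dim)]
                         for _ in range(dim)]
                        for _ in range(n_experts)]

    def forward(self, x):
        # Score every expert, but run only the top_k highest-scoring ones.
        logits = [sum(w * xi for w, xi in zip(row, x)) for row in self.router]
        chosen = sorted(range(len(logits)),
                        key=lambda i: logits[i], reverse=True)[:self.top_k]
        gates = softmax([logits[i] for i in chosen])
        out = [0.0] * self.dim
        for gate, idx in zip(gates, chosen):
            y = [sum(w * xi for w, xi in zip(row, x))
                 for row in self.experts[idx]]
            out = [o + gate * yi for o, yi in zip(out, y)]
        return out, chosen

    def total_expert_params(self):
        return len(self.experts) * self.dim * self.dim

    def active_expert_params(self):
        return self.top_k * self.dim * self.dim


layer = ToyMoELayer(n_experts=8, dim=4, top_k=2)
output, chosen = layer.forward([1.0, 0.5, -0.3, 2.0])
print("experts used for this input:", chosen)
print("total expert parameters:", layer.total_expert_params())
print("parameters active per input:", layer.active_expert_params())
```

With these toy sizes the layer holds 128 expert parameters but touches only 32 of them per input, which is the property the answer above relies on: capacity grows with the number of experts, while per-token compute grows only with the number of experts actually selected.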

Synced: Since Yuanshi is an application company, have you considered using some open source models during the research and development process? This might be more economical.

Li Yan: Our goal is not to be a model-level company, but we still choose to develop our own big models because we believe that other people’s models do not serve our goals. We are a product company driven by our own big models.

We have not made any attempts at a business model at the model level. This is related to my personal understanding. Some people think that big models are like water and electricity, which means that once I have built a good big model, you don’t need to build it anymore and you can just use my capabilities. But we believe that the greater significance of big models lies in the ability to provide users with the ultimate service and scenario-based capabilities. In a fixed scenario, it provides better service to users and provides an experience that was not available before.

In addition, it has been shown that what fine-tuning alone can change is limited. Because what we do is a substantial departure from existing approaches, we need to make major changes to the underlying model architecture. We have also compared our own models with open source models, and the results show that our own models perform much better. Because this model is built entirely for our scenario, a lot of work has gone into everything from the construction of training data to the design of algorithms.

Synced: You were one of the earliest people in China to explore multimodality. Do you have a timetable for this?

Li Yan: At present, the text model is still the core of everything and the foundation of intelligence.



Product value:

Able to pay more attention to the personalized needs of users

Synced: Yuanshi Technology’s product form is actually different from almost all large-model consumer products on the market. Why do you want to define such a product?

Li Yan: Our product does not target a specific group of people. We are targeting a wide audience, and we are not a vertical content community. We believe that as AI generation and distribution capabilities improve, the boundaries between content verticals will become increasingly blurred in the AI era.

At the product level, our product currently has two functions, one is Feed and the other is Chat. We call it "Ask Xiaobai". On the one hand, users can ask it any question in their lives. On the other hand, Xiaobai "asks": based on the questions users put to the AI, Xiaobai will also take the initiative to care about users and push information to them. The name Xiaobai is meant to give users a sense of security and intimacy, to move away from a cold or aggressive AI, and to stay close to users.

Synced: So can it be understood as a content product with AI capabilities?

Li Yan: Yes. In addition, it is also a real-time online friend that understands your preferences. As a user, you can give it tasks when you have something that needs doing, and when it has no task, it can observe you and proactively do something to help you.

Synced: Are all the content in the feed from AIGC? How do you ensure the quality of this content?

Li Yan: When a big model produces content, it first needs to know what kind of content users like, and then generate and organize high-quality articles on those topics. The first is the ability to understand, and the second is the ability to generate. At present, big models still have a lot of room for improvement in both. This is also why we started the company: we believe we have the ability to greatly improve both.



Synced: Your product looks a bit like the AI version of Zhihu, Xiaohongshu, and Toutiao. Compared to these, what are the differences and advantages?

Li Yan: First of all, we pay more attention to the personalized needs of users. The most basic principle behind the recommendation systems of all the products you just mentioned is collaborative filtering: if one user likes A and B, and another user likes A and C, then B and C are treated as similar, so C is recommended to the first user and B to the second. This collaborative filtering method has a very obvious problem: it always ends up recommending a few popular mainstream categories to you.

Why? Because whatever topic you like, you are likely to also like beautiful women, handsome men, and entertainment, just like other people who like that topic. So the system will eventually conclude that what you actually like is entertainment, handsome men, and beautiful women.

This method has its advantages, and can quickly drive continuous growth in user time spent. But its problem is that it buries the user's personal and niche interests, and makes it difficult to understand the user in detail.

We do this based on a big model. We first want to take care of your personalized interests, rather than just pushing you the most popular handsome men, beautiful women, or entertainment content. A recommendation system that only pushes the latter is not a truly personalized recommendation system.
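The collaborative filtering principle described above, and the way it drifts toward popular categories, can be sketched with a tiny item co-occurrence recommender. This is a textbook simplification, not the system used by any of the products mentioned; the function names and the toy user data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def item_cooccurrence(likes):
    """Count how often each ordered pair of items is liked by the same user."""
    co = defaultdict(int)
    for items in likes.values():
        for a, b in combinations(sorted(items), 2):
            co[(a, b)] += 1
            co[(b, a)] += 1
    return co


def recommend(user, likes, top_n=3):
    """Rank items the user has not seen by co-occurrence with liked items."""
    co = item_cooccurrence(likes)
    seen = likes[user]
    scores = defaultdict(int)
    for (a, b), count in co.items():
        if a in seen and b not in seen:
            scores[b] += count
    # Highest score first; ties broken by name for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# The example from the interview: one user likes A and B, another A and C.
likes = {"user_1": {"A", "B"}, "user_2": {"A", "C"}}
print(recommend("user_1", likes))  # ['C']
print(recommend("user_2", likes))  # ['B']

# Hypothetical data in which a popular category co-occurs with everything.
skewed = {
    "u1": {"chess", "celebrity"},
    "u2": {"gardening", "celebrity"},
    "u3": {"astronomy", "celebrity"},
    "u4": {"chess", "astronomy"},
    "u5": {"chess", "go"},
}
print(recommend("u4", skewed, top_n=2))  # ['celebrity', 'go']
```

In the second dataset, the user who likes chess and astronomy is offered celebrity content ahead of go, simply because celebrity content co-occurs with both of that user's interests. That is the burying of niche interests the answer describes.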

Therefore, a sufficiently intelligent recommendation system should theoretically be able to take into account all the interests of users, whether they are general interests or niche interests. However, this is far from enough.

Synced: Why did you choose this direction when starting your business, instead of the currently common chatbot or emotional companion product forms?

Li Yan: We believe that LLM, combined with recommendation, has the opportunity to define a new type of interaction: a smooth experience that does not require "active" interaction. Currently, pure chat-type interaction still has a certain usage threshold, because it requires users to actively ask questions. To some extent, this limits its reach among a wider range of users. As for the recommendation products we all use today, although people use them a lot, we still see users repeatedly uninstalling and reinstalling them. Repeated reinstallation means the user cannot do without them, but repeated uninstallation means the user is not 100% satisfied. This is what makes us believe that recommendation products still have great opportunities.

On this basis, we believe that our team's background is very suitable for doing this. I personally and my team have extensive experience in search, AI research and large-scale product implementation.



Synced: However, content products currently tend to face unclear commercialization paths and limited commercial success. What do you think about this?

Li Yan: We are still at the stage of fully demonstrating our user value. Commercial value is only meaningful to discuss once there is substantial user value. The strong monetization ability of large-scale content products has already been demonstrated by many products, such as Kuaishou.

Synced: Coming back to the product, what is the value of better answering capabilities to the product?

Li Yan: I think there are two reasons. The first is that the better your answers are, the higher the user stickiness will be. In this way, you can get more user signals and understand the user better. Ultimately, the system can create content that the user likes and really needs based on these, and continuously form a positive experience and data cycle.

Synced: If we think optimistically, what impact might the gradual maturity of generative recommendation algorithms have on the content industry? In your imagination, what would a mature “Ask Xiaobai” look like?

Li Yan: Generative recommendation injects new vitality into the content track, making it possible for this sector to undergo huge changes rather than painstaking incremental improvements.

At present, large models and related technologies are advancing by leaps and bounds, but the bottleneck in communication between humans and AI is obvious. We have the ability to do better in both aspects. "Ask Xiaobai, Xiaobai asks": we hope to greatly promote the popularization of AI technology and let the ordinary users who need AI most feel its power.