What is the dispute over open and closed source of large models?
2024-08-14
Neither side of the debate can negate the other's market value, and the two kinds of market demand will coexist for a long time
Text|Wu Junyu and Xu Wenpu
Since the beginning of this year, entrepreneurs, investors, and startups in the AI (artificial intelligence) industry in both China and the United States have been caught up in the same debate: should large models be open source or closed source?
In China, the debate centers on Baidu founder Robin Li. In April this year, he said publicly, "People used to think that open source was cheap, but in fact, in large-model scenarios, open source is the most expensive. Open source models will fall further and further behind."
This view has met with opposition. Critics include Alibaba Cloud CTO (Chief Technology Officer) Zhou Jingren, Baichuan Intelligence CEO Wang Xiaochuan, and Cheetah Mobile CEO Fu Sheng. In May this year, Zhou Jingren said in a media interview, "Open source's contribution to global technology and its ecosystem is unquestionable. This has been proven many times around the world and no longer needs to be discussed."
In the United States, the debate is even more intense. Tesla founder Musk sued the AI startup OpenAI. Musk was one of OpenAI's main founders and investors in 2015. He believes that OpenAI, led by current CEO Altman, has broken its promise to "operate as a non-profit organization and make AI open source." Two well-known Silicon Valley investors, Andreessen, founder of a16z, and Khosla, founder of Khosla Ventures, have clashed on social media over many rounds. The former believes that closed source models will lead to monopoly by giants and undermine academic research. The latter believes that large models are economic weapons and should not be open source.
Open source is a software development model in which the source code is released for free and the project survives on community donations. Developers can freely download, modify, and distribute the software, report bugs (software defects or errors), and suggest optimizations. This collective innovation accelerates software iteration.
An open source model is one that can be used for free and whose technical details, such as model parameters, have been published; a closed source model is one that requires payment and whose technical details have not been published.
Put simply, open source is roughly equivalent to free, but you have to buy the groceries and cook for yourself; closed source is roughly equivalent to paid, like eating at a restaurant and getting better service.
Should large models be open source or closed source? So many commercial interests and technical opinions are tangled together that the facts are often obscured. Still, several things behind this debate are clear.
First, different business strategies lead companies to choose different technical routes. Companies such as Baidu and OpenAI, which hope to commercialize their large model businesses quickly, have chosen closed source; companies such as Alibaba Cloud and Meta, which make their profits from cloud computing or advertising, can choose open source to make the pie bigger.
Second, market demand for open source and for closed source will coexist for a long time, and neither can simply be judged the better. Each has its own applicable scenarios, and the choice of model depends on market demand. This will not change at the will of model vendors.
Third, there are essential differences between open source models and open source software. Open source software publishes its source code and most technical details. Open source models are more like a free technical black box: the model parameters are open, but technical details such as source code, training data, and training process are rarely disclosed.
In addition, the dispute over open and closed source in China's AI industry is more a matter of commercial competition. Open source knows no borders, and this idea has been widely accepted. However, as competition between China and the United States in the AI industry intensifies, voices in the American industry opposing open source are growing louder.
Who is open source and who is closed source?
The development of large models is still in its early stages and still requires exploration and trial and error. The line between open source and closed source is not clearly drawn. Faced with the choice, companies have taken three different paths.
The most radical path is to offer only open source models. Few companies take it, and Meta is one of the few. The advantage is that it attracts more users; the problem is that there is no profit model, so only large companies can afford it.
Meta's Llama 3 is the open source model with the most users in the world. Meta's main business is social media (such as Facebook and Instagram), with a net profit of $39 billion in 2023. Meta has the urge to explore new businesses but is under no pressure to make a profit from models. It can therefore afford to offer only open source models and set profitability aside for the time being.
A middle route is to run open source and closed source in parallel, which is very flexible. Companies can acquire users through open source and earn revenue through closed source. This gives developers room to choose and gives companies room for trial and error.
Companies that have chosen this path include Microsoft, Google, Alibaba Cloud, Tencent Cloud, and AI startups such as Mistral AI, Zhipu AI, and Baichuan Intelligence. The common practice when running both in parallel is to attract users with free open source models and steer them toward larger, better-performing closed source models.
For example, Microsoft's main commercial model is OpenAI's GPT-4 series, but it has also open sourced the small model Phi-3 Mini; Alibaba Cloud has open sourced more than a dozen models with 500 million to 110 billion parameters, and also offers closed source foundation models and industry models; Google has open sourced the Gemma series of small models, and also offers the closed source Gemini series of foundation models; Mistral AI and other startups have open sourced previous-generation models whose performance lags, steering users to pay for the stronger current-generation models.
The problem with running open source and closed source in parallel is that the two sometimes undercut each other commercially. Some customers who have used the free open source model will not go on to use the paid closed source model, and the model vendor loses part of its income.
A technical staff member at a Chinese AI software service provider told Caijing in July this year that they had recently used Alibaba Cloud's Tongyi Qianwen open source model (Qwen2) for further training and fine-tuning to serve the tourism bureau of a local city. The order exceeded 10 million yuan. They were the beneficiaries, but Alibaba Cloud received no income from it. Caijing checked the Qwen2 license agreement on GitHub (the world's largest code hosting platform). The agreement states that "no commercial use request is required." In other words, no payment is required to use Qwen2 commercially after training and fine-tuning it.
The long-term value of open source is to enlarge the model market pie. An Alibaba Cloud staff member told Caijing that it is normal for users to modify open source models for commercial use, and anyone doing open source must be prepared for this. Although Alibaba Cloud has not captured the whole pie for now, it has enlarged the pie for the industry, and in the long run it will benefit.
A chemical reaction will occur only when large models are widely used by different customers such as governments, enterprises of all sizes, and developers. The large model industry must build an ecosystem to form a growth flywheel. This trend can be seen in ModelScope, an AI open source community under Alibaba Cloud. As of July this year, the ModelScope community had more than 5.6 million developers, more than 5,500 high-quality models, and thousands of datasets, making it the largest open source model community in China.
A more optimistic view is that open source and closed source can even form an upstream-downstream relationship. Open source sits upstream, responsible for community participation, technology iteration, customer acquisition, and keeping the technology ahead of peers. Closed source sits downstream, responsible for monetization.
Lanzhou Technology is a Chinese large model startup. Li Jingmei, partner and co-CEO of Lanzhou Technology, told Caijing that open source is both a technical strategy and a business strategy. It can shape the thinking of the developer community and of potential customers' technical teams. Open source and closed source are not contradictory. The customer feedback cycle for closed source models is relatively long, but community developers give feedback on open source models quickly, which helps the company iterate its products faster.
An AI strategic planner at a leading Chinese technology company believes that for leading cloud vendors such as Alibaba Cloud, running both open source and closed source products is better than focusing on closed source alone. Alibaba Cloud's revenue comes mainly from the four major components of the public cloud (computing, storage, network, and database).
Free open source models stimulate customers' business data consumption, which in turn drives sales of these basic cloud products.
Offering only closed source models is the simplest, most direct, and most logical approach. Large companies that take this path believe that large models must be closed source if they are to be commercialized; otherwise the commercial loop cannot be closed.
AI startup OpenAI (with its GPT-4 series), Amazon (an investor in AI startup Anthropic, which offers the Claude 3.5 series), Huawei (Pangu large model), Baidu (Wenxin large model), and other companies have chosen this path. Companies that use large models usually pay according to the number of API (application programming interface) calls, much like paying for utilities such as water and electricity according to usage. The closed source business model is, in theory, the healthiest. Over the past year, the revenue growth rates of Microsoft Azure, Amazon AWS, and Google Cloud have all risen by about 5 percentage points, and their profit levels have also risen slightly. This is believed to be the pull of large models.
However, in China, closed source models are unlikely to be truly profitable in the short term. In May this year, a price war began in the Chinese model market. The purpose of the price cuts is to stimulate customer demand and expand the market. ByteDance's cloud service Volcano Engine, Alibaba Cloud, Tencent Cloud, and Baidu Smart Cloud have successively cut the price of large model calls by more than 90%. The gross profit margin on large model calls has fallen from more than 60% to below zero.
The head of the large model business at a Chinese cloud vendor believes that large model calls have entered the "era of negative gross margin": the more the models are called, the greater the loss.
The difference is that big companies such as Alibaba, ByteDance, and Baidu can absorb the loss, while small and medium-sized enterprises and startups cannot.
He and an executive at a large model startup expressed a similar view: different companies have different DNA and therefore different model business strategies. Cloud is Alibaba Cloud's core business, and the ultimate goal of its open source models is to sell more cloud services. Volcano Engine is backed by ByteDance, whose advertising business can keep it funded. Volcano Engine's share of the cloud computing market is far lower than Alibaba Cloud's. As the saying goes, "the barefoot do not fear those wearing shoes," and it hopes to seize more market share through the price war. AI is Baidu's core business. Baidu hopes to make a profit from large models, so it emphasizes the value of closed source models.
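
The margin figures quoted above (a gross margin above 60% before the price war, then call prices cut by more than 90%) can be checked with a rough calculation. The sketch below is illustrative only: it normalizes the old price to 1.0 and assumes the vendor's cost per call did not change during the price cut, which the article does not state. Under that assumption the margin comes out deeply negative, consistent with the "below zero" figure reported, but the exact number is not a reported figure.

```python
def gross_margin(price: float, unit_cost: float) -> float:
    """Gross margin expressed as a fraction of the selling price."""
    return (price - unit_cost) / price


# Normalized price per call before the price war (hypothetical unit).
old_price = 1.0
# Gross margin before the cuts; the article says "more than 60%".
old_margin = 0.60
# Implied cost per call, assumed unchanged after the cut (an assumption).
unit_cost = old_price * (1 - old_margin)

# Price after a 90% cut; the article says "more than 90%".
new_price = old_price * (1 - 0.90)
new_margin = gross_margin(new_price, unit_cost)

# How far unit cost would have to fall just to break even at the new price.
required_cost_cut = 1 - new_price / unit_cost

print(f"margin after the cut: {new_margin:.0%}")
print(f"cost reduction needed to break even: {required_cost_cut:.0%}")
```

Read this way, the vendors' bet is that growth in call volume and cheaper computing power will bring the cost per call down far enough to close that gap.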
What is the debate? What is the consensus?
The debate over open and closed source models in China has several focal points. First, is there a difference between open source models and open source software? Second, which is stronger, the open source model or the closed source model? Third, which is more expensive?
The first question is whether open source models differ from open source software. The answer is that they differ greatly. Most open source models are not fully open source. They are more like black boxes that can be used for free than like the transparent boxes of open source software.
Open source software publishes its source code, and developers can learn most of the software's technical details from that code. The core logic of open source software is that developers across society help the software maker find bugs and suggest optimizations. Such community-based development both lowers development cost and speeds up iteration. The mobile operating system Android and the database software MySQL succeeded in this way.
Open source models are far more complex than open source software. The items that can be opened include source code, parameter weights, model architecture, training data, training process, and more. In March this year, two scholars at Radboud University in the Netherlands, Liesenfeld and Dingemanse, published a paper comparing how open various open source models actually are. The paper shows that the most powerful open source models usually open only their parameter weights. One explanation is that, to keep a model's performance lead, the vendor cannot reveal the whole "recipe." Take Llama 3, the world's most powerful open source model: it has only partially opened its parameter weights and model architecture, while the source code, training data, and training process remain closed.
The value of the open source concept to the industrial ecosystem is unquestionable. Xin Zhou, general manager of Baidu Smart Cloud's AI and large model platform, told Caijing in July this year that open source models will enrich model applications and industry models. But he opposes conflating open source models with open source software.
The reason is an essential difference between the two: unlike open source software, open source models cannot rely on community developers to help improve product performance and reduce R&D costs. A base model can be improved only through training by the model vendor itself. Open source models lag commercial models in fine-tuning and inference optimization. They place high technical demands on developers, and the actual cost of using them is not low.
The second question is which is stronger, the open source model or the closed source model. The fact is that closed source models are usually more powerful, but the performance gap between the two is narrowing.
The Center for Research on Foundation Models (CRFM) at Stanford University has long published performance rankings of large models worldwide. Its Massive Multitask Language Understanding (MMLU) rankings as of July 24 show that Llama 3.1 is the only open source model among the top ten, while Claude 3.5 (from Amazon-backed Anthropic), GPT-4o (from Microsoft-backed OpenAI), Gemini 1.5 Pro (developed by Google), and the others are all closed source.
Li Jingmei believes that within a single company, the closed source model will necessarily outperform the open source one. Across the industry, however, closed source models are not necessarily better than open source models. Because large models are iterated every 6 to 12 months, some open source models may evolve faster.
Rankings from evaluation organizations show this trend. LMSYS (Large Model Systems Organization), initiated at the University of California, Berkeley, also runs long-term evaluations and rankings of model performance worldwide. Meta's Llama 3.1 and Alibaba Cloud's Qwen2 are climbing rapidly in its rankings, and Llama 3.1 even surpasses most closed source models.
The head of the large model business at a Chinese cloud vendor gave two reasons for the narrowing gap. First, over the past year, foundation models have generally hit a bottleneck in performance improvement. Second, open source models have attracted a large number of developers; although these developers cannot directly improve model performance through code feedback, they have raised the overall level of model research, which indirectly helps open source models improve.
The third question is which is more expensive, the open source model or the closed source model. The conclusion is that performance is the determining factor. The cost of using a model is directly tied to its performance: the stronger the performance, the lower the long-term cost of use, because fewer calls are needed to complete a task.
Open source models are free and usually give the impression of being cheap. Xin Zhou explained that a large model application is a comprehensive "technology + services" solution, and companies need to look at the overall bill. Besides providing complete models and toolchains, closed source vendors also provide training and technical services to help companies get started quickly.
Open source models look free, but achieving the same results as closed source later requires a great deal of manpower, money, and time, so the overall cost is higher.
In the long run, the decisive factor in the application cost of open and closed source models is inference cost. Closed source models usually outperform open source models of the same parameter scale and have a lower overall cost. Xin Zhou worked through an example: suppose an enterprise can deploy an open source model for free, while deploying a closed source model costs 500,000 yuan. At the initial investment stage, the open source model is cheaper. At the later usage stage, if the closed source model's overall performance is 20% better, it can save a heavy-usage company tens of thousands of yuan a day. In the end, its long-term cost of use will be far lower than that of the open source model.
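
Xin Zhou's comparison amounts to a break-even calculation: an upfront gap of 500,000 yuan set against a daily saving during use. The sketch below is a minimal illustration of that arithmetic. The 500,000 yuan figure comes from the example above; the daily savings tried in the loop are hypothetical values chosen from the "tens of thousands of yuan a day" range, and the function name is an assumption made for illustration.

```python
import math


def break_even_days(upfront_gap_yuan: float, daily_saving_yuan: float) -> int:
    """Days of use after which the daily saving has repaid the upfront gap."""
    if daily_saving_yuan <= 0:
        raise ValueError("daily saving must be positive to ever break even")
    return math.ceil(upfront_gap_yuan / daily_saving_yuan)


# Deployment cost gap quoted in the example: closed source 500,000 yuan, open source free.
UPFRONT_GAP_YUAN = 500_000

# Hypothetical daily savings for a heavy-usage company (assumed values).
for daily_saving in (10_000, 20_000, 50_000):
    days = break_even_days(UPFRONT_GAP_YUAN, daily_saving)
    print(f"saving {daily_saving} yuan/day -> break-even after {days} days")
```

The conclusion depends entirely on the daily saving actually materializing; a company with light usage would see a much longer payback period.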
Who is using the open source model? Who is using the closed source model?
Is the open source model better, or the closed source model? The answer is decided not by the model vendors on the supply side but by the corporate customers on the demand side.
In public, companies trade barbs constantly. However, technical experts at many cloud vendors told Caijing that these arguments cannot negate either side's market value, and the two kinds of demand will coexist for a long time. Seen another way, the war of words may even help both sides win a bigger market.
In fact, most corporate customers do not care whether a model is open source. Xin Zhou said that after talking with many large corporate customers, he found that many factors determine whether the head of an IT department adopts a model, usually in this order of priority: effectiveness, performance, price, and security. Open or closed source is not a decisive factor.
In most companies' "toolbox," open source and closed source models are complementary. Large companies usually roll out large models in stages.
In the early stage, the IT department surveys the performance and characteristics of the open source and closed source models on the market. Different models have different strengths: some are strong in language and speech, others in data and statistics. At this stage, free open source models are used for POC (proof of concept) testing to verify the business effect.
In the middle stage, the company first launches phase-one projects in business scenarios that are easy and pay off quickly, such as marketing, customer service, and knowledge bases. It not only purchases closed source models but also trains and fine-tunes a set of its own models based on open source.
It lets the internal and external models "race" against each other, compares their effects and costs, and shifts usage between them at any time.
In the later stage, depending on the results, the second and third phases are planned step by step in business scenarios that are harder and slower to pay off. At this point, building an independent and controllable foundation model or industry model often costs tens of millions of yuan.
The open source model is free, but it cannot be used out of the box. It takes time to develop, and no one stands behind it if things go wrong. With a closed source model, the customer gets a mature product directly, with full pre-sales, in-sales, and after-sales service. Simply put, the open source model is like buying groceries and cooking for yourself, while the closed source model is like paying to eat at a restaurant.
Xin Zhou's view is that open source models suit academic research, some small and medium-sized enterprises with extremely limited IT budgets, and self-controlled internal research projects at some large enterprises, but not large-scale external commercial projects. For serious commercial projects worth millions or tens of millions of yuan, closed source models remain the best choice.
The open source model is no free lunch. Large enterprises face many hidden costs in using open source models, such as purchasing computing power and adapting software. The technical director of a Chinese intelligent marketing service provider focused on overseas markets told Caijing in July this year that his company relies heavily on cloud services, with annual R&D spending exceeding 80 million yuan. Over the past two years, the company has been using more than a dozen closed source models at the same time, and no open source models. In his view, open source models take time and manpower to tinker with.
Most open source models cannot be used out of the box and no one stands behind them, so they can only be regarded as "toys." He prefers to manage more than a dozen closed source models and switch among them at any time according to price and performance, which he finds the most cost-effective approach.
The head of IT at a large joint-stock commercial bank does not consider it a big problem that open source models cannot be used out of the box. He told Caijing in December 2023 that his team had used models from Alibaba (Tongyi open source models), Meta (Llama open source models), Baidu (Wenxin series), and Zhipu (GLM series) in a self-developed application for auditing compliance reports. Open source models suit small projects like this, since they can be tested in a free POC and modified as needed. His IT team has dozens of people, plus outsourced IT service companies, which is enough to handle these problems. But he also believes that closed source models are better suited to large projects worth millions or tens of millions of yuan, because closed source models are stable and reliable, and there is a model company to stand behind them.
Fully training an industry model on top of an open source model costs tens of millions of yuan, and also requires purchasing AI chips and building one's own data center. The technical staff member at the AI software service provider mentioned above concluded that open source models suit some central state-owned enterprises that have high requirements for data security and self-control and are less sensitive to cost. They will use open source models to train their own industry models, because "open source model + private cloud" meets their data security and self-control requirements.
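
The practice described in this section, keeping several models in the toolbox, letting them "race," and switching according to price and performance, can be pictured as a simple selection rule: among the models that clear a quality bar, use the cheapest. The sketch below is only an illustration of that idea. The model names, quality scores, and costs are all hypothetical, and real selection would also weigh the other factors mentioned above, such as security.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str             # hypothetical label, not a real product
    quality: float        # score on the company's own evaluation set, 0 to 1
    cost_per_task: float  # yuan per completed business task


def pick_model(candidates: Sequence[Candidate], min_quality: float) -> Optional[Candidate]:
    """Return the cheapest candidate that meets the quality bar, or None."""
    eligible = [c for c in candidates if c.quality >= min_quality]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c.cost_per_task)


# Hypothetical results of an internal "race" between purchased and self-tuned models.
candidates = [
    Candidate("closed-model-a", quality=0.90, cost_per_task=0.030),
    Candidate("closed-model-b", quality=0.86, cost_per_task=0.012),
    Candidate("fine-tuned-open-model", quality=0.82, cost_per_task=0.008),
]

choice = pick_model(candidates, min_quality=0.85)
print(choice.name if choice else "no model meets the quality bar")
```

Rerunning the selection whenever prices or evaluation scores change is what makes switching "at any time" cheap for a company that already manages a dozen models.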
The debate over open and closed source for large models in the Chinese market is purely a commercial matter. In the international market, however, it involves further factors such as antitrust and national interests.
After the price war in May this year, China's large model calls entered the "era of negative gross margin." Open source and closed source models face the same problem: large models cannot directly turn a profit.
"The knockout round in the large model market has already begun." The head of the large model business at a Chinese cloud vendor explained that negative gross margin on large model calls means that, in the short term, the more calls are made, the more the cloud vendor loses. Chinese cloud vendors are betting that after call prices are cut by 90%, call volume will grow exponentially over the next one to two years. In the long run, cloud vendors' computing costs will be diluted as customer demand grows, and they will still be able to turn a profit in the end. Even if this bet fails, a batch of model vendors will die in the price war, and the surviving vendors will pick up the pieces.
Many industry insiders expressed the same view to Caijing: this knockout round will last one to two years, and only three to five foundation model companies will survive.
An Xiaopeng, executive member of the China Information Technology Hundred Forum and director of the Alibaba Cloud Intelligent Technology Research Center, told Caijing in July this year that there is no "battle of a hundred models" in China, nor even a battle of ten. Large models require continuous investment, the capacity to run clusters of 10,000 or even 100,000 GPU cards, and commercial returns. Many companies lack these capabilities. In the future, there will be only three to five foundation model vendors in the Chinese market.
Who benefits from the price war?
Who will have the last laugh? The AI strategic planner at the leading Chinese technology company mentioned above believes that in this round of the price war, Alibaba Cloud and ByteDance's Volcano Engine have the deepest reserves. Alibaba Cloud can make its profit from the cloud, and Volcano Engine is funded by ByteDance's advertising business. In a price war, Baidu is weaker than Alibaba and ByteDance. However, Baidu's Wenxin large model is technically strong, and a group of customers will be willing to pay for the technology, which will help Baidu withstand the price war. He added that large model startups in the Chinese market will face severe tests over the next one to two years. They can either become project-based model development companies or turn to vertical industry models.
The overall competition in China's large model market matters far more than the partial contest between open source and closed source models. The direction of the overall competition will directly determine the outcome of the partial one.
An Alibaba Cloud staff member said frankly that open source and closed source models each have their advantages, and Alibaba Cloud hopes to make AI more inclusive. Whether open or closed source, the core purpose is to give developers more choices. Alibaba Cloud has chosen to walk on both legs, offering open source models in all sizes and all modalities as well as closed source models. Another person in charge of the large model business at a Chinese cloud vendor believes that open source has no business model. In the Chinese model market, only leading companies, or the very few startups that can keep raising funds, can persist with open source. In the end, there may be only one or two open source models left in the Chinese market.
Model vendors train a new generation of models roughly every 6 to 12 months.
In the Chinese model market, as profit pressure increases, open sourcing may become more and more "strategic": companies will tend to open source previous-generation models with older technology and fewer parameters, and steer users to pay for closed source models with newer technology and more parameters.
The competition between open source and closed source models will not end in the short term. Some companies can even run both at the same time.
This is not without precedent in the IT industry. Databases have been around for more than 60 years, and the first open source database for more than 50. The database market is still lively, with a variety of closed source and open source databases, and new database brands continue to emerge. Database giant Oracle even has both a closed source RDBMS and the open source MySQL database.
Technical experts at many cloud vendors believe that open source and closed source models will coexist for a long time. The large model market will gradually grow through competition among different technical routes.