AI Weekly | Large models can't tell which is bigger, 9.11 or 9.9; OpenAI releases lightweight model GPT-4o mini

2024-07-21

Large models can't tell which is bigger, 9.11 or 9.9

A math problem at elementary school level stumped a number of AI models in China and abroad: which is bigger, 9.11 or 9.9? On July 17, a reporter from China Business Network put the question to 12 large models. Alibaba's Tongyi Qianwen, Baidu's Wenxin Yiyan, MiniMax and Tencent's Yuanbao answered correctly. ChatGPT-4o, ByteDance's Doubao, Moonshot AI's Kimi, Zhipu's Qingyan, 01.AI's Wanzhi, Jieyue Xingchen's Yuewen, Baichuan Intelligent's Baixiaoying and SenseTime's Shangliang all answered incorrectly, though they erred in different ways. Most of the models that got it wrong compared the digits after the decimal point as if they were whole numbers and concluded that 9.11 is greater than 9.9.

Comment: Behind these mistakes lies a long-standing problem: large models are weak at mathematics. Some industry insiders believe that generative language models are, by design, better suited to handling text than numbers. Targeted corpus training, however, may gradually improve the models' ability to answer science questions.
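
The mistake described in the item above can be reproduced in a few lines. The following is a minimal sketch in Python, not a description of how any of the tested models works internally: the function names are hypothetical, and the second function only imitates the reported error of treating the digits after the decimal point as whole numbers.

```python
from decimal import Decimal


def larger_by_value(a: str, b: str) -> str:
    """Return the larger of two decimal strings, compared by numeric value."""
    return a if Decimal(a) > Decimal(b) else b


def larger_by_fraction_digits(a: str, b: str) -> str:
    """Imitate the reported mistake (hypothetical helper).

    The integer parts are compared first; the digits after the decimal
    point are then compared as whole numbers, so "11" beats "9".
    """
    a_int, a_frac = a.split(".")
    b_int, b_frac = b.split(".")
    key_a = (int(a_int), int(a_frac))
    key_b = (int(b_int), int(b_frac))
    return a if key_a > key_b else b


print(larger_by_value("9.11", "9.9"))            # 9.9
print(larger_by_fraction_digits("9.11", "9.9"))  # 9.11
```

The second comparison is the correct rule for software version numbers, where 9.11 does come after 9.9, which is one way the two readings can be confused.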

OpenAI releases lightweight model GPT-4o mini, reducing model cost by 99% in two years

On July 18 local time, OpenAI released GPT-4o mini, a new lightweight large model. GPT-4o mini will reportedly replace GPT-3.5 Turbo in the ChatGPT chatbot with immediate effect, and enterprise users will get access next week. "We expect GPT-4o mini to expand the application of artificial intelligence and make artificial intelligence more affordable." According to an article on the OpenAI official website, GPT-4o mini is priced at 15 cents ($0.15) per million input tokens (the units of text a model processes) and 60 cents ($0.60) per million output tokens, which is 60% cheaper than GPT-3.5 Turbo. OpenAI said it will continue to cut costs while improving model performance. Compared with the text-davinci-003 model of 2022, the cost of GPT-4o mini has fallen by 99%.

Comment: Although OpenAI has not yet released its next-generation model, GPT-5, it keeps updating its models on the basis of existing capabilities and continues to drive down the cost of large models. Other model developers are also pushing lightweight models this year. Google released the lightweight Gemini 1.5 Flash in May, and Anthropic released the Claude 3 series in March, including the lightweight Claude 3 Haiku. Models with fewer parameters have shown great potential this year: performance can be improved by training on more data rather than by adding parameters.
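
To make the per-million-token prices quoted above concrete, here is a minimal Python sketch that estimates the cost of a workload. Only the two prices come from the article; the function name and the example token counts are hypothetical, and the calculation ignores anything not mentioned in the item (discounts, other fees).

```python
from decimal import Decimal

# Prices quoted in the item above, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = Decimal("0.15")
OUTPUT_PRICE_PER_MILLION = Decimal("0.60")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of a workload from its input and output token counts."""
    input_cost = INPUT_PRICE_PER_MILLION * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_PRICE_PER_MILLION * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 2 million input tokens and 500,000 output tokens.
print(estimate_cost_usd(2_000_000, 500_000))
```

For this assumed workload the input and output each cost 30 cents, 60 cents in total. Decimal is used instead of float so that the cent amounts stay exact.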

Six large model developers respond to the problem of large models' poor numerical abilities

A reporter from China Business Network recently interviewed six large model developers: Ali Tongyi, the Tencent Hunyuan team, Kimi, MiniMax Conch, Xueersi Jiuzhang and NetEase Youdao. In the interviews, they addressed the question of why large models are poor at mathematics. Wang Xiaoming, product manager at Ali Tongyi Laboratory, said that questions of this kind are common mathematical calculation and logical reasoning problems, and are cases that developers often test during model training and use. Whether a large model answers "right" or "wrong" is in fact a matter of probability. The Tencent Hunyuan team said that a large model is itself a probabilistic model, and that it is difficult to make it solve such numerical calculation or comparison problems reliably under all circumstances.

Comment: "Which is bigger, 9.11 or 9.9?" is not a difficult question for humans, but it is not necessarily an easy one for large models. Taken together, the responses from the developers' representatives make two points: large models have not yet accurately mastered the rules for calculating with or comparing numbers, and human exploration of what large models can do is still at a very early stage. Many industry insiders also believe that such mistakes will need to be addressed by raising the intelligence of the underlying base models, by improving the training data and by using external tools. The final solution may lie in the enhanced capabilities of the next generation of models. Discovering such cases helps developers better understand the boundaries of large models' capabilities.

The AI Act will come into force across the EU on August 1

The European Union's Artificial Intelligence Act (EU AI Act), the world's first law of its kind, will take effect throughout the EU on August 1. It is also the most comprehensive piece of AI regulation issued anywhere in the world so far. The Act lays a foundation for global AI regulation and aims to achieve the same "Brussels effect" as the General Data Protection Regulation (GDPR). Under the latest text, companies that violate the rules face administrative fines of up to 35 million euros or up to 7% of annual revenue, whichever is higher.

Comment: The EU has always been at the forefront of technology regulation. The AI Act is the world's first comprehensive AI regulatory law, demonstrating the EU's foresight and leadership in this field. However, the rules will also raise companies' operating costs. You Yunting, a partner at Shanghai Dabang Law Firm, said that since GDPR took effect, companies' costs, especially compliance costs, have risen sharply. The same is expected of the AI Act, which means that companies must invest in complying with the new rules and appoint dedicated staff to study compliance policy. In addition, handling violation notifications and public disclosure systems will add to costs.
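
The "whichever is higher" rule quoted in the item above is a simple maximum of two ceilings. The Python sketch below illustrates only that arithmetic as the article states it; the function name and the revenue figures are hypothetical, and the sketch does not model how an actual fine would be determined.

```python
# Ceilings as quoted in the item above.
FIXED_CAP_EUR = 35_000_000
REVENUE_SHARE_PERCENT = 7


def max_fine_eur(annual_revenue_eur: int) -> int:
    """Return the upper limit of the fine: the higher of the two ceilings."""
    revenue_based_cap = annual_revenue_eur * REVENUE_SHARE_PERCENT // 100
    return max(FIXED_CAP_EUR, revenue_based_cap)


# Hypothetical companies with 100 million and 1 billion euros in annual revenue.
print(max_fine_eur(100_000_000))    # 35000000
print(max_fine_eur(1_000_000_000))  # 70000000
```

Under these figures, the fixed 35 million euro ceiling is the higher one for any company with annual revenue below 500 million euros, and the 7% ceiling takes over above that.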

Scores of seven large models on the "college entrance examination" are released: on the science track they only qualify for second-tier universities

In June, OpenCompass, the evaluation platform of Shanghai Artificial Intelligence Laboratory, released the first AI test results on full college entrance examination papers, showing that the AI test takers scored at most 303 points in Chinese, math and English combined, and failed math. On July 17, OpenCompass released a further test covering more subjects. The team tested seven large AI models on all nine subjects of the college entrance examination so that their results could be compared with university admission score lines.

If AI took the college entrance examination, which universities could it get into? OpenCompass found that on the liberal arts track the best-scoring large model could be "admitted" to a first-tier university, while on the science track it could at most be "admitted" to a second-tier university (based on the score lines of Henan Province, which has the largest number of college entrance examination candidates this year).

Comment: Judging from the graders' evaluation, today's large models still have significant limitations compared with human candidates. After marking the papers, the teachers agreed that although the models showed a good grasp of basic knowledge, their logical reasoning and flexible application of knowledge remained unsatisfactory. Specifically, on subjective questions the models often failed to fully understand the question stem or to resolve pronouns, which led to irrelevant answers; on math questions their problem-solving process was mechanical and illogical, and on geometry questions they often made inferences that violated spatial logic.

Fei-Fei Li incubates a "unicorn": World Labs is valued at over $1 billion

On July 17, it was reported that World Labs, the "spatial intelligence" startup founded by the renowned Chinese-American computer scientist Fei-Fei Li, is valued at over $1 billion. The startup mainly uses human-like processing of visual data to give AI advanced reasoning capabilities.

Since its establishment in April this year, World Labs has completed two rounds of financing, with investors including top technology investor Andreessen Horowitz and AI fund Radical Ventures. It is understood that the company's latest round of financing may reach about US$100 million. Fei-Fei Li, Andreessen Horowitz and Radical Ventures did not respond to requests for comment.

Comment: Fei-Fei Li is a legendary figure, and her entrepreneurial moves have attracted much attention from the industry. She became a tenured professor in Stanford's Department of Computer Science at the age of 33 and a member of the U.S. National Academy of Engineering at the age of 44. She is currently co-director of the Stanford Institute for Human-Centered Artificial Intelligence (HAI). She also led the creation of ImageNet, a landmark achievement in the field of computer vision. Her many outstanding students include Andrej Karpathy, who has worked at OpenAI and Tesla, and Jim Fan, who is currently at NVIDIA, both influential figures in AI.

AI chip and cloud giants are "snatching up" global AI companies

According to Crunchbase, a business data provider, global AI startup financing rose 24% year-on-year to $35.6 billion in the first half of this year, and the second quarter saw the most AI investment of any quarter in recent years. According to public data compiled by a China Business Network reporter, Nvidia has invested in or acquired no fewer than 16 AI-related companies this year, most of which have raised more than $100 million in total. Microsoft, after investing in OpenAI, also took part this year in multiple financing rounds with a total amount of over $100 million. Google has built out its AI ecosystem broadly through its several investment platforms and participated in no fewer than 31 financing rounds. Other active players in this AI investment boom include AMD, Amazon and SoftBank.

Comment: The popularity of AI is directly reflected in investment, although the major players do not share exactly the same investment preferences. It is worth noting that both the semiconductor hardware makers Nvidia and AMD and the cloud vendors Microsoft, Google and Amazon want, to some degree, to invest in large model developers. Cloud vendors are more willing to tie themselves closely to large model startups. The largest financings in the industry this year show that foundation models, autonomous driving, AI data and humanoid robots are the hottest areas.

UK launches antitrust investigation into Microsoft-Inflection AI deal

The UK antitrust regulator, the Competition and Markets Authority (CMA), recently said that it has begun a formal antitrust investigation into the Microsoft-Inflection AI transaction. In March this year, Microsoft agreed to pay AI startup Inflection AI $650 million to license its AI software. In addition, Microsoft also announced the hiring of Inflection AI co-founders Mustafa Suleyman and Karén Simonyan, as well as most of the company's employees.

Comment: Inflection AI is valued at approximately $4 billion. Industry insiders say Microsoft's move amounts to a disguised acquisition of Inflection AI at a low price. Unlike in an acquisition, however, Inflection AI retains its proprietary technology. Beyond the UK, it was reported last month that the US Federal Trade Commission (FTC) is also reviewing the transaction. According to reports, the FTC has issued subpoenas to Microsoft and Inflection AI, requesting relevant documents from the past two years.

Humanoid robot company Zhuji Power completes Series A financing

On July 15, a reporter from China Business Network learned that Zhuji Power, a general-purpose humanoid robot startup, has completed its Series A financing. The round was led by China Merchants Venture Capital and Shangqi Capital, SAIC Group's private equity investment platform, with existing shareholders Frees Capital, Oasis Capital and Mingshi Capital also participating. The amount raised has not been disclosed. Alibaba had previously invested in Zhuji Power as well. Zhuji Power was founded in 2022, and its founder, Zhang Wei, is a tenured professor at the Southern University of Science and Technology. Its products include full-size humanoid robots, quadruped robots, bipedal robots and related solutions.

Comment: Many humanoid robot companies are still raising money, and the sector has seen a steady stream of financing this year. In January, Xingdong Jiyuan announced an angel round of over 100 million yuan. Subsequently, Yushu Technology announced a B2 round of 1 billion yuan, Kepler Exploration Robotics completed an angel round, and Galaxy General Robotics completed an angel round of 700 million yuan. The internet giants Tencent, Baidu and Alibaba have invested in UBTECH, Zhiyuan Robotics and Zhuji Power respectively, while Meituan has invested in Galaxy General Robotics and Yushu Technology. Now that humanoid robot companies have raised funds, the next question is how to mass-produce their robots.

NVIDIA and Mistral AI jointly release the large model Mistral-NeMo

On July 19, NVIDIA and French startup Mistral AI released the Mistral-NeMo large language model, which has 12 billion parameters and a context window (the maximum number of tokens the model can process at one time) of 128,000 tokens. Mistral-NeMo is aimed mainly at enterprise environments, where it is meant to support AI solutions without requiring large amounts of cloud resources.

Comment: Mistral AI has raised 600 million euros this year, and its investors include Nvidia and Samsung. Microsoft also announced an investment of 15 million euros in Mistral AI, which will be converted into equity in Mistral AI's next financing round. While working with Nvidia to launch large models, Mistral AI is also balancing its partnerships among the tech giants. Nvidia, for its part, is getting more deeply involved in the AI ecosystem; it previously open-sourced the Nemotron-4 340B family of models, which developers can use to generate synthetic data for training large language models.