
Zhang Jian, CTO of Communication Brain: Accelerating the last mile of large-model application in the media



Huang Yunling, reporter, Chao News Client

Over the past decade, media convergence has been advanced comprehensively as a national strategy. Emerging technologies have continuously empowered that convergence, and generative artificial intelligence in particular is playing an increasingly important role in the media field. On August 23, the "BIRTV2024 AIGC Generative Artificial Intelligence Innovation and Application Technology Exchange Conference" was held in Beijing. Zhang Jian, deputy general manager and chief technology officer of Communication Brain, was invited to give a keynote speech titled "Accelerating the Application of Media Big Models".

The capabilities of generative artificial intelligence align naturally with the needs of the media industry, and media organizations need to speed up the deployment of large models. To cover the last mile of that deployment, they need to translate the media's many complex multi-task requirements into simple single-task instructions that run in series or in parallel, and to integrate scenario-based large-model instruction sets seamlessly into the editing and production system.

So what practical insights did Zhang Jian share at the conference?

Zhang Jian, deputy general manager and chief technology officer of Communication Brain, delivers his speech. Photo provided by Communication Brain.

The "exclusive presence" of large models in the media industry

From the appearance of the Microsoft XiaoIce AI poetry collection in 2017 to the arrival of multimodal large models such as GPT-4 and Sora, a large number of deep learning methods have been proposed and iterated on, setting off a wave of AIGC development. Generative AI focuses on producing content such as images, text, and video, which closely matches the media's daily content production needs, and it has brought the media industry an opportunity for change.

However, as large models have been widely applied in the media industry, several questions have surfaced. Should a media organization develop its own general-purpose foundation model? Should it deploy a private large model to prevent data leakage? How can large models be integrated with the editing and production systems used in media production?

In response to these questions, Zhang Jian argued that the media industry does not need to join the cutthroat race to build underlying large models; doing applications well matters more. "Deploying large models is not the problem, and computing power and data are not the hard part. The key is how to keep improving capabilities after deployment."

Zhang Jian said at the conference that many large models have not been fully developed and applied around the needs of specific scenarios, nor personalized for different user groups, so the services they offer are impractical. At present, large models are mainly applied to single, simple tasks, whereas media work consists of complex multi-task chains that run in series or in parallel. More specialized prompt engineering alone cannot fix the current separation between content, workflow, and the large-model system. The future direction for media large models, he said, is to translate the media's complex multi-task needs into simple single-task processes and to integrate scenario-based instruction sets seamlessly into the editing system.
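As a rough illustration of what "complex multi-task needs translated into single-task instructions in series or parallel" could look like in practice, here is a minimal Python sketch. It is not Communication Brain's implementation: every function name is hypothetical, and the three single-task functions are plain-Python stand-ins for what would, in a real system, be individual large-model calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


# Hypothetical single-task instructions. Each one is a simple stand-in
# for a single large-model call; none of these names come from the article.
def summarize(text: str) -> str:
    """Keep only the first sentence of the draft."""
    return text.split(".")[0].strip() + "."


def make_headline(summary: str) -> str:
    """Turn a one-sentence summary into an upper-case headline."""
    return summary.rstrip(".").upper()


def extract_keywords(text: str) -> List[str]:
    """Return the distinct words longer than six characters, sorted."""
    words = [w.strip(".,").lower() for w in text.split()]
    return sorted({w for w in words if len(w) > 6})


def run_series(value: Any, steps: List[Callable[[Any], Any]]) -> Any:
    """Series composition: each step consumes the previous step's output."""
    for step in steps:
        value = step(value)
    return value


def run_parallel(value: Any, tasks: Dict[str, Callable[[Any], Any]]) -> Dict[str, Any]:
    """Parallel composition: independent tasks all receive the same input."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(task, value) for name, task in tasks.items()}
        return {name: future.result() for name, future in futures.items()}


def produce_story_package(draft: str) -> Dict[str, Any]:
    """One complex request expressed as two branches run in parallel,
    where the headline branch is itself a two-step series."""
    return run_parallel(draft, {
        "headline": lambda d: run_series(d, [summarize, make_headline]),
        "keywords": extract_keywords,
    })


if __name__ == "__main__":
    print(produce_story_package(
        "Media outlets adopt large models. Editors remain in charge."
    ))
```

The point of the sketch is only the shape of the orchestration: an editor issues one request, and the system breaks it into single-purpose steps that are chained or fanned out behind the scenes.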

The shift from general-purpose large models to media large models

In media creation scenarios, a general-purpose large model can produce only text on its own, not combined text and images, and the videos it generates carry copyright risks. General-purpose large models also plainly fall short in other ways: they are not deeply embedded in media business processes, and they lack scenario adaptation and personalization.

To address this, Communication Brain combined its news-data media asset library with a general-purpose large model and, after fine-tuning, built the Communication Large Model for the media vertical. It provides the media with five major services: intelligent creation, intelligent review, creative design, multimodal retrieval, and intelligent dialogue. According to reports, in February 2024 the Communication Large Model passed the launch filing for generative artificial intelligence (large language model) services, becoming the first media-vertical large model developed by a media technology company to pass filing. In August, Communication Brain's content generation algorithm passed algorithm filing with the Cyberspace Administration of China.

Communication Brain

Zhang Jian said the Communication Large Model has three major features: "First, it can be trained on high-quality news data, which effectively reduces fabricated content in large-model output and keeps content authentic, safe, and controllable. Second, it can connect to private media asset libraries and use large models to retrieve copyrighted content quickly and accurately, so that the copyright status of content materials is reliable. Third, it can build the most professional large-model application supermarket in the media industry, organized by user role, business scenario, and content modality, so that it is simple and convenient to use."

Building on these three features, the Communication Large Model has developed four distinctive advantages: "professional media knowledge base", "full business scenario coverage", "full creative process access" and "multimodal content support". Together they help the media accelerate the last mile of large-model application.

At present, the Communication Large Model focuses on two application scenarios: media clients and content production platforms. In the media client scenario, it provides personalized, intelligent voice interaction and comment services. In the content production scenario, its capabilities are richer, covering news writing, video generation, multimodal retrieval, AI poster generation, image material generation, intelligent content review, and other functions.

The development trend of large models in the media industry

As large-model technology continues to advance and integrate with the media industry, Zhang Jian analyzed where large models in the industry are heading. He believes that using large models as the core driver of media business development is the industry's future direction. The media industry needs to center on users, start from the demands of business scenarios, and use large-model technology to transform its digital technology services.

Communication Brain's development goal is to cover the last mile of AI application. It will build the next generation of media application supermarket, organized by user role, business scenario, and content modality, and will ensure that each independent application scenario is tightly coupled with production-side business and seamlessly integrated into the overall media workflow, optimizing it and thereby improving the efficiency and quality of media work.

As large-model technology matures and its application scenarios expand, the media industry will enter a new era of more efficient and intelligent development. Communication Brain is committed to being an important promoter and enabler of that process.

"Please indicate the source when reprinting"
