
Audio large model unveiled at 2024 Yunqi Conference

2024-09-22


On September 19, the 2024 Yunqi Conference opened in Yunqi Town, Hangzhou, under the theme "Cloud-Driven Intelligence, Industrial Transformation". Himalaya's Everest AI audio multimodal large model was unveiled in the "Artificial Intelligence +" theme pavilion, drawing many visitors.
As of the end of 2023, Himalaya had accumulated 488 million audio tracks across 459 categories, with a total content length of more than 3.6 billion minutes. This large and diverse body of online audio content allows the platform to keep improving its AI capabilities. Since its founding, the platform has placed great importance on investing in AI. The "Everest AI audio multimodal large model" unveiled at the conference is its independently developed large model for AI audio generation, trained with deep learning on more than one million hours of the platform's own copyrighted audio data. Its capabilities include emotional output, natural expression, language translation, and ultra-fast voice cloning. According to the company, the model has achieved breakthroughs on several fronts in audio generation and is being widely used in audiobooks and other fields.
In the "Everest AI Digital Human Platform" interactive experience area, participants can try out the changes AI has brought to sound creation. They can sample 535 AI sound libraries spanning multiple fields and categories to generate AIGC audio content of every category, create a customized digital human image of a real person in 15 seconds, and clone a voice in 10 seconds, seeing for themselves how AI helps content creators work more efficiently and conveniently.
Data shows that in 2023, Himalaya's average monthly active users across all scenarios reached 303 million. As of December 2023, the platform's AIGC content totaled 240 million minutes, accounting for 6.6% of its audio content. Over the same period, the AIGC penetration rate among average monthly active users on mobile devices reached 14.8%. The Himalaya audio large model is described as having the advantages of an "integrated production and modeling ecosystem" and a "continuously evolving ecological flywheel". It has been widely used in content creation, digital avatars, voice interaction and other scenarios, and has been commercialized. Going forward, the company says it will further open up the imaginative possibilities of sound and continue to use sound to serve a better life.
Author: Fu Xinxin
Photos: provided by the interviewee. Editor: Shen Zhushi. Responsible editor: Fan Bing.
Please indicate the source when reprinting this article.