
Suno feels the pressure as Udio updates to v1.5: it is serious about making music

2024-07-27


Suno must be feeling the pressure: Udio has released a new version, v1.5, in what amounts to a head-on challenge.
Last night Udio released a major update that includes many new features and performance enhancements.
Udio launched in April this year, backed by a16z. From the moment it launched it was called Suno's biggest competitor, and it has also won recognition from professional musicians.
Generative music has many applications, and Udio is the "music specialty student" of the field: it is particularly good at understanding and handling different styles and musical genres.
A single prompt produces two finished tracks. How you write the prompt is up to you, but given Udio's specialty, the more specifically you describe the musical style you want, the better the result will be.

But do not stress too much. Many of the works featured on the homepage turn out to have alarmingly simple prompts. For example, the prompt for this symphonic piece is just “Beethoven Symphony”.

The default length is 30 seconds, and a track can be extended by adding an intro, an outro, or a connecting section in the middle, which also makes it friendly to music novices.

Let's start with one of the highlights of this update: multi-language support. In the official demonstration, a Mandarin demo was given.
The lyrics are a bit odd 😂 Many generative music demos like to use the theme of "human-machine emotion", a strange kind of obsession.
However, the arrangement is very complete and the Mandarin singing is very smooth, without the awkward feeling of "foreigners singing in Chinese".
Udio's handling of vocals is remarkable. In the official comparison between v1 and v1.5, the "AI flavor" is audibly reduced to an unprecedented degree (although you can also hear that a lot of reverb is used to cover it up).
The improvement in sound quality is the most impressive part of the update: 48 kHz stereo tracks, with bass that is particularly prominent and solid, and quality worthy of hi-fi headphones.
Better sound quality, in turn, makes it possible to give the music more layers.
Suno can also achieve good stereo sound, for example in music with complex orchestral arrangements.
However, this Udio update produces richer and clearer layers: the positions of different instruments are easier to distinguish, while the ensemble stays harmonious, without clashes or muddiness.
The new version also introduces key (tonality) control, which improves controllability for users with professional music knowledge. It also supports audio-to-audio generation (a paid feature), which works on the same principle as image-to-image generation.
You supply a piece of music as the base and let the model generate from it. Whether you are a professional (but poor) musician or an ordinary user, you can have Udio help you arrange music.
Video from Udio user @maxbarzel
For ordinary users, Udio's generations have one notable characteristic.
When the user is not familiar with a musical genre and cannot describe it clearly in the prompt, Udio often defaults to a very "Disney" style.
For example, the instrumentation is mainly orchestral, the singing resembles that of a musical, and, more importantly, it uses a style of key change that is especially common in Disney film music.
This piece of music could be used as the heroine's solo in "Cinderella" or "Snow White" without feeling out of place.
All I can say is that Disney is, after all, a pop culture giant. Its style is a safe choice and can be regarded as the "greatest common denominator" of users' musical taste.
But Disney's legal department is also very strong. Udio and Suno have previously been sued by the three major music labels, which allege that their artists' works were collected and used as training data for the models, and that this constitutes infringement.
If Udio doesn't want to face another lawsuit, it should be more careful.
Udio's performance in jazz is fairly unremarkable. Jazz is characterized by dynamic, shifting rhythms, and a live performance involves a lot of on-the-spot changes and interpretation. So it is understandable that jazz is hard for a model to learn.
In other pop genres with more distinct rhythms, there are basically no major problems.
Udio really does see itself as a music app. Whereas the homepage of Suno's official website presents songs by keyword and popularity, Udio organizes them by genre and style.

What, do you really want to make music?
Udio has not disclosed technical details, but generating music with large language models has never been easy.
Music is hard to describe in words and carries a great deal of information: even a single second integrates every beat, note, vocal, and harmony.
When generating long sequences of sound, AI models have difficulty maintaining musical continuity across phrases, lyrics, or extended passages. In addition, because music includes both vocals and instruments, it is much more difficult to generate than speech.
What is presented to users, however, must be simple and direct: users should only need natural language, rather than each having to master professional music theory.
Udio's CEO David Ding and co-founder Charlie Nash both worked at DeepMind and took part in developing the music model Lyria, which was released in November last year and was called the most complex music model in the world at the time.


Later, David Ding brought his colleagues together to found a startup, and that is how Udio was born.
However, as I said before, although Udio is very capable, it has not yet reached the point where it can replace a real person.
For example, the interval relationships and melodic direction are still very awkward. This is related to the fact that large models do not have the ability to truly "understand".
The same goes for the vocals: listen to a few more songs and you will notice that there is basically no real "singing" performance. The agile register shifts, vibrato, and breathy tone of professional singers are still hard for it to achieve.
In the field of generative music, Udio puts more emphasis on the word "music"; generation is just its tool.
While writing this review, I left it on autoplay the whole time. When I needed to pause, I instinctively switched to NetEase Cloud Music to press pause, and only then realized that it was actually Udio playing the music.
This reminds me of a possible usage scenario: background music during daily work and housework, where you just need something to "listen to for fun".
At the current quality, it is entirely feasible to use Udio's playlists in place of existing daily recommendations and shuffle play. However, it is hard for any single song to impress me enough to give it a red heart.
What is really exciting is the prospect that random recommendations tied to "traffic" will gradually fade. If musicians can break out of the vicious circle of being held hostage by clicks and play counts, and return to the original purpose of using music for expression, that would be the real contribution of generative music.

Text | Selina