Remixing "Black Myth: Wukong" and beating Midjourney: this AI image generator is addictive

2024-08-23

While AI text-to-image tools compete on realism and artistry, Ideogram has staked out a trickier niche: accurately rendering text inside images, with attractive fonts and layout.

This is not an unusual need. Generating posters and illustrations with one click, without Photoshop, saves a lot of trouble and is ideal for ordinary people who know nothing about design.

We previously wrote about version 1.0 of Ideogram. On August 21, version 2.0 was released, with better realism, more design-oriented posters, and stronger text rendering.

You may never have heard of it. It is an AI product built by former Google employees. It has plenty of shortcomings, but in its strongest suit it can "overtake" Midjourney.

AI wants to know which Wukong you are talking about

Ideogram has a particularly beginner-friendly feature: "Magic Prompt".

You can enter a prompt directly in Chinese, and it will translate it into English and optimize it at the same time. For an overseas product, a touch like this wins people over.

Ideogram also focuses on five styles: General, Realistic, Design, 3D, and Anime. They are all easy to understand, so choosing one is never agonizing.

Let's start with a simple Chinese prompt, "Sun Wukong holding the golden cudgel", in the Anime style, and let the AI translate and optimize it freely to see what it comes up with.

When a fresh-faced Dragon Ball version of Goku popped out, I was startled. Then I looked at the prompt: the AI had translated Sun Wukong as "Son Goku", so it was no surprise after all.

I would also like to ask Ideogram whether this is a bit too blatant, and whether they have paid any licensing fees.

To keep the AI from misunderstanding again, I did not cut corners on the next prompt and stressed that "Sun Wukong" means "Sun Wukong", not "Son Goku".

This time I used the Realistic style and specified a fairly detailed scene: the Monkey King wearing armor, holding the golden cudgel, with a solemn and majestic expression, standing in front of a Buddhist cave. The picture has a warm orange tone, and "Black Myth: Wukong" is written at the bottom.

The text has no mistakes, the capital letters have strong visual impact, and the atmosphere of the Buddhist cave comes through to some extent. But the "Monkey King" lacks presence: he looks a bit too primitive, and there is no light in his eyes.

I ran the same prompt through Midjourney once. The text contained errors and lacked any sense of design, but a slightly more handsome "Monkey King" and a browser-game look made up for it.

Generated by Midjourney

Unwilling to give up, I tried the 3D style again. The prompt remained basically the same, except that the text at the bottom was changed to "Game launch on August 20th."

The result from Ideogram looked a lot like a promotional image for some Chinese-style chibi blind-box series. The image is very clean, but it is nothing like the 3D game style I had in mind. The Monkey King even ends up looking like Erlang Shen.

This is also where the AI gave itself away: it is adept at rendering English text but knows nothing about Chinese. The flaw has carried over from 1.0 to 2.0.

It seems that overseas products still do not understand traditional Chinese culture well enough. Ideogram's performance in the first round was somewhat disappointing, but still interesting.

The Ideogram team says version 2.0 holds its own against Flux and DALL·E. Recently, TED talk photos generated with a realism LoRA for Flux fooled many netizens because they were so hard to tell from real ones. So let's test how photo-like Ideogram's results are.

Generated by Flux

After choosing the Realistic style, I entered a prompt in Chinese: a photo of a TED talk, with a slide titled "Ideogram 2.0 Released" listing three key points, "Accurate text", "Good at design" and "More realistic", and a female speaker standing in front of a whiteboard with several people in the background.

Ideogram clearly has good semantic understanding, and all the required elements are present. The TED logo is almost indistinguishable from the real thing. The expressions of the speaker and the audience are vivid, and the hair and skin look relatively natural.

However, the details are not handled well enough. The requested text is fine, but some randomly appearing small characters spoil the overall effect, and the figures' fingers and bodies are not quite right either. Still, it is much better than version 1.0.

As for poster design, it is fair to say that this is a "comfort zone" where Ideogram surpasses other text-to-image AIs.

If we take the extremely popular "Alien: Reaper" as an exam question, can AI design that indescribable feeling of horror?

I chose the Design style, described the elements of the picture in the prompt, and specified a line of text at the bottom of the poster: "Minors should watch with caution."

The overall effect is eye-catching, and the long string of text is generated successfully with only one minor error. However, it is far from realistic and looks more like an American comic, which does not quite suit a live-action film.

Taking this summer's flop "A Dream of Red Mansions" as inspiration, I asked Ideogram to generate a poster. The background, ornaments and even the characters described in the prompt were all included. Once again, I was struck by how well it follows prompts.

The title is of course written correctly, but the font seems borrowed from The Lord of the Rings, which feels a bit out of place. The overall style is closer to the animated film Mulan.

Ideogram's "design style" leans toward a distinctive 2D, anime-like look, which in turn limits where the posters can be used.

To sum up, Ideogram is an AI image product with its own distinctive character. Its realism is on par with Flux, and its artistic sense differs from Midjourney's.

"Rainy Summer" pattern

But its text generation is in a class of its own, which makes it better suited to posters, illustrations, advertisements, memes, T-shirt prints, and the like.

Human evaluation results show that Ideogram 2.0 outperforms Flux Pro and DALL·E 3 in prompt alignment, overall preference, and text rendering quality.

Then again, this is Ideogram's own claim

Highly playable and down-to-earth: we could use more AI "desserts" like this

Ideogram was first announced on August 22 last year, so the release of 2.0 comes almost exactly one year later.

The founding team consists of seven people from Google Brain, University of California, Berkeley, Carnegie Mellon University and University of Toronto, four of whom are authors of the Google Imagen research paper.

In addition to releasing 2.0, Ideogram also launched an iOS app, which can be downloaded directly in China. The Android version is planned to be released later. From web pages to mobile devices, we can generate pictures anytime and anywhere.

Mobile phone interface

Ideogram is currently free for all users, but the quota is very limited. After five generations, 20 images in total, Ideogram told me that my 10 credits had been used up and asked me to come back tomorrow. (Of course, the 25 free images Midjourney offers next door do not look all that generous either.)

If you have had little exposure to text-to-image generation and want a beginner-friendly text-to-image AI, Ideogram is a good choice.

Entering prompts in Chinese and using "Magic Prompt" to translate and optimize them is one aspect. Ideogram also has many options that help you generate images closer to what you have in mind.

Offering a limited set of options for users to "click" makes for simpler interaction than having them "type" into a blank input box. Ideogram lets you pick the aspect ratio, style, and color palette you want.

"Girl with a Pearl Earring Eating McDonald's" in different color palettes

If you do not know how to write a prompt, you can sketch instead and let Ideogram turn something crude into something magical.

Forgive my poor drawing skills, but the AI grasps the intent, refines the lines and colors, and adds a background, instantly lifting the quality. With AI, everyone can be Ma Liang with his magic brush.

In addition, below the input box on the web version are works generated by other users. If we come across something we like, we can view the prompt and use it as a reference. Ideogram says its users have generated more than 1 billion publicly visible images in the past year.

If you want to generate a specific object but do not know how to write the prompt, Ideogram has also launched the ability to search its public library of creations by text, though this feature currently requires a membership.

Search results for "cat"

All in all, Ideogram is a highly playable text-to-image product.

It generates the text users ask for relatively accurately, adapts to images in a variety of styles, and has a wide range of applications.

Image source: Ideogram blog

Now and then it can also deliver emotional value, letting pictures speak your mind, although the memes it makes lean too heavily toward the aesthetics of the Western internet.

"I want to play Black Myth: Wukong" emoticon pack

Ideogram's overall quality is solid: its text rendering is powerful, it is friendly to newcomers and easy to use, and the interaction is pleasant. When an AI tool combines creativity, convenience, and shareability, it is easy to get hooked.

A world where everything is cast from the same mold would be too dull. There is real appeal in spotting one small need and then becoming the best in the industry at meeting it.

There are many products in the world and even more audiences, so we can expect more AI "desserts" like this.