Is Midjourney's throne shaky? Another dark horse emerges in AI image generation: a hands-on test of 4 products

2024-08-25

The AI community is once again locked in a fierce race over image generation.

A flurry of developments:

On August 21, Ideogram officially launched version 2.0, claiming better text rendering.

That's right, this is the startup founded in August last year by four big names in Google's AI image generation research who resigned together to go into business for themselves. It has received investment from a number of AI heavyweights.

This time Ideogram also openly challenged Flux: the company confidently stated that it scores significantly better than Flux Pro in human evaluations.

Keep in mind that Flux was created by the original team behind Stable Diffusion, and has recently gone viral across the web for generating realistic TED talk "photos".

In addition, a week ago, Google officially released Imagen 3. In official evaluations, it is said to outperform a number of image models such as DALL-E 3, Midjourney v6, and Stable Diffusion 3.

Perhaps spurred on by all this (doge), Midjourney also broke with habit and, on August 22, opened a free web version to all users.

Now things get interesting!

Since everyone claims to be so strong, why not gather them all at one table for a head-to-head showdown?

Who is the strongest drawing AI?

First, let's invite our four contestants (all using the web version):

Contestant No. 1: Ideogram 2.0. It offers 10 free credits every day; 1 credit generates 4 images, so at most 40 images can be generated per day.

Contestant No. 2: Flux.1. Black Forest Labs officially provides a free demo in HuggingChat (select the FLUX.1 Schnell version).

Contestant No. 3: Imagen 3. It is free and can be used an unlimited number of times on ImageFX.

Contestant No. 4: Midjourney. During the free trial, you can generate only 25 images in total.

Now the competition officially begins.

Everyone stumbles on "Black Monkey"

First, to test whether these overseas AIs understand Chinese prompts, let's ride the popularity of the current top hit, "Black Monkey" (Black Myth: Wukong).

prompt: The game character is a monkey, wearing armor, a golden crown with phoenix feathers on his head, holding a golden hoop in his hand, standing on a cliff.

As expected, something went wrong...

I'm sure everyone's eye was caught by the big red cross on Imagen 3. That's right: given the same prompt, only Imagen 3 rejected the generation request.

Seeing this, my first reaction was that our prompt might have triggered copyright protection. So I deleted the words "game character" from the prompt, but generation still failed.

Could it be that Google Imagen 3 does not support Chinese? So I switched to a simpler prompt at random, and this time it produced an image.

But the result was way off, and even after trying several different Chinese prompts, it only produced unrelated texture images.

It seems Google Imagen 3 does not handle Chinese prompts well.

With No. 3 out, we looked at the others and found that No. 1, Ideogram 2.0, performed best.

No. 2 at least shows some traces of Chinese comic style, while No. 4, Midjourney, went completely off script~ (its output has nothing to do with the subject)

In the end, I have to praise Ideogram 2.0 for hitting all the key elements accurately.

Although it is not what I had in mind (I wanted Black Myth), it did follow the prompt faithfully.

Is it a real person or AI? I can't tell

Next, we enter every contestant's comfort zone: portrait generation.

Back then, Midjourney went viral with a photo of a couple on a rooftop; now, Flux is taking the internet by storm with a set of TED talk photos…

Who is better? The answer will be revealed soon.

prompt: A young man with auburn hair, wearing a checkered shirt in teal and cream, captured with a 50mm lens for a vintage look. Rich colors, sharp focus, and a touch of retro charm.

Looking at No. 2 and No. 4 first, Midjourney clearly wins!

In the details, No. 2, Flux.1, deviates slightly: its shirt has two extra colors, which stands out among the other teal plaid shirts.

In addition, we found a unique little feature in Imagen 3: it highlights the keywords in the prompt before generation begins.

Using the keywords it picked out, we can check how each contestant handled the key elements (teal-and-cream checkered shirt, 50mm lens, and so on).

Overall, the contestants performed well (except No. 2), with high fidelity to the prompt, and all the subjects looked toward the camera.

Moreover, if I had not generated these with AI myself, I would not be able to tell them apart from real people at a glance. (How embarrassing.)

Finally, I'll quietly add that No. 4, Midjourney, produced the best-looking portrait.

The hard part: showing text on images

Now that the AIs have successfully fooled everyone, it's time to give them something harder.

Adding text to images has always been a difficult problem and has become one of the benchmarks for judging the quality of AI image generation.

Without further ado, let's ask the contestants to make a beautiful sign. Please step into the role of the demanding client.

prompt: A horizontal brass sign reading ‘Festive Season’ in a stylish script, encircled by pine and holly on a dark wood backdrop, with a close-up focus on the golden lettering.

At a quick glance, don't they all look pretty good, and faithful to the prompt?

However, under the client's sharp eyes, No. 2 has nowhere to hide.

Look closely: No. 2, Flux.1, cut corners, and the word "Season" is missing the letter "S".

Apart from No. 2, though, the others are pretty good. It seems every company has put a lot of effort into text rendering.

From here it comes down to taste, so pick whichever you prefer. (Privately, I vote for Midjourney.)

By the way, No. 1 Ideogram's upgraded model specifically touts its "text rendering" capability, so it may be worth a try.

Following McDonald's example: let's make some AI ads

Recently, McDonald's had 11 AI-generated beauties promote its French fries, and the ad became a big hit.

In fact, the principle is quite simple. It is nothing more than using AI to generate pictures of different characters promoting French fries, and then stitching them together into a video.

Unexpectedly, the effect was amazing. On Twitter alone, the related video received nearly 10 million views.

Now that we've cracked the code to wealth, let's get to work. As Chinese people, we simply have to give AI-assisted farm promotion a go~

prompt: Against the backdrop of a cyberpunk-style metropolis, a girl is promoting organic agricultural products in her hands.

Well, Contestant No. 3 has "crashed" once again. This time it is truly puzzling: the prompt was not in Chinese, and it contained nothing obviously prohibited...

With No. 3 eliminated, Contestant No. 1, Ideogram 2.0, showed the widest variety of produce, including cabbage, tomatoes, and purple cabbage.

It is also the only one that used a text sign to promote the organic food, so you can see it is really trying~

In addition, if you look closely, only No. 1 tries hard to imitate a real person, while No. 2 and No. 4 both went entirely for an anime ("2D") style.

To be honest, measured against McDonald's ad style, this quick round of generation did not achieve the ideal effect. (I was hoping for something closer to real life.)

Fortunately, these AI tools are currently free to use, so there's no harm in trying a few more times. The key is still technique. 🐶

Don't leave just yet: there is actually a more reliable way to make money.

Use AI to easily produce studio-quality commercial posters. Wouldn't it be nice to save the cost of a photographer, a venue, and post-production?

prompt: A sleek lipstick tube gleams against a backdrop of sophistication, highlighting the rich pigment and smooth glide. Evoke luxury with sharp focus and a hint of shimmer.

Let me ask you: if you had to pick a lipstick for a woman in your life, which one would you choose? (Careful, a wrong answer here could cost you dearly.)

So, did anyone choose No. 4?

Although No. 4, Midjourney, looks very high-end, a black lipstick may be a bit niche. (Choose carefully.)

Beyond that, the best performer is Imagen 3: the velvet fabric underneath adds a sense of luxury and, most importantly, the lipstick's texture looks very real.

In comparison, both No. 1 and No. 2 look a bit fake and have a plastic feel.

Therefore, Contestant No. 3 takes this round overall.

In summary, the four contestants performed very well overall. With Chinese prompts, the dark horse Ideogram 2.0 performed best.

Who is Ideogram?

In February this year, Ideogram launched version 1.0. In just half a year, it evolved again and launched version 2.0.

In fact, Ideogram and Google are closely related.

The company was founded in August last year, and the first four members of its founding team are authors of the Google Imagen research paper.

CEO Mohammad Norouzi, co-first author of the paper, received the Google ML PhD Scholarship while studying for a PhD in Computer Science at the University of Toronto.

After graduation, he joined Google Brain and worked there for seven years, rising to the position of senior research scientist. His main research area was generative models.

In addition, he is also an original member of the Google Neural Machine Translation team and a co-author of Hinton's team's self-supervised contrastive learning framework SimCLR.

CTO William Chan (Chen Junle), co-author of the paper, studied at the University of Waterloo in Canada and at Carnegie Mellon University.

When he joined Google in 2012, he first worked on machine learning advertising projects, and later switched to Google Brain to do NLP research.

Jonathan Ho, Co-Founder, graduated with a Ph.D. from UC Berkeley, worked at OpenAI for a year, and then joined Google.

In addition to being a core contributor to the Imagen paper, he is the first author of "Denoising Diffusion Probabilistic Models", the paper that laid the foundation for denoising diffusion models. Pieter Abbeel, a co-author of that paper, is also an investor in Ideogram AI.

Chitwan Saharia, Co-founder, co-first author of the paper, graduated from the Indian Institute of Technology Bombay with a bachelor's degree. He joined Google in 2019, where he mainly led work on image-to-image diffusion models.

As for the other three members of the founding team, Shayaan Abdullah was a machine learning engineer at Twitter; he left the company in April last year and joined Ideogram AI.

Jacob Lu is a software engineer who worked at Amazon and other companies before joining Ideogram; Jenny Lei is a software engineering intern who interned at Google before joining Ideogram AI.

Clearly, Ideogram is built around a top-tier diffusion model research team, and it has attracted investors since its founding.

Ideogram's seed round, led by a16z and Index Ventures, raised US$16.5 million (approximately RMB 120 million at the time).

Among the individual investors are Andrej Karpathy, reinforcement learning expert Pieter Abbeel, and GitHub co-founder Tom Preston-Werner.

In addition, in February this year, multiple sources reported that Ideogram had closed a new round of financing.

It reportedly raised $80 million (about RMB 570 million) in Series A financing, led by Andreessen Horowitz, with other participating investors including Index Ventures, Redpoint Ventures, Pear VC and SV Angel.

With both money and technology, Ideogram is undoubtedly a dark horse in the field of AI image generation.

The race goes on, and on.