
Crowned the strongest the moment it launched, but this image AI seems to be just so-so.

2024-08-14


Do you still remember Stable Diffusion, the image generation AI that was once as famous as DALL·E and Midjourney?

After its boss walked out and core developers left, Stability AI, which once shook up the image AI market, is now in internal chaos.

In recent months, reports that it has run out of funding and is looking for a buyer have never stopped.


While their former company was in dire straits and trying to save itself, the group that left at the beginning of the year, the original Stable Diffusion team, founded a new company: Black Forest Labs.

At the beginning of this month, when they announced the new company, they also released FLUX.1, a text-to-image model that comes in three sizes.

There is the "large cup," pro, which focuses on image quality; the "medium cup," dev, which balances speed and image quality; and the "small cup," schnell, billed as the "speed whirlwind."

According to their official website, the large and medium cups of FLUX are now the most powerful of all image AIs.


Broken down by individual capability, such as visual quality, size variability, and output diversity, they are also said to be much stronger than other models.


It isn't only the company saying so: many netizens and media outlets also say the newly released FLUX can already punch Midjourney and kick DALL·E.


Looking at the comments online, Shichao's interest was piqued. Is FLUX really as good as everyone says? This time we put Midjourney and the large cup of FLUX side by side and tested them.

Let's start with a routine test to warm up and ask each of them to draw a Chinese ink painting.

Both results were pretty good and included everything in the prompt, such as the fisherman, the mountains, and the reeds. However, the sun Midjourney drew was a bit too big and didn't feel like a sunset.

Prompt:

Chinese ink painting style, a lone fisherman in a traditional wooden boat, drifting gently on a tranquil lake at sunset, Chinese ink painting style, warm blue tones reflecting the calm water, soft brushstrokes capturing the tranquility of the evening, mountains in the distance silhouetted in the fading light, traditional huts on the shore, reeds swaying in the breeze, 8K resolution, cinematic feel, nostalgic and peaceful atmosphere

FLUX (left), Midjourney (right)


Readers who follow image AI will know that poor text generation is a problem for almost every model. DALL·E has worked on this weakness before, but it still makes occasional mistakes.

This time, FLUX is said to have all but perfected this ability, so I picked a few prompts that require generating text and gave them to both it and Midjourney.

First, let them each generate a bag with the Prada logo. The answers they gave in the end were pretty good, and there were no errors in the text.

As for the overall look, each has its own merits: FLUX not only got the text right but even drew Prada's inverted-triangle logo, while Midjourney's image looks more fashionable.

Prompt: a large white "Prada" handbag, small figures made of ice, surrounded by snow, styled like a fashion ad, inspired by Prouce magazine ads, high resolution photography, ad-inspired print design style

FLUX (left), Midjourney (right)


Next, we raised the difficulty: we asked them to design a retro graphic for a T-shirt and include two pieces of English text.

Neither of them made any big mistakes this time, but in terms of the overall effect, Shichao personally thinks that Midjourney's is better.

Prompt: The retro-style t-shirt design features a vintage drag racer with a checkered flag and the words "Lagertha" and "Semper Fi" against a monochrome background. Lagertha, who is holding the flag, has tattoos on her body. The style of the artwork captures her action pose, showing the power of speed and Viking strength. It's a high-contrast illustration that highlights their sporty attire and the bold typography.

FLUX (left), Midjourney (right)


When judging an image model's capabilities, there is no getting around the classic problem of drawing hands.

Midjourney is still a little unstable here. The hands it generates are sometimes good and sometimes bad. For example, in the picture on the right, the "yeah" gesture inexplicably has an extra little finger.

Both images were generated by Midjourney


To be honest, FLUX's results are a pleasant surprise: whether in clipart style or realistic style, the hands have almost no flaws.

Both images are generated by FLUX


So far, FLUX has done a good job handling some image details and minor issues.

Of course, to a certain extent, image AI is also a tool for helping everyone realize their imagination, so Shichao also threw in some more imaginative prompts.

Prompt: A young girl in a red dress, sitting next to a giant dragon with huge teeth and eyes. She is facing it, as if they are friends or good cops. This scene takes place in the snowy rocks in the mountains. Shot in the style of James Cameron's The Secret Life of Wolves, a 70s movie.

FLUX (left), Midjourney (right)


Emmm... Shichao doesn't need to spell out which one is better. FLUX's image is recognizably AI at a glance, while Midjourney's really does have a bit of a live-action special-effects feel.

Afterwards, Shichao gave FLUX a simpler prompt: "The destruction of modern civilization" to see how its imagination worked.

This time, both it and Midjourney stumbled.

Judging on the image alone, Midjourney's is better and really captures an epic feel, but however you look at the building in it, it has nothing to do with modern civilization...

FLUX (left), Midjourney (right)


Interestingly, FLUX is quite good at generating exaggerated caricatures of celebrity portraits. For example, when it generates portraits of Musk and Jobs, the facial features are captured quite accurately.

Both images are generated by FLUX


After trying it all out, Shichao feels that FLUX can't really be called the strongest yet, but it isn't far off.

After all, it was made by the original team of Stable Diffusion, and is almost in the same league as Midjourney.

Moreover, when the new company Black Forest launched FLUX at the beginning of the month, it also officially announced its funding progress: $31 million in financing.

More importantly, although everyone at Black Forest has left Stability AI, it still inherits its traditional virtue of open source, and both the medium and small cups of FLUX are open source.

And this is not the end. Launching an image AI seems to be just one step in their plans. On the official website, they also laid out what comes next: building a SOTA video AI.


But then again, the commercialization of image AI is a topic that has been discussed to death.

Black Forest's former company, Stability AI, ended up in a mess because of commercialization problems. And Black Forest's own current mix of open-source and paid models is basically the same approach Stability AI took.

We can only wait and see whether it makes any new moves on commercialization; after all, it has only just launched.

Let's hope it doesn't repeat Stability AI's old path…

Written by: squirrel

Editor: Jiang Jiang

Art: Xuanxuan

Image sources: FLUX, Midjourney