Midjourney v6 Adds Text and Delivers More Realistic Results

Midjourney v6 released

Midjourney v6 is now available, and the upgraded model promises improved realism, the ability to create convincing text on an image, more specificity in prompts, and more.

The latest iteration of Midjourney is available on Discord and is currently in “alpha.” However, early testing shows a dramatic improvement across the board, save for speed, which has noticeably slowed.

As users begin testing Midjourney v6, they are discovering new changes to the popular AI image generation platform. On Reddit’s r/singularity subreddit, a community heavily focused on AI technology, users are compiling a list of changes they have found so far, including longer prompts, the ability to specify colors and other details in an image, composition controlled using natural language, adding text, improved understanding of grammatical nuances, the ability to add frames and borders to pictures through descriptive text, and more.

“A fluffy puppy playing in the snow on a moonlit night in the mountains” — Midjourney v5.2
“A fluffy puppy playing in the snow on a moonlit night in the mountains” — Midjourney v6

Tom’s Guide has been working with Midjourney v6 as well, determining that users can now interact with Midjourney more like ChatGPT and fine-tune their images through conversation. And much to the joy of punctuation fans everywhere, Midjourney can reportedly now understand the famous “Eats, shoots, and leaves” conundrum.

As for the results, Midjourney v6, even in its alpha state, is undoubtedly better than the previous version, v5.2. One of the improvements Tom’s Guide noted is that Midjourney is better at generating images of real people. Whether that is good is beside the point for now, but the claim seems accurate based on PetaPixel’s limited testing.

Consider the images of “Albert Einstein doing a science experiment” below. The v5.2 images seen first are a mixed bag, especially concerning Einstein’s iconic hair.

“Albert Einstein doing a science experiment” — Midjourney v5.2

As for v6, Einstein looks more like Einstein, especially relative to many of the images of the famed scientist that people have seen. His hair is more nuanced, his skin looks more natural, and the lighting in each image is significantly more realistic. It is also noteworthy that the scene as a whole seems much more grounded, rather than the sort of cartoonish “science experiments” generated by Midjourney v5.2.

“Albert Einstein doing a science experiment” — Midjourney v6

Keeping the science theme, how does Midjourney generate a “portrait of Marie Curie in her lab”?

“Portrait of Marie Curie in her lab” — Midjourney v5.2
“Portrait of Marie Curie in her lab” — Midjourney v6

The v6 results are spectacular. There is an immense amount of detail in each image and, yet again, the lighting is excellent. The v5.2 results are not necessarily bad, but they are all very stylized, and none really showcases a realistic-looking Curie or scientific lab setting. The relative complexity of the scene in v6 is really impressive.

An improvement in realism is seen in other prompts. Consider “a child looking at an insect through a magnifying glass.” Both versions look good and struggle in similar ways. A significant issue is that Midjourney, whether v5.2 or v6, seems to lack an understanding of how people use magnifying glasses.

“Child looking at an insect through a magnifying glass” — Midjourney v5.2
“Child looking at an insect through a magnifying glass” — Midjourney v6

Adding text is a big draw in v6. The results are so-so, but a dramatic improvement over v5.2, which is inept with text.

“A movie poster for a film about photography called ‘Photography.’” — Midjourney v5.2
“A movie poster for a film about photography called ‘Photography.’” — Midjourney v6

Generative AI has a lot of general issues that it must overcome, not the least of which are remarkable biases. Tasked with generating an image of “a woman at work,” Midjourney v5.2 decided to generate only thin, young white women, and two of them are working at a sewing machine. They all have messy workspaces, too, which is just sort of quirky.

“A woman at work” — Midjourney v5.2

Will Midjourney v6 branch out and create a woman who is not cut from the same cloth? Not really, although the women do look more realistic. Of course, with improved prompting, the user can fine-tune their results, but, interestingly, the default is always practically the same.

“A woman at work” — Midjourney v6

This is not precisely Midjourney’s fault, of course. The image generator, like all others, has been trained on existing images. If biases are in the training set, those will proliferate through to the final generator. Further, if these biases are not carefully corrected at various points in the process and make it to the final public release of a platform, they will be present in the many images created by users. And where do some of those images end up? Back in the training set, of course.

So far, Midjourney v6 looks to be another significant step up from Midjourney v5.2, which was already impressive in many situations. While there is still ample room for improvement, especially with the new text functionality, the popular AI image generator continues to improve rapidly.


Disclosure: All images in this article have been generated using Midjourney, a generative AI platform.
