Midjourney is one of the most popular AI image generators. It rose to prominence in 2022 with high-quality outputs that seemed to surpass those of other AI image generators like Stable Diffusion and DALL-E 2. This has led many to wonder – does Midjourney actually run on Stable Diffusion behind the scenes? Or does it use its own proprietary model?
The truth is, we don’t know for sure since Midjourney is a closed, proprietary system. But there are some clues that point to Midjourney having some connection to Stable Diffusion, though likely in the form of an adapted proprietary model.
The Midjourney Beta
In mid-2022, Midjourney ran a short beta test that was confirmed to be using Stable Diffusion. Users reported seeing hallmark Stable Diffusion artifacts like double jaws in some outputs.
However, this beta only lasted a couple of days before Midjourney moved on to other versions. As of Midjourney v4, the rumor is that the Stable Diffusion model is no longer used. So while we know Midjourney experimented with Stable Diffusion, that doesn’t mean it remained in the Midjourney product we see today.
Both Midjourney and Stable Diffusion are examples of “diffusion models” – a type of deep learning generative model. So it’s likely they share some architectural similarities, even if Midjourney has customized and improved on the open-source Stable Diffusion release.
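To make "diffusion model" a little more concrete, here is a minimal sketch of the forward (noising) half of the process, following the standard formulation from the diffusion literature. A real model is trained to reverse these steps, and Stable Diffusion applies the idea to compressed latents rather than raw pixels. Everything here is illustrative: the function names, the linear noise schedule, and the 64-value list standing in for an image are assumptions, and none of it is taken from Midjourney, whose internals are not public.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances that grow linearly (an assumed, commonly used schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_signal(betas):
    """alpha_bar[t]: running product of (1 - beta), the share of signal variance left at step t."""
    alpha_bar, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bar.append(running)
    return alpha_bar

def forward_diffuse(x0, t, alpha_bar, rng):
    """Sample x_t given x_0: shrink the clean values and add Gaussian noise."""
    signal = math.sqrt(alpha_bar[t])
    sigma = math.sqrt(1.0 - alpha_bar[t])
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in x0]

rng = random.Random(0)
alpha_bar = cumulative_signal(linear_beta_schedule(1000))

# Stand-in for a tiny image: 64 pixel values in [-1, 1].
x0 = [rng.uniform(-1.0, 1.0) for _ in range(64)]
x_early = forward_diffuse(x0, 10, alpha_bar, rng)   # still close to the original
x_late = forward_diffuse(x0, 999, alpha_bar, rng)   # almost pure noise

print(f"signal kept at step 10:  {alpha_bar[10]:.4f}")
print(f"signal kept at step 999: {alpha_bar[999]:.6f}")
```

Generation runs this in reverse: start from noise and repeatedly remove the noise a trained network predicts, guided by the text prompt. Two systems can share that recipe while differing in network size, training data, and fine-tuning, which is why a shared architecture family says little about whether they share a model.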
Midjourney outputs tend to look cleaner and more artistic than typical Stable Diffusion images. This is likely due in part to Midjourney’s proprietary prompt engineering system.
The Midjourney interface appears to take a simple user prompt and pass it through natural language processing algorithms that expand and refine it before feeding it into the image generator. This would let it turn a short phrase into a detailed prompt optimized for the model. It would also explain why people tend to find that writing prompts for Stable Diffusion is much more complex than writing them for Midjourney.
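As a rough illustration of what prompt expansion could look like, the sketch below appends a few style modifiers to a short prompt unless the user already wrote them. This is a purely hypothetical toy: the `expand_prompt` function and the modifier list are invented for this example, and nothing here reflects how Midjourney actually processes prompts, which has not been published.

```python
# Hypothetical style modifiers, invented for illustration; not from Midjourney.
DEFAULT_MODIFIERS = ["highly detailed", "dramatic lighting", "cohesive color palette"]

def expand_prompt(user_prompt, modifiers=DEFAULT_MODIFIERS):
    """Return the prompt with any modifiers the user has not already included."""
    base = user_prompt.strip().rstrip(".,")
    lowered = base.lower()
    extras = [m for m in modifiers if m.lower() not in lowered]
    return ", ".join([base] + extras)

expanded = expand_prompt("a lighthouse at dusk")
print(expanded)
# a lighthouse at dusk, highly detailed, dramatic lighting, cohesive color palette
```

With a stock Stable Diffusion setup, the user typically writes those extra descriptors by hand, which is one plausible reason its prompts end up longer.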
While Stable Diffusion was trained on a broad dataset scraped from the internet, Midjourney seems to have curated a more niche, aesthetic-focused dataset. This artistic data, along with further fine-tuning, gives Midjourney outputs a signature polished look.
With the recent release of SDXL, we can expect to see updated Midjourney-style models built on Stable Diffusion, which should be able to emulate Midjourney outputs much more accurately.
Given the vast resources of Midjourney and the size of their training data, it’s likely they have continued to evolve and adapt a version of Stable Diffusion into a larger, more capable custom model. With access to top-tier GPUs and datasets, they could have scaled up network size, added external knowledge sources, and done proprietary fine-tuning well beyond the public Stable Diffusion version.
Ultimately, it’s impossible to know definitively whether Midjourney uses Stable Diffusion, since it is proprietary, closed-source software. Without access to the code and models, all we can do is speculate based on output similarities and statements from the Midjourney team.
So while Stable Diffusion may have provided an initial springboard, the current Midjourney architecture is probably several generations removed from public Stable Diffusion models available today.