Dall-E 2 does a really great job at photorealism, but I’ve also found it’s great for editing images. You can import a PNG, erase a section you don’t like, and Dall-E 2 will fill in the erased area. At 10 cents per 4 generations, it’s totally worth the cost of credits for this feature alone. It also handles objects in perspective a lot better than other text-to-image generators, and it seems to understand scale when placing objects at various distances.
I started off with Disco Diffusion, and it’s really fun, but generations take a lot longer. On the other hand, you get custom control over dimensions and you can queue up runs that go for hours, which Dall-E 2 can’t do. It’s also much more DIY in nature, in that you’ve got access to a lot of variables you can tweak and custom-trained models you can import.
In between these in terms of ease-of-use versus control is MidJourney, which produces more painterly, concept-art-style results than Dall-E 2, and out of the box does a better job with people, faces, hands, etc. Its ability to mimic different drawing and painting styles exceeds Dall-E 2’s imho.
I’m excited to try Stable Diffusion once I get some spare time. From what I’ve seen, it’s outperforming everything else that’s out there, and it’s open source. The price point is a quarter of Dall-E 2’s, while offering the customization of Disco Diffusion, but with sliders and a nice UI.
I’ve already used Dall-E 2 for images in my IntuiFace experiences. In one case I needed an image to be twice as wide to fit a certain area, so I left half of a PNG photograph empty and let Dall-E 2 fill it in. As the guy from Two Minute Papers says, “What a time to be alive.”
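
If you want to prep that kind of half-empty PNG without opening an image editor, here is a minimal sketch of how it could be done. It assumes Python with the Pillow library installed, and `widen_canvas` is just a name I made up for illustration. It also assumes the tool you upload to treats fully transparent pixels as the area to fill, and that it accepts the resulting dimensions, so check both before relying on it.

```python
from PIL import Image


def widen_canvas(img, factor=2):
    """Return an RGBA canvas `factor` times as wide as `img`.

    The original image sits on the left; the rest of the canvas is
    fully transparent (alpha 0), i.e. the "erased" area to be filled.
    Hypothetical helper, not part of any image generator's tooling.
    """
    src = img.convert("RGBA")  # make sure there is an alpha channel
    canvas = Image.new("RGBA", (src.width * factor, src.height), (0, 0, 0, 0))
    canvas.paste(src, (0, 0))  # original pixels go in the left portion
    return canvas


if __name__ == "__main__":
    # Stand-in for a real photo; in practice you would use Image.open(...)
    photo = Image.new("RGB", (512, 512), (200, 120, 40))
    wide = widen_canvas(photo)
    print(wide.size, wide.mode)
```

From there you would save the result as a PNG (so the transparency survives) and upload it for the fill step.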