I’ve been playing with DALL·E 2, a new AI system that can create realistic images and art from natural-language descriptions (called prompts). This R&D project by OpenAI is quite amazing, and you can specify various art styles for your imaginative output.
I put it to the test to see how far I could push it to create a series of images related to a potential Intuiface business. Here are some fun examples with the corresponding prompts:
Isometric 3d illustration of an expo booth, with several displays, visitors using touch screens
I have been on the wait list for a couple of weeks. Hoping I get in soon. This looks like a fun service/technology to mess around with, especially since I am not a designer by any stretch of the imagination.
Dall-E 2 does a really great job at photorealism, but I’ve also found it’s great for editing images. You can import a PNG, erase a section you don’t like, and Dall-E 2 will fill in the erased area. At 10 cents per 4 generations, it’s totally worth the cost of credits for this feature alone. Compared with other text-to-image generators, it handles objects in perspective a lot better, and it seems to understand scale when placing objects at various distances.
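For anyone budgeting credits, here is a rough sketch of the arithmetic behind the "10 cents per 4 generations" figure above. The function names are made up for illustration, and it assumes credits are spent one whole prompt at a time (each prompt returning 4 images), which may not match current pricing.

```python
import math

# Figures taken from the post above; actual pricing may differ.
CENTS_PER_PROMPT = 10    # one prompt costs 10 cents
IMAGES_PER_PROMPT = 4    # each prompt returns 4 generations


def cost_per_image_cents():
    """Average cost of a single generated image, in cents."""
    return CENTS_PER_PROMPT / IMAGES_PER_PROMPT


def cost_for_images_cents(images_needed):
    """Cost in cents to get at least `images_needed` images.

    Assumption: you pay for whole prompts, so the count is rounded up
    to the next multiple of IMAGES_PER_PROMPT.
    """
    prompts = math.ceil(images_needed / IMAGES_PER_PROMPT)
    return prompts * CENTS_PER_PROMPT


if __name__ == "__main__":
    print(cost_per_image_cents())      # 2.5
    print(cost_for_images_cents(10))   # 30 (3 prompts, 12 images)
```

So a single fill-in attempt works out to about 2.5 cents per candidate image, and needing 10 candidates means paying for 3 prompts.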
I started off with Disco Diffusion, and it’s really fun, but it takes a lot longer to get generations. But you have custom control over dimensions and you can run queues that run for hours, which Dall-E 2 does not do. It’s also much more DIY in nature, in that you’ve got access to a lot of variables that you can tweak, and custom trained models you can import.
In between these in terms of ease-of-use versus control is MidJourney, which produces more painterly and concept art results than Dall-E 2, and out of the box does a better job with people, faces, hands, etc. Its ability to mimic different drawing and painting styles exceeds Dall-E 2’s, imho.
I’m excited to try Stable Diffusion once I get some spare time. From what I’ve seen, it’s outperforming everything else that’s out there, and it’s open source. The price point is a quarter of Dall-E 2’s, while offering the customization of Disco Diffusion, but with sliders and a nice UI.
I’ve already used Dall-E 2 for images in my IntuiFace experiences, where I needed an image to be twice as wide to work in a certain area, and I’ve let it fill in the other half of an empty space in a PNG photograph. As the guy from Two Minute Papers says, “What a time to be alive.”
As an example, here’s some concept art that I’m doing for a friend’s movie. These are the bedrooms of a couple of high schoolers who are main characters. Dall-E 2’s ability to handle perspective and lighting is quite impressive.
Midjourney V6 is a massive improvement over what was available in 2022 and can actually create the last example above fairly easily. Dall-E 3 has made significant improvements as well. It’s crazy how fast the technology is moving. Now text2video is in its infancy, and text2audio (music) is as well. None of it is as good as traditional methods, of course, but the quality of the output is increasing exponentially.