For some time now, I've also been sharing visualizations of my poems here, created with the help of artificial intelligence (AI). This is a new approach for me. My own graphics have always been more like a frame or backdrop to stage a poem, with the poem itself being part of the image. Now, however, the content itself is transformed into an image.
Readers are starting to ask: How does it work? The answer is surprisingly simple, and anyone can do it. All you need is an account with OpenAI, the company behind ChatGPT, currently the best-known AI, and DALL-E, the AI for creating images. A subscription costs $20 a month. As an author of novels, non-fiction, and online texts, I find this tool invaluable; there has never been a better text tool. However, you can also use both for free!
OpenAI has set up an interface (a GPT) that allows the two AIs to work together. All I really do is copy a poem into ChatGPT and ask it to visualize it. It creates what’s called a prompt — an instruction for the image AI. That’s all there is to it.
Bing offers access to DALL-E through Microsoft Designer, which requires a Microsoft account. You can edit the 100% JPG images it generates, and the watermark disappears after download, but Bing doesn't support interactive conversations the way ChatGPT does. Designer, integrated with Microsoft's Copilot and 365 suite, uses AI to generate and refine images based on textual prompts (up to 480 characters).
Image Formats
You can refine the image output by specifying the image format: portrait (1024x1792 pixels) or landscape (1792x1024 pixels). By default, DALL-E uses a square format, similar to what’s often seen on Instagram, which I also use here. You can influence the style, colors, and more. It’s even possible to make touch-ups now. The results come in the WebP format or JPG at Bing.
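For readers who like to keep the numbers straight, the formats above can be written down as a small lookup. This is only a minimal sketch in Python, not something the chat workflow described here requires. The names `IMAGE_SIZES`, `image_size`, and `aspect_ratio` are made up for illustration, and the 1024x1024 square size is my assumption for the default; the portrait and landscape sizes are the ones quoted above.

```python
# Hypothetical lookup: maps a format name to a "WIDTHxHEIGHT" size string.
IMAGE_SIZES = {
    "square": "1024x1024",     # assumed default square size
    "portrait": "1024x1792",   # taller than wide
    "landscape": "1792x1024",  # wider than tall
}


def image_size(fmt: str = "square") -> str:
    """Return the pixel size string for a named image format."""
    key = fmt.strip().lower()
    if key not in IMAGE_SIZES:
        raise ValueError(f"Unknown format {fmt!r}; choose from {sorted(IMAGE_SIZES)}")
    return IMAGE_SIZES[key]


def aspect_ratio(size: str) -> float:
    """Width divided by height for a size string such as '1792x1024'."""
    width, height = (int(part) for part in size.split("x"))
    return width / height


if __name__ == "__main__":
    for name in IMAGE_SIZES:
        size = image_size(name)
        print(f"{name}: {size} (ratio {aspect_ratio(size):.2f})")
```

In the chat itself, you get the same effect by simply writing "portrait format" or "landscape format" in your request.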
But all of this only partially interests me because I want to see what the AI makes of my poems. Sometimes it amazes me. Occasionally, the results are boring, too similar, or off-topic. That could be due to my poems… If in doubt, you just try again. If you like, you can also describe your desired image, which might lead to success.
But be careful with copyright: DALL-E doesn’t let you do everything. The AI itself says so and provides an example: "I cannot create images in the style of Pablo Picasso, even if they are inspired by his style, due to our content guidelines. These guidelines restrict the creation of images in the style of artists whose latest works were created after 1912, and Picasso's work falls into this category. However, I can create an image that captures the essence of … with a general abstract and cubist approach that is not specific to any one artist. Would you like to proceed with this approach?"
Style Choices
You can dictate the style: tell it to be fantastic, realistic, naive, impressionistic, or whatever comes to mind. The AI has several styles up its sleeve. However, it will refuse to imitate artists whose work is still under copyright. But there are ways to get around that, more or less.
The results are not random, as you’ll immediately notice. These are not arbitrary, abstract compositions; it feels as if someone has genuinely considered the content. Yes, as if the AI understands what’s being conveyed.
Let's try an example. Here is a well-known stanza by Robert Frost (an excerpt).
The Road Not Taken
Two roads diverged in a yellow wood,
And sorry I could not travel both
And be one traveler, long I stood
And looked down one as far as I could
To where it bent in the undergrowth;
This was the prompt I used: Visualize in a simple stylized manner: Two roads diverged in a yellow wood ... etc.
The style of the image strongly resembles linocut or woodcut techniques, as it employs clear, high-contrast lines and sharp edges. The highly stylized trees and path are depicted with bold shading and light-dark contrasts, which are typical of these methods. Additionally, the image carries a symbolic and narrative quality, reminiscent of works that visually depict stories or meaningful choices.
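The prompt used for this example follows a simple pattern: a style instruction, a colon, then the poem text. Here is a rough sketch of that pattern in Python, in case you want to assemble such prompts yourself. The helper names `build_prompt` and `fits_designer_limit` are hypothetical, and the 480-character figure is the Microsoft Designer limit mentioned earlier in this article; none of this is needed if you just paste the poem into ChatGPT.

```python
DESIGNER_PROMPT_LIMIT = 480  # character limit quoted in this article for Microsoft Designer


def build_prompt(poem: str, style: str = "a simple stylized manner") -> str:
    """Combine a style instruction and a poem into one prompt string."""
    # Collapse line breaks and repeated spaces so the poem reads as one line.
    text = " ".join(poem.split())
    return f"Visualize in {style}: {text}"


def fits_designer_limit(prompt: str, limit: int = DESIGNER_PROMPT_LIMIT) -> bool:
    """Check whether a prompt stays within the given character limit."""
    return len(prompt) <= limit


if __name__ == "__main__":
    stanza = """Two roads diverged in a yellow wood,
    And sorry I could not travel both"""
    prompt = build_prompt(stanza)
    print(prompt)
    print("Fits Designer limit:", fits_designer_limit(prompt))
```

Swapping the `style` argument for something like "an impressionistic manner" is the code equivalent of changing the style wording in the chat.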
The Limits of Poem Visualization
It's important to know that the outcome is always different. DALL-E never paints the same picture twice, and ChatGPT never writes the same text. Both are generative models; they don’t use pre-set patterns or databases but create everything live. The images can vary even with the same request and still stay on topic.
It’s not uncommon for the AI to include letters and pseudo-text in its images, which can render them unusable: the text usually looks terrible and is incorrect, since no real language is involved. Telling the AI to stop doing that only partially works; sometimes it just ignores the instruction. In that case, you have to start over, and you are guaranteed to get a different result. Alternatively, you can use a tool like Photoshop: depending on where the text appears, you can remove or erase it.
You may need to experiment a bit. Sometimes, providing additional information or context can help. You can also ask DALL-E how it would visualize a text or prompt. It will then present ideas.
DALL·E: AI Prodigy Paints Pictures from Your Dreams
How Does It Work?
DALL-E has been trained on a large amount of image and text data. This training enables the AI to relate textual descriptions to visual elements. When a user enters a text prompt, DALL-E analyzes that instruction and generates an image that corresponds to the described content. It can produce a wide range of image styles and content, from realistic depictions to abstract artworks. For example: Realistic, Photorealistic, Surrealist, Impressionistic, Abstract, Digital, Pixel Art, Cartoon, Comic Style, Fantasy, Sci-Fi, Minimalist, Vintage, Retro, Hyperrealistic.
And how long does it take? Just seconds, really, but the files aren't optimized for loading times, which delays their output. Additionally, the provider sometimes struggles with capacity issues. If you place too many requests on the system, it refuses to cooperate, gives you scheduled times, or shows error messages. Then, you’ll have to wait.
Censorship and Restrictions
You can also fall victim to a modern-day form of censorship. Certain words are on a blacklist, even if they seem entirely harmless. This can be frustrating, as the AI censor is rather sensitive. You can trick or bypass it, but that takes time and several attempts, which you often don't have, as both OpenAI and Bing limit the number of daily generations.
ChatGPT doesn’t accept every text. There are censorship filters. Profanity, offensive terms, sexuality — none of that is allowed. Some text will be excluded. Annoyingly, the AI goes far beyond what’s necessary. The rejection behavior isn't always predictable; sometimes an explanation helps, and it works after all. Sometimes DALL-E just won’t cooperate. With Microsoft, there's no room for discussion — its filters are even stricter, and entirely harmless things can be banned. It's all about safety, diversity, and inclusion. I call it "peace, love, and harmony."
What About the Copyrights for the Images?
OpenAI says: "As with DALL·E 2, the images you create with DALL·E 3 are yours to use and you don't need our permission to reprint, sell or merchandise them."
Feel free to ask questions about the article or share your own experiences. AI is a fascinating new field that is still being explored. I'm excited to see what we’ll discover.
Adjectives We Associate with AI Visualizations
surreal, futuristic, detailed, hyperrealistic, digital, distorted, algorithmic, creative, eerie, fascinating, dreamlike, complex, fantastic, generative, precise, experimental, flawed, innovative, abstract, dynamic, hypnotic, smooth, synthetic, unexpected