What is the face of the man behind the apple? For almost 60 years, the figure wearing a sombre suit and bowler hat in René Magritte’s painting “The Son of Man” has been obscured by a polished green apple. His facial features were intended to remain a mystery, the fruit an artistic provocation. Today, using new technology, 23-year-old digital artist Josephine Miller can roll the apple away.
Miller tilts her laptop towards me in the hushed café of the British Library in London to show how she used Dall-E 2, software that generates images using artificial intelligence (AI), to remove the fruit. Behind it is a man who looks startled to be suddenly revealed, eyebrows raised and piercing blue eyes staring out over an expertly waxed moustache. The face is painted in Magritte’s somewhat flat style and signature palette, as if the two images were painted by the same hand, side by side.
It’s a neat trick. Then Miller shows me she has generated not one but 200 possible faces. Magritte, a trickster at heart, probably would have approved. The technology, which can create near-infinite artistic combinations in response to a few words or images, has enabled Miller to do work that would either have taken months with previous tools or not have been possible at all. It is dizzying in both its capabilities and its ethical implications. I ask if she finds it overwhelming. “No,” she says immediately. “Well, maybe it is for some people, but I’m just excited.”
Dall-E 2 is just one of several AI image-generation tools that have become available to the public this year. Since the spring, the internet has experienced a Cambrian explosion of every conceivable application of the technology. The only thing more amazing than the tech itself is the wild leaps of imagination of its users: Nosferatu in RuPaul’s Drag Race, Da Vinci’s “The Last Supper” but the apostles are crowding round to take a selfie, the French Revolution as seen from the perspective of a helmet-mounted GoPro camera, a bottle of ranch dressing testifying in court. All of these can be produced in less than a minute without much technical expertise.
And the technology is advancing swiftly. Six months ago most tools struggled to create human faces, usually offering grotesque combinations of eyes, teeth and stray limbs; today you can ask for a “photorealistic version of Jafar from Disney’s Aladdin sunbathing on Hampstead Heath” and get almost exactly what you’re looking for.
All of which is to say this is a pivotal moment in the history of art. AI-generated imagery “is a major disruptive force, and there will be both democratic and oppressive aspects to it”, says British artist Matthew Stone, who used Dall-E 2 in the process of creating artworks for his latest exhibition. Millions of images swarm out of this Pandora’s Box every day and, with them, a number of difficult questions about plagiarism, authorship and labour. Perhaps the biggest of all: is this the end of human creativity?
One of the first things any evangelist will tell you about AI image generation is how easy it is to do. You describe an image using natural language, as you would when talking to another person, and the software serves up several results in a matter of seconds.
Midjourney, a Dall-E rival, offers a free trial accessible via the chat application Discord. Hearing that it excels at images that have a more painterly style, I decide to try and make illustrations for a children’s book I’m working on, about a cat adventuring around the Mediterranean seeking its missing owner. I type in the prompt for my first idea:
/IMAGINE: GINGER CAT AT THE TOP OF A MINARET IN ISTANBUL
The image develops before my eyes like a photograph in a chemical bath, starting out as a blur and gradually gaining definition and coherence.
The first result is not great. The AI has given me a generic tower rather than a recognisable minaret. There is no sense that we are in Istanbul and, worst of all, the cat’s face is grotesquely embedded into the brickwork of the tower itself. This is my first lesson of AI image generation: although the pictures shared on social media often look fantastic, in-progress results can be terrible, whether ugly, generic or barely resembling even a simple prompt.
Since the free trial is located on a public chat server, my cat-minaret is quickly lost in a ceaseless flow of other people’s prompts and images. I watch what they are typing to try to glean some tips. It seems that the more detailed your prompt, the better the results. Several users keep returning to the same idea, tweaking words and phrasing to improve their results. One person keeps iterating on the idea of an “emotional support limpet” and, with each new version, the aquatic snail gets cuter.
I return to my cat prompt and add more detail:
/IMAGINE: GINGER CAT LOOKING WISTFULLY OVER A VIEW OF ISTANBUL FROM THE TOP OF A MINARET WHILE THE SUN SETS, ANIME STYLE
This generates a marked improvement — there’s a gorgeous contrast between rusty orange and deep indigo in the sky, with pointed minarets like needles scratching the rose-hued clouds. Yet the cat is still not right. In one version, it towers over the architecture like an adorable Godzilla. In another, it is normal sized but for some reason white, as if the sunset has leached out all of its colour.
I scrap the cat and go for something more artistic:
/IMAGINE: CARNIVAL CELEBRATION, BEAUTIFUL, GEORGES SEURAT
This composition has a real sense of festivity, but the AI didn’t get the pointillist style I was hoping to draw from the “Seurat” reference. I try the same prompt with the word “pointillism” and strike gold, with a soft-hued abstraction of clown-like figures at a fairground. There is a clicky, game-like satisfaction to plucking a random sentence from your imagination and seeing how the AI deals with it, and I spend hours testing out all manner of prompts.