ChatGPT or Shakespeare? Readers Couldn’t Tell the Difference—and Even Preferred A.I.-Generated Verse
A new study suggests people might like chatbot-produced poems for their simple and straightforward images, emotions and themes
If all the world’s a stage and all the men and women merely players, where does that leave non-human figures, like artificial intelligence chatbots? As it turns out, A.I. can hold its own against humans—even the Bard himself—when it comes to writing poetry.
A.I. chatbots can imitate famous poets like William Shakespeare well enough to fool many human readers, according to a new paper published Thursday in the journal Scientific Reports. In addition, many study participants actually preferred the chatbot’s poetry over the works of renowned writers.
Researchers asked OpenAI’s ChatGPT-3.5 to generate poems in the style of well-known authors, including Walt Whitman, Geoffrey Chaucer, T.S. Eliot, Sylvia Plath, Allen Ginsberg, Emily Dickinson and William Shakespeare.
Then, they gathered 1,634 study participants and asked them each to read ten poems: five written by a human poet and five written by the chatbot in the style of that same poet. Each participant was randomly assigned one poet.
When scientists asked participants to identify which poems were fake and which were real, the participants guessed correctly around 46 percent of the time—just a little bit worse than if they’d flipped a coin instead. This finding wasn’t necessarily a surprise, since ChatGPT-3.5 was likely trained on the works of the famous poets.
“Essentially, ChatGPT has displayed its skill as a quasi-plagiarist,” says Keith Holyoak, a cognitive psychologist at the University of California, Los Angeles, who was not involved with the study, to New Scientist’s Jeremy Hsu.
In a second experiment, researchers asked a different set of 696 participants to read and rate poems on 14 qualities, ranging from rhythm to originality. They told one-third of the participants they were reading poems written by an A.I. chatbot and another third that they were reading works written by a human. For the final third of participants, the scientists didn’t share anything about the poems’ authorship. In reality, participants in all three groups were given a mix of poems written by humans and by A.I.
As expected based on past research, the participants who believed they were reading poems written by humans gave higher ratings than participants who believed they were reading A.I.-generated poems—regardless of what they were actually reading.
But the team also uncovered a surprise: The participants who didn’t know anything about the poems’ origins gave higher ratings, on average, to those written by the chatbot.
Why do readers seem to prefer A.I.-generated poetry? It’s not entirely clear, but the researchers’ best guess is that the A.I. poems may be more appealing because they are relatively straightforward and simple to comprehend.
Because A.I.-generated poems cannot match the complexity of human-authored verse, they are better at “unambiguously communicating an image, a mood, an emotion or a theme to non-expert readers of poetry,” the researchers write in the paper.
For example, they write, the chatbot’s Plath-style poem is clearly about sadness.
Beyond the themes and emotions, ChatGPT’s poems were also simpler in terms of their overall structure and composition.
“Emily Dickinson sometimes breaks the expected rhyme scheme on purpose,” says study co-author Brian Porter, a researcher at the University of Pittsburgh, to New Scientist. “But the A.I.-generated poems generated in her style never did that once.”
Understanding poems written by humans requires deep, critical thinking—and that’s a big part of poetry’s appeal, the researchers write in the paper. But modern readers don’t seem to want to do this labor, instead preferring texts that give them “instant answers,” as Andrew Dean, a literary scholar at Deakin University in Australia who was not involved with the study, writes in the Conversation.
“When readers say they prefer A.I. poetry, then, they would seem to be registering their frustration when faced with writing that does not yield to their attention,” he adds.
In some instances, participants might have mistaken the complexity of human poetry for A.I. incoherence. In other words, they could have been so confused by the genuine, human-authored work that they convinced themselves it must be garbled chatbot nonsense.
This theory seems to be supported by participants’ responses to T.S. Eliot’s “The Boston Evening Transcript,” reports the Washington Post’s Carolyn Y. Johnson. The poem, a satire about the readers of a once-popular newspaper, was the work most frequently misidentified as A.I.-generated. After reading Eliot’s words, one participant even wrote, in all caps, “IT DIDNT MAKE SENSE TO ME OR COME FROM SOMEONE THAT HAS FEELINGS.”
The study’s findings seem to confirm one of many onlookers’ biggest fears about A.I.: that it will one day replace human artists and put them out of work. But Dorothea Lasky, the only living poet whose writings were included in the experiments, says it’s not necessarily a bad thing that readers enjoyed the A.I.-generated poems.
“Poetry will always be necessary,” Lasky tells the Washington Post. “If these people in the study read A.I. poems and liked that poem better than a human-generated poem, then that, to me, is beautiful. They had a good experience with a poem, and I don’t care who wrote it. I feel there is room for all poets—even robot poets.”