Can you tell an AI voice from a real human?

Scientists reveal that AI-cloned voices are becoming indistinguishable from real ones

People are unable to tell if a voice speaking to them is that of a real person or an AI clone, scientists have found.
AI audio cloning is now so advanced that it is able to create entire paragraphs from small snippets of recordings and make a voice that is indistinguishable from a real human.
Realistic computer voices have raised concerns from experts that they could be used by scammers imitating banks and also help make convincing deepfakes that spread fake news.
Data from a study of 100 people by University College London found that when people are played audio clips out loud, they are unable to tell which is human and which is AI.
People were played a phrase twice – once from a clone and once from a real recording of a person – and asked which one they thought was authentic.
Participants only answered correctly 48 per cent of the time.
People were better at recognising AI when it was impersonating someone they knew, with correct identification 88 per cent of the time when it was the voice of a friend.
Prof Carolyn McGettigan, the study’s author and chairman of speech and hearing sciences at UCL, is set to publish the findings in a journal and has presented them at the British Science Festival.
“What we’ve found is that for people who know the original voice, they are actually quite sensitive to whether what they’re hearing is a clone or an authentic recording. But when it comes to a stranger’s voice, they’re basically guessing,” Prof McGettigan said.
“What we’re seeing now is that the technology is good enough to mean that listeners may be unable to tell if what they’re listening to is the voice of a real person or not.”
A recording of Aesop’s fable, The North Wind and the Sun, was played aloud to journalists who were asked to replicate the study and say if the audio clip was real or fake.
Every person in the ad-hoc experiment believed the recording to be genuine, but Prof McGettigan revealed it was a chimaera, with excerpts of both AI and human interwoven into one clip.
“You were hard-pressed to think that this was a computer-generated voice in any part, and you probably wouldn’t have thought there were two different sources of speech in there,” Prof McGettigan said.
“Synthetic voices can sound very, very human-like. You were all pretty convinced that it was human all the way through.”
The technology is now so readily available and competent that people and companies are considering letting users choose an AI clone of a specific voice for smart assistants like Siri and Alexa, or to read audiobooks to them.
Prof McGettigan added that there are serious ethical questions about how to deploy and regulate this technology, and about how to protect people not only from deception but also from a Black Mirror-style use of audio technology to recreate loved ones.
“This has serious ethical implications that we all need to consider – the technology already exists, so it’s up to us to decide how we best make use of it,” Prof McGettigan said.
“I think it is realistic to say that any kind of technology will always be apt to be abused, regardless of its benefits.
“It seems like there are probably ways in which, as a whole society, we need to think about the ways in which we evaluate information.
“I think there are lots of possibilities for harm in these kinds of technologies that might seek to replicate a person’s identity if they were used for nefarious purposes, but I suppose the question is to what extent would minimising the harms also interrupt the potential benefit.”
