New AI Mimics Any Voice in a Matter of Minutes

The story starts out like a bad joke: Obama, Clinton and Trump walk into a bar, where they applaud a new Montreal-based startup called Lyrebird.

If the scenario seems too bizarre to be real, you’re right—it’s not. The entire recording was generated by a new AI with the ability to mimic natural conversation, at a rate much faster than any previous speech synthesizer.

Announced last week, Lyrebird’s program analyzes a single minute of voice recording and extracts a person’s “speech DNA” using machine learning. From there, it adds an extra layer of emotion or special intonation until it nails a person’s voice, tone and accent, be it Obama, Trump or even you.

Lyrebird still retains a slight but noticeable robotic buzz characteristic of machine-generated speech. But add some smartly placed background noise to cover up the distortion, and the recordings could pass as genuine to unsuspecting ears.

Creeped out? You’re not alone. In an era where Photoshopped images run wild and fake news swarms social media, a program that can make anyone say anything seems like a catalyst for more trouble.

Yet people are jumping on. According to Alexandre de Brébisson, a founder of the company and current PhD student at the University of Montreal, their website scored 100,000 visits on launch day, and the team has attracted the attention of “several famous investors.”

While machine-fabricated speech sounds like something straight out of a Black Mirror episode, speech synthesizers—like all technologies—aren’t inherently malicious.

For people with speech disabilities or paralysis, these programs provide a voice with which to communicate. For the blind, they provide a way to tap into the vast text-based resources on paper or online. AI-based personal assistants like Siri and Cortana rely on speech synthesizers to create a more natural interface with users, while audiobook companies may one day use the technology to generate products automatically and cheaply.

“We want to improve human-computer interfaces and create completely new applications for speech synthesis,” explains de Brébisson to Singularity Hub.

Lyrebird is only the latest push in a long line of research towards natural-sounding speech synthesizers.

The core goal of these programs is to transform text into speech in real time. It’s a two-pronged problem: for one, the AI needs to “understand” the different components of the text; for another, it has to generate appropriate sounds for the input text in a way that is not cringe-inducing.

Analyzing text may seem like a strange way to tackle speech, but much of our intonation for words, phrases and sentences is based on what the sentence says. For example, questions usually end with a rising pitch, and words like “read” are pronounced differently depending on their tense.
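
To make the text-analysis step concrete, here is a deliberately crude sketch of how a front end might annotate a sentence before any sound is produced. It is not how Lyrebird or any production synthesizer works: the cue-word list, the function name `analyze` and the phoneme strings are assumptions chosen for illustration, and the tense heuristic will misfire on many real sentences.

```python
import re

# Toy pronunciations for the heteronym "read" (ARPAbet-style strings).
READ_PRESENT = "R IY D"  # rhymes with "reed"
READ_PAST = "R EH D"     # rhymes with "red"

# Hypothetical cue words hinting at past tense; a real system would
# use a part-of-speech tagger rather than a word list.
PAST_CUES = {"yesterday", "already", "had"}


def analyze(sentence: str) -> dict:
    """Return a crude intonation and pronunciation annotation."""
    words = re.findall(r"[a-z']+", sentence.lower())
    # Following the article's rule of thumb: questions end on a rising pitch.
    contour = "rising" if sentence.strip().endswith("?") else "falling"
    pronunciations = {}
    if "read" in words:
        is_past = any(word in PAST_CUES for word in words)
        pronunciations["read"] = READ_PAST if is_past else READ_PRESENT
    return {"contour": contour, "pronunciations": pronunciations}


question = analyze("Did you read it yesterday?")
statement = analyze("I read every day.")
```

Even this toy version shows why the text has to be inspected first: the same four letters map to different sounds depending on the words around them.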

But of the two, generating the audio output is arguably the harder task. Older synthesizers rely on algorithms to produce individual sounds, resulting in the characteristic robotic voice.

These days, synthesizers generally start with a massive database of audio recordings by actual human beings, splicing together voice segments smoothly into new sentences. While the output sounds less robotic, for every new voice—switching from female to male, for example—the software needs a new dataset of voice snippets to draw upon.

Because the voice databases need to contain every possible word the device uses to communicate with its user (often in different intonations), they’re a huge pain to construct. And if there’s a word not in the database, the device stumbles.
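
The database approach, and its failure mode, can be sketched in a few lines. This is a simplified illustration under stated assumptions: the `VOICE_DB` contents and the `splice` function are hypothetical, snippets are stand-in lists of sample values rather than real audio, and real systems splice smaller units and smooth the joins.

```python
from typing import Dict, List

# Hypothetical voice database: each word maps to a recorded snippet,
# represented here as a short list of audio sample values.
VOICE_DB: Dict[str, List[float]] = {
    "hello": [0.1, 0.3, 0.2],
    "world": [0.4, 0.0, -0.2],
}


def splice(text: str, db: Dict[str, List[float]]) -> List[float]:
    """Concatenate the stored snippet for each word, in order."""
    audio: List[float] = []
    for word in text.lower().split():
        if word not in db:
            # The device "stumbles": there is nothing recorded to play.
            raise KeyError(f"no recording for word: {word!r}")
        audio.extend(db[word])
    return audio


greeting = splice("Hello world", VOICE_DB)
```

Calling `splice("hello lyrebird", VOICE_DB)` raises a `KeyError`, and a second voice would need an entirely separate `VOICE_DB`, which is the maintenance burden the paragraph above describes.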

Lyrebird takes a different approach. By listening to voice recordings, the AI learns the pronunciation of letters, phonemes and words. Like someone learning a new language, Lyrebird then uses its learned examples to extrapolate to new words and sentences, even ones it has never encountered before, and layers emotions such as anger, sympathy or stress on top.

At its core, Lyrebird is a multi-layer artificial neural network, a type of software that loosely mimics the human brain. Like their biological counterparts, artificial networks “learn” through example, tweaking the connections between each “neuron” until the network generates the correct output. Think of it as tuning a guitar.
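
The “tweaking” can be illustrated with the smallest possible network: a single connection whose weight is nudged after every example. This is a generic gradient-descent sketch, not Lyrebird’s model; the function name `train_neuron`, the learning rate and the toy data are all assumptions made for the example.

```python
def train_neuron(examples, lr=0.05, epochs=200):
    """Fit output = weight * x by repeatedly nudging the weight."""
    weight = 0.0
    for _ in range(epochs):
        for x, target in examples:
            error = weight * x - target
            # Gradient of 0.5 * error**2 with respect to the weight.
            weight -= lr * error * x
    return weight


# Toy examples generated by the rule "output is twice the input".
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
learned = train_neuron(examples)
```

After training, `learned` sits very close to 2.0: the network was never told the rule, it only saw examples and adjusted its one connection until the outputs matched. A speech model does the same thing with millions of connections.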

Similar to other deep learning technologies, the initial training requires hours of voice recordings and many iterations. But once trained on one person’s voice, the AI can produce a passable imitation of another voice, at a rate of thousands of sentences per second, using just a single minute of a new recording.

That’s because different voices share a lot of similar information that is already “stored” within the artificial network, explains de Brébisson. So it doesn’t need many new examples to pick up on the intricacies of another person’s speaking voice—his or her voice “DNA,” so to speak.
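
A toy analogy shows why shared structure cuts the amount of new data needed. The sketch below is not Lyrebird’s architecture, which the article does not detail. It assumes, purely for illustration, that two “speakers” follow lines with the same slope but different offsets, so the slope learned from one can be frozen and reused for the other.

```python
def fit_shared_model(samples):
    """Least-squares slope and offset from many (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    var = sum((x - mean_x) ** 2 for x, _ in samples)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def adapt_offset(slope, few_samples):
    """Keep the shared slope frozen; estimate only the new offset."""
    return sum(y - slope * x for x, y in few_samples) / len(few_samples)


# Speaker A: plenty of data following y = 3x + 1.
speaker_a = [(float(x), 3.0 * x + 1.0) for x in range(100)]
slope, offset_a = fit_shared_model(speaker_a)

# Speaker B shares the slope but follows y = 3x + 5.
# Two samples are enough because only the offset is unknown.
speaker_b = [(1.0, 8.0), (2.0, 11.0)]
offset_b = adapt_offset(slope, speaker_b)
```

The expensive part (learning the slope from 100 points) happens once. Adapting to the second speaker needs only a couple of observations, which is the spirit of the one-minute claim, if not its mechanism.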

Although the generated recordings still have an uncanny valley quality, de Brébisson stresses that it’ll likely go away with more training examples.

“Sometimes we can hear a little bit of noise in our samples, it’s because we trained our models on real-world data and the model is learning the background noise or microphone noise,” he says, adding that the company is working hard to remove these artifacts.

Adding little “extra” sounds, like lip smacking or taking a breath, could also add to the realism of machine speech.

These “flaws” actually carry meaning and are picked up by the listener, says speech researcher Dr. Timo Baumann at Carnegie Mellon University, who is not involved with Lyrebird.

But both de Brébisson and Baumann agree that the hurdles are simple. Machines will be able to convincingly copy a human voice in real time in just a few years, they say.

De Brébisson acknowledges that mimicking someone else’s voice can be highly problematic.

Fake news is the least of it. AI-generated voice recordings could be used for impersonation, raising security and privacy concerns. Voice-based security systems would no longer be safe.

While Lyrebird is working on a “voice print” that will make it easy to distinguish originals from generated recordings, it’s unreasonable to expect people to look for such a mark in every recording they come across.

Then there are slightly less obvious concerns. Baumann points out that humans instinctively trust sources with a voice, especially if it’s endowed with emotion. Compared to an obvious synthetic voice, Lyrebird is much easier to connect with, like talking to an understanding friend. While these systems could help calm people down during a long wait on the phone, for example, they’re also great tools for social engineering.

People would be more likely to divulge personal information or buy things the AI recommends, says Baumann.

In a brief statement on their website, Lyrebird acknowledges these ethical concerns, but also stresses that ignoring the technology isn’t the way to go; rather, education and awareness are key, much like when Photoshop first came into the social consciousness.

“We hope that everyone will soon be aware that such technology exists and that copying the voice of someone else is possible,” they write, adding that “by releasing our technology publicly and making it available to anyone, we want to ensure that there will be no such risks.”

It is too optimistic of Lyrebird to discount the risks completely. Without doubt, fake audio clips are coming, and left unchecked, they could wreak havoc. But although people are still adapting to fake images, fake news and other distorted information that warps our reality, the discussion about alternative facts has entered the societal mainstream, and forces have begun pushing back.

Like the delicate voice-mimicking songbird it’s named after, Lyrebird is a wonder—one that we’ll have to handle with thought and care.


Transcendent Man
Posted by Transcendent Man | 
