OpenAI says its Voice Engine can mimic speaker with single 15-second audio sample

OpenAI's new Voice Engine can generate natural-sounding speech that closely resembles the original speaker. The company acknowledges that the model carries serious risks.

Written By: Om Gupta

Published: March 30, 2024 8:37 IST, Updated: March 30, 2024 8:37 IST

New Delhi:

OpenAI is expanding its text-to-speech offering with a new tool. The company recently ran a small-scale preview of Voice Engine, a voice-cloning technology that needs only a 15-second audio sample to mimic a speaker. According to the company, it generates “natural-sounding speech” with “emotive and realistic voices”.

Voice Engine has been in development since 2022 and underlies OpenAI’s existing text-to-speech API. The company already uses a version of it to power the preset voices in that API and in the Read Aloud feature.

According to the company, the technology could help with reading assistance and language translation, and could support people with sudden or degenerative speech conditions.

However, the technology carries risks of its own. Bad actors could use it for fraud, scams, and similar abuse, which are already a problem. The company is aware of these risks. In a blog post, it wrote, “We recognize that generating speech that resembles people's voices has serious risks, which are especially top of mind in an election year”.

The company says it is gathering feedback from US and international partners across government, media, entertainment, education, civil society, and other sectors to minimize the risks involved in launching the product. According to the company, all preview testers have agreed to OpenAI's usage policies, which prohibit impersonating an individual without their consent or legal right.

In addition, the company has asked testers to disclose to their audience that the voices are AI-generated. It has also implemented safety measures such as watermarking “to trace the origin of any audio generated by Voice Engine” and “proactive monitoring” of how the tool is used.

OpenAI has not said when the product will roll out. According to the company, however, there will be a no-go voice list to detect and prevent the creation of voices that are too similar to prominent figures.
