OpenAI can clone a voice with just 15 seconds of audio

OpenAI has just announced a voice cloning technology called Voice Engine that can imitate any speaker from an audio sample of just 15 seconds. The company claims that it generates “natural-sounding speech” with “emotional and realistic voices”.

The technology builds on the company’s existing text-to-speech API and has been in development since 2022. OpenAI already uses a version of the toolset to power the preset voices available in the current text-to-speech API and in the read-aloud feature.

OpenAI says the technology could be used to read texts to children in a familiar voice, translate between languages, or help people suffering from sudden or degenerative speech disorders.

Despite the potential benefits, this technology can also be used to generate deepfakes, which are already a problem today. Voice Engine is not yet ready for general release, as there are serious privacy concerns that need to be addressed first.

OpenAI acknowledges that this technology carries significant risks that are particularly concerning in an election year. The company says it is taking feedback from “U.S. and international partners in government, media, entertainment, education, civil society and beyond” to ensure that the product is brought to market with the lowest possible risk.

Everyone who has tested the technology so far has agreed to OpenAI’s usage policies, which prohibit impersonating another person without consent or legal right.

In addition, anyone who uses the technology must disclose to their audience that the voices are generated by artificial intelligence. OpenAI has implemented safety measures such as watermarking to trace the origin of audio and “proactive monitoring” of how the system is used. When the product officially launches, there will be a “no-go voice list” that detects and blocks AI-generated voices that are too similar to prominent figures.

Voice Engine could cost $15 for a million characters, which is about 162,500 words. Marketing materials also refer to an “HD” version that costs twice as much, but the company hasn’t detailed how that will work.
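
To put those figures in perspective, here is a rough cost sketch. It assumes the reported rate of $15 per million characters, treats the “HD” version as exactly double that rate, and reuses the article’s ratio of about 162,500 words per million characters. The function and constant names are hypothetical and are not part of any OpenAI API, and the final pricing may differ.

```python
# Figures taken from the article; actual pricing has not been confirmed.
STANDARD_USD_PER_MILLION_CHARS = 15.0
HD_MULTIPLIER = 2.0  # assumption: "twice as much" means exactly 2x
WORDS_PER_MILLION_CHARS = 162_500  # the article's approximate ratio


def estimate_cost_usd(num_chars: int, hd: bool = False) -> float:
    """Estimate the cost in USD of synthesizing num_chars characters."""
    rate = STANDARD_USD_PER_MILLION_CHARS * (HD_MULTIPLIER if hd else 1.0)
    return num_chars / 1_000_000 * rate


def estimate_words(num_chars: int) -> int:
    """Convert a character count to an approximate word count."""
    return round(num_chars * WORDS_PER_MILLION_CHARS / 1_000_000)


if __name__ == "__main__":
    chars = 500_000  # e.g. a long audiobook manuscript
    print(f"~{estimate_words(chars):,} words")
    print(f"standard: ${estimate_cost_usd(chars):.2f}")
    print(f"HD:       ${estimate_cost_usd(chars, hd=True):.2f}")
```

Under these assumptions, a text of 500,000 characters would cost half the per-million rate in the standard tier and the full standard per-million rate in the HD tier.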
