Adobe Voco was supposed to be audio editing and generation software. It was dubbed “Photoshop for voice,” but it never came to fruition because of ethical and legal concerns.
However, other alternatives do exist. If you are looking for an Adobe Voco alternative, you have come to the right place.
Today, I will go over the top 10 alternatives to Adobe Voco.
What Is Adobe Voco? And Why Was It Scrapped?
Adobe Voco was supposed to be audio editing software. Just as Photoshop allows you to edit pictures, Adobe Voco was supposed to allow you to edit audio recordings.
Adobe Voco would take a sample of someone's speech and clone it, allowing you to change their words and even add new phrases.
However, legal and ethical concerns stopped the software from ever being released. Although the prototype was first unveiled in 2016, Adobe later decided not to release it.
Adobe Voco would have allowed people to alter audio recordings to include phrases and sentences that the speaker never actually uttered.
For example, according to the BBC, at the demo unveiling of Adobe Voco in 2016, the phrase “and I kissed my dogs and my wife” was altered to “and I kissed Jordan three times”.
Adobe was worried that Voco could be used to manipulate media and that it might lead to mistrust in journalism. There was also a concern that it could prevent lawyers and other professionals from using digital media as evidence.
Other problems can arise from voice editing as well. For example, voice recognition is often used to confirm someone’s identity, but if you can create a synthetic voice or transform your voice into someone else’s voice, it paves the way for hacking.
In addition, Adobe was worried that it might be exposed to liability if Adobe Voco was used in a harmful or illegal manner.
However, although Adobe Voco was never officially released, other software that does similar things exists. The fact that Adobe Voco was never released paved the way for these alternatives, though not many are available, for the same reasons Adobe Voco never came into existence.
These software programs are what I am going to focus on in this article.
Best Adobe Voco Alternatives
1. Resemble AI
Resemble AI is a great alternative to Adobe Voco. It was actually designed for people to build their own synthetic voices for purposes such as voice overs and automated customer service.
To build your own voice, upload an existing recording or record 50 new samples.
However, if you have an existing audio recording of someone else’s voice, you can upload that too to clone the voice and create edited recordings. For legal and ethical reasons, you must have consent from the person whose voice you are cloning or editing.
Once you have built a voice (you will need at least five minutes of data), you can use text to speech to create synthetic audio.
There is even a mobile app that will allow you to create synthetic audio using text to speech on the go. You can do work on the app without an active internet connection.
With Speech Gradients, you can create many iterations of the same voice until you land on the one that has the right tone of voice and emotional slant. You can even clone your voice and create synthetic audio in different languages, including Spanish, English, Dutch, German, French, and Italian.
This has many implications. If someone made a recording of instructions in English, for example, simply upload it to convert the instructions, in their same voice, to German, Spanish, or another language.
This can be useful in many situations. Here are some ideas:
- Call centers can use it to create neural text to speech recordings in different languages
- Movie directors and producers can release the same movie in different languages, in the voices of the original actors, by using synthetic audio of their voices in other languages
- Provide training to your employees who live in different countries
- Create audio and video for your social media channels in different languages
- Create advertisements in different languages, without even speaking those languages
The Resemble AI Fill tool allows you to edit existing audio by removing certain words and replacing them with new words. You can try a quick demo of this software on this page.
Resemble AI also offers a number of services for businesses, including:
- Script writing
- Data collection
- Film and animation
- Advertising support
The best part is that plans start at just $30/month for those who record voices on the platform. This plan is for up to 100,000 characters a month (this is around one hour of generated synthetic audio); after that, it will cost $0.0005 per character.
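To get a feel for how the per-character pricing adds up, here is a minimal Python sketch based on the figures quoted above. It assumes the $30 base fee covers the first 100,000 characters and that only the characters beyond that are billed at $0.0005 each. The function name and that reading of the overage rule are my own assumptions, so confirm the details on the pricing page.

```python
def estimate_monthly_cost(
    characters: int,
    base_fee: float = 30.0,         # monthly plan price quoted in this article
    included: int = 100_000,        # characters covered by the base fee
    overage_rate: float = 0.0005,   # price per extra character quoted in this article
) -> float:
    """Estimate a month's bill, assuming overage applies only past the included quota."""
    extra_characters = max(0, characters - included)
    return round(base_fee + extra_characters * overage_rate, 2)


if __name__ == "__main__":
    for usage in (50_000, 100_000, 250_000):
        print(f"{usage:>7} characters: ${estimate_monthly_cost(usage):.2f}")
```

Under these assumptions, staying within the quota costs only the base fee, while every additional 100,000 characters adds another $50 to the bill.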
If you want to upload your existing recordings, you will have to apply for a custom plan. You can always request a demo before starting.
Add-ons are also available. On the standard plan, you can add another voice for $30/month per voice.
Full service voice AI is also available, and if you wish, you can take advantage of the Voice Talent service to be matched with voice talent for your project.
Check updated pricing information on the Resemble AI pricing page, as prices are subject to change.
2. Real Time Voice Cloning
Real Time Voice Cloning is software that you can use to clone voices. You can either clone your own voice or clone other voices.
You only need to speak for five seconds for the software to recognize your voice. Then, it will be able to create audio that sounds like your voice.
The software implements the approach described in the paper “Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis” and pairs it with a vocoder.
With just a few seconds of audio, the software can create a numerical representation of a voice. The result is that it can synthesize speech that sounds surprisingly similar to the original voice, even though the original person never uttered those words.
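To make the idea of a numerical representation of a voice more concrete, here is a toy Python sketch. The three-number vectors are made up for illustration (real voice representations are much longer and come from a trained neural network), and the function and variable names are hypothetical rather than part of the Real Time Voice Cloning code. It shows how cosine similarity, a common way of comparing such vectors in speaker verification, scores two clips of the same voice higher than clips of different voices.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up "voice vectors" for illustration only.
speaker_a_clip_1 = [0.9, 0.1, 0.3]
speaker_a_clip_2 = [0.8, 0.2, 0.35]
speaker_b_clip_1 = [0.1, 0.9, 0.2]

same_voice = cosine_similarity(speaker_a_clip_1, speaker_a_clip_2)
different_voice = cosine_similarity(speaker_a_clip_1, speaker_b_clip_1)

if __name__ == "__main__":
    print(f"same speaker:      {same_voice:.3f}")
    print(f"different speaker: {different_voice:.3f}")
```

A score close to 1 means the two vectors point in nearly the same direction, which is why clips from the same speaker land near each other.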
You can watch this YouTube video to see the software in action with different voices. The video also includes instructions on how to use the Real Time Voice Cloning software.
The software is available for download on GitHub. As mentioned, you can either record your own voice or upload an audio file of someone else’s voice, though having three audio files of a voice is best.
However, the software is not being maintained and can be difficult to set up. As such, it is best if you have some technical knowledge before using it.
For updates on the maintenance of the software, go to this thread.
3. Deep Voice
Baidu Research’s Deep Voice project is a speech synthesis and voice cloning project, an open source implementation of which can be downloaded from GitHub.
The Deep Voice project focuses on teaching machines (AI) how to clone voices and sound more natural with just a few voice samples. The researchers have used it to clone single voices and multiple voices from just a few utterances.
You can learn more about the Deep Voice project on the Baidu Research blog. For an open source implementation of Deep Voice 3, go to GitHub (for both multiple speaker and single speaker cloning and text to speech).
4. WaveNet/Tacotron 2
WaveNet is a model from AI giant DeepMind that claims to be able to mimic any human voice and sound more natural than the best text to speech systems.
Tacotron 2, a text to speech system from Google, uses a modified WaveNet as its vocoder. An open source implementation of the WaveNet vocoder is available here on GitHub, and the Tacotron-2 TensorFlow implementation is available here on GitHub.
5. Vocaloid
Vocaloid is a singing synthesizer. It is a bit different from what Adobe Voco was intended to be, but you can use it to synthesize and edit singing audio.
Vocaloid Standard, which currently costs $191.44 plus tax, gives you four voice banks (in English and Japanese), while Vocaloid Premium, which costs $306.31, gives you eight voice banks.
You will also get more than 1,000 vocal phrases, 1,000 audio samples, 100 preset singing styles, and 11 audio effects.
Just enter your lyrics (or choose from the preset phrases) and choose a melody to create a synthetic vocal and song.
In addition to editing lyrics and creating vocals by using the existing phrases, you can adjust the emotional tone of voice of your vocal.
Again, Vocaloid is only a good alternative to Adobe Voco if you are looking for software that you can use to create and edit songs.
Always check the Vocaloid homepage for updated pricing information.
6. Descript
Descript’s Overdub is an “ultra realistic” voice cloning tool that is a great alternative to Adobe Voco for those who were frustrated after waiting for the release of Adobe Voco and never seeing it happen.
Descript is video and audio editing software that works like a document. In addition to the screen recorder, it includes a text to speech tool (this is the Overdub part), remote recording, transcription, and other tools.
A great reason to use Descript is for correcting your own recordings. If you made a recording but misspoke, you can simply retype it and your recording will be fixed.
You can also use Descript to remove filler words or create several versions of your recording.
When you fix your recording, Overdub will automatically make the new words match the tone of voice of your entire sentence. This way, it won’t sound weird or too synthetic.
You also get the option of switching between different voice styles.
There are several stock voices built into Overdub that you can use to create voiceovers with voices that are not your own.
Overdub is built on Lyrebird AI technology.
One thing you should know about Overdub is that you can only clone your own voice. Descript imposes this restriction to avoid the ethical and legal problems that can arise from being able to clone someone else’s voice.
However, as mentioned, if you would like to create synthetic audio using a voice that is not your own, you can use one of the built in stock voices instead.
Here is a summary of the different tools, services, and features of Descript:
- Screen recording: As you record, the software will automatically transcribe your words.
- Professional Transcription: Professional transcription services are available for just $2 a minute.
- Remote recording: Record Zoom, Skype, and other calls and have them automatically transcribed as they are recorded. You can use remote recording to create, publish, and share podcasts, and you can remove filler words like ah and um while you are at it. Create video visualizations of your podcast to publish on YouTube and other sites. These video visualizations are also called audiograms.
- Live collaboration: When working on podcasts, you can collaborate with others and work on the same text in real time. All changes will be synced to the cloud for everyone to access, but you can always go back to a previous version. People working on a project can add comments and tag others. You can edit recordings, add fades and other effects, and combine multiple tracks into a single transcript.
- Video editing: In addition to editing audio files, you can edit videos. You can add elements and shapes, zoom in, create animations, insert transitions between scenes and sections, add captions and title cards, and more. When editing your videos, you can add a voiceover by writing text (you can edit your voiceovers and video audio by simply editing the text).
- Overdub: This is the cloned text to speech software. Create a synthetic version of your voice that you can control by simply typing and editing text.
- Filler word removal: When editing recordings, you can remove all filler words at once with a single click. This will make you sound more confident and authoritative. It is hard to avoid saying filler words – we are only human – but Descript makes it easy to cut them out quickly.
- Publish: Once you have created and edited your audio file, you will get a URL that you can use to share it with others. You can also embed your audio files or videos directly on your website, such as in a blog or in a help section.
- Subtitles: When creating videos, have them automatically transcribed and subtitled for users to enjoy. Choose between different fonts, colors, and styling.
The best part about Descript is that there is a free plan. It is free forever, but it is only for screen recording and limited editing, and you can only do 20 screen recordings and create up to three hours of transcription (this is three hours per lifetime, not per month).
Other plans are:
- Creator: This plan costs $15/month ($12/month when billed yearly). It gives you 10 hours of transcription per month and unlimited projects and screen recordings.
- Pro: Pro costs $30/month (or $24/month when billed yearly). It includes the other services I mentioned, like Overdub (cloning and text to speech), publishing, filler word removal, and more. It also gives you 30 hours of transcription a month.
- Enterprise: This is a custom plan. You will get perks like a dedicated account representative, invoicing, onboarding and training, and more.
You can get a seven day trial of Descript Pro. Always check the Descript pricing page for updated pricing information.
Descript is based on Lyrebird AI. If you see Lyrebird mentioned on the internet as an alternative to Adobe Voco, be aware that Lyrebird is now part of Descript.
7. CereVoice Me
CereVoice Me is text to speech voice cloning software. It allows you to create a text to speech audio file that sounds exactly like your own voice.
The good part about CereVoice Me, which was created by CereProc, is that it is web based. This means that unlike many other Adobe Voco alternatives that were mentioned in this list, you do not have to download software to use CereVoice Me; instead, you can use it from any computer or device.
Also, CereVoice Me is available in Spanish, French, Swedish, Italian, and Romanian, in addition to English.
When you buy CereVoice Me, CereProc will actually send you a special microphone to record your voice. You can keep this microphone, as it is included in the price.
You will need a quiet area to record your voice so CereVoice Me can build it. You will read a script into the microphone that CereProc sends you.
You can only record and build one voice with CereVoice Me. Perhaps the reason for this is so that you do not use it to clone other people’s voices.
CereVoice Me, however, is quite expensive. It comes with a one time fee of £499.99, and that only includes one voice.
Always check the CereVoice Me pricing page for updated pricing information.
CereProc also offers a voice creation service. This can be from your voice, or it can be a custom voice.
You can use this voice creation service to create a voice for voice overs and customer service instructions.
There are also various voices available for sale on the CereProc website, which you can buy on demand.
8. Replica Studios
Replica Studios captures your own voice through recordings, including your speech patterns and emotional tone of voice, and uses that to create voice overs. You can read one of their prewritten scripts to record your voice, or you can simply upload existing recordings of your voice so they can clone it.
Replica Studios will need around an hour’s worth of your voice recordings in order to create a voice clone. Once the clone is created, use text to speech to create synthetic vocals.
You can also choose from a variety of voices that exist in the library and edit them by changing the voice styles. For example, you can make a happy narration or an angry narration.
Although you are not limited to cloning your own voice, you will need consent from anyone whose voice you are cloning. Whether it is a friend or a talent you hired, you will need explicit permission to avoid legal and ethical problems.
9. iSpeech
iSpeech’s cloning technology allows you to personalize your character in a game or make your website speak with your own voice. It can be used to create audiobooks, advertising material, or video voice overs.
iSpeech has a number of services, some of which are rather fun. For example, there is iSpeech Obama, which is a clone of Obama’s voice and which comes with mobile apps.
You can use iSpeech Obama to create a vocal of Obama saying anything you want. You can use iSpeech Bush for the same purpose.
10. Respeecher
Respeecher is a tool that allows you to transform your voice into anyone else’s voice. For example, you can have an adult’s speech be transformed into a child’s speech, or you can say a sentence and have it transformed into the president’s voice.
The Respeecher software will pick up on your emotions and tone of voice and transfer them to the new vocal it creates.
Voice replication is useful not only for filmmakers and other professionals but also for people who want to use it for fun, personal reasons.
Wrapping It Up: What Is The Best Adobe Voco Alternative?
The best Adobe Voco alternative is Resemble AI. There are some other great alternatives out there as well, but not many exist.
Adobe Voco was never released for a reason. While voice cloning has many great applications and can make people’s lives easier, it also paves the path for manipulation and bad intentions.
That is why it is important to use voice cloning software that takes ethics seriously, such as Resemble AI. Although Resemble AI does not restrict you to cloning your own voice, it does require you to obtain consent from anyone whose voice you are cloning.