Eleven Labs, a generative voice AI company that lets you clone voices (or create new ones) in 29 languages, has released its new "Dubbing Studio", which allows you to upload a video, detect the speakers, and then translate (dub) it into any of those 29 languages using each speaker's own voice.
In this article, I'll show you how to use it (it's surprisingly easy and cheap!)
Get an Account at Eleven Labs
The very first thing you'll want to do, if you don't already have one, is register for your free account at Eleven Labs. The free tier lets you play with all of their features but limits how much you can create. The interesting thing is that you can upgrade to a premium account for only $5 per month, which gives you enough credits to really use it professionally. You can get more details on their pricing structure (which may have changed since I published this article) here.
Let's Create a Dub!
After you log in, you will find "Dubbing" in the left-hand navigation (under Speech and Voices). Click on it.
This will refresh the Dubbing Studio screen (shown below).
Now let's create our first dub by following the instructions below.
1. Dubbing Project Name: Type in the name of your project (e.g. "My Interview with John").
2. Source Language: Select the language the video was recorded in. You can also let it auto-detect, but I recommend you just select it yourself.
3. Target Language: You can select as many languages as you want to dub into; you will get multiple videos, one for each language. Keep in mind, though, the cost of doing this (see #7 below).
4. Select Source: You can either upload your video to them or select links from YouTube, TikTok, Twitter, Vimeo, or other URLs. Keep in mind that, for now, they limit you to a 45-minute video and/or a 100 MB file size. If your video is larger or longer, you will have to break it up into multiple parts.
5. Advanced Settings: I cover these below.
6. Create Button: Click this when you are ready to submit the job for dubbing.
7. Cost: Eleven Labs will tell you how many characters the job will cost. For my 4-minute-25-second video, it cost 8,839 characters to dub into one language (Spanish). Your subscription gives you a certain number of characters to use per month; after that, you have to wait until the quota refills or purchase more.
8. Projects: These are the projects you have already submitted that are completed. From here you can view each video, download it, or remove it.
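To get a feel for the pricing, here's a quick back-of-envelope calculation based on the numbers from my own video above. The per-minute rate is just an estimate derived from that one dub, not an official figure, and I'm assuming the cost scales linearly with length and number of target languages:

```python
# Rough character-cost estimate, derived from my example above:
# a 4:25 video cost 8,839 characters for one target language.
VIDEO_MINUTES = 4 + 25 / 60   # 4 min 25 s
CHARS_USED = 8839

chars_per_minute = CHARS_USED / VIDEO_MINUTES
print(f"~{chars_per_minute:.0f} characters per minute of video")

# Hypothetical estimate for a 30-minute video dubbed into 3 languages
# (assumes cost scales linearly per minute and per target language).
estimate = chars_per_minute * 30 * 3
print(f"Estimated cost: ~{estimate:,.0f} characters")
```

That works out to roughly 2,000 characters per minute of video per target language, which is handy for deciding whether the $5 plan will cover your workload.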
Before you click on "Create" (#6), let's click on the "Advanced Settings" (#5) to set up some more options.
The advanced settings give you a bit more control over how the dubbing is done.
1. Number of Speakers: You can let it detect the number automatically, but if you tell it how many speakers there are, you may get a better dub.
2. Video Resolution: This is the resolution of the output video. I'd set it to the lowest, since in reality you will only be using the audio portion alongside your current video.
3. Extract Time Range: If you don't want to dub the entire video, you can give it a start and end time (hh:mm:ss).
4. Add Watermark: If you allow Eleven Labs to add a watermark to your video, you get a discount on the charge. That's fine by me, since I'll mainly be using the audio (as a secondary language track) on my main video.
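Since uploads are capped at 45 minutes and 100 MB (at the time of writing), it's worth checking a file before you submit it. Here's a small sketch: the limits are the ones quoted above, and `ffprobe` (part of FFmpeg) is just one convenient way to read a video's duration locally.

```python
import os
import subprocess

MAX_MINUTES = 45                # upload limits quoted by Eleven Labs
MAX_BYTES = 100 * 1024 * 1024   # 100 MB

def within_limits(duration_s: float, size_bytes: int) -> bool:
    """Return True if a video fits the dubbing upload limits."""
    return duration_s <= MAX_MINUTES * 60 and size_bytes <= MAX_BYTES

def check_file(path: str) -> bool:
    """Probe a local video with ffprobe (FFmpeg) and check it against the limits."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1", path],
        capture_output=True, text=True, check=True,
    )
    return within_limits(float(out.stdout.strip()), os.path.getsize(path))
```

If `check_file` returns False, split the video into parts (or trim it with the Extract Time Range option) before uploading.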
Now that you've made your selections in the Advanced Settings, it's time to click that "Create" button. The job will go into the queue, and you'll be told how long the dub will take.
At this point, you can go on to your other projects and check back later.
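If you'd rather script this than click through the UI, Eleven Labs also exposes a dubbing endpoint in their REST API. The sketch below is my best reading of it at the time of writing: the URL, form field names, and the `xi-api-key` header are assumptions that may have changed, so check their API documentation before relying on it.

```python
API_KEY = "your-api-key"  # from your Eleven Labs profile settings

def build_dub_request(name: str, source_lang: str, target_lang: str,
                      num_speakers: int = 0, watermark: bool = True) -> dict:
    """Assemble the form fields for a dubbing job (0 speakers = auto-detect)."""
    return {
        "name": name,
        "source_lang": source_lang,
        "target_lang": target_lang,
        "num_speakers": str(num_speakers),
        "watermark": str(watermark).lower(),  # watermarked dubs are discounted
    }

def submit_dub(video_path: str, fields: dict) -> str:
    """POST the video and settings; returns a job id you can poll later."""
    import requests  # third-party; imported here so the helper above works without it

    with open(video_path, "rb") as f:
        resp = requests.post(
            "https://api.elevenlabs.io/v1/dubbing",
            headers={"xi-api-key": API_KEY},
            data=fields,
            files={"file": f},
        )
    resp.raise_for_status()
    return resp.json()["dubbing_id"]

# Example (hypothetical file name): dub my interview from English to Spanish
# job_id = submit_dub("interview.mp4",
#                     build_dub_request("My Interview with John", "en", "es"))
```

This mirrors the UI options covered above: project name, source and target languages, speaker count, and the watermark discount.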
Here is a short sample of an interview I did (in English) with the sound designer of the Netflix movie Society of the Snow, and how Eleven Labs dubbed it into Spanish using our own voices.
For a version 1.0 release, they are offering a very powerful dubbing solution. Having said that, I'd love to see more updates, including:
- The ability to edit the dub (correct any mistakes by typing the actual word and regenerating it).
- Exporting a transcript.
- Support for longer videos.
- A larger file-size limit.
- Lip syncing (in the style of HeyGen, which I wrote about here).
What Does the Future Hold?
This technology brings up many questions, such as:
- Can we use it for ADR?
- How will it affect jobs?
- How do you track an artist's permission to allow this kind of automation?
- If you use it for ADR, how do you pay the artist for it?
I'm sure there are tons more questions that I didn't ask. All I know is that this is the future and we need to figure out a way to make it work for everyone.
Until the next article!