Free AI voice generator on your PC: clone any voice with Voicebox

13 May 2026 17:37

Learn how to clone any voice on your PC in under a minute using Voicebox, a free and open-source AI voice generator. This step-by-step guide shows you how to create voice profiles, generate speech, build multi-speaker stories, and tweak advanced settings for the best results.

Want an AI voice that sounds just like you (or your narrator) running directly on your PC, with no subscriptions and no complicated setup? With a free, open-source app called Voicebox, you can clone a voice from a recording of 30 seconds or less and start generating natural-sounding speech in just a few clicks.

Below is a full walkthrough of how to install Voicebox, create voice profiles, generate speech, and use its more advanced features.

What Is Voicebox?

Voicebox is a free, open-source AI voice generator that runs locally on your PC. It uses text-to-speech and voice cloning models behind the scenes, including Qwen 3 TTS and Chatterbox, to turn short audio samples into realistic voice profiles.

Once you’ve created a voice profile, you can type any text and have it spoken in that cloned voice, export the audio, and even build multi-speaker stories and dialogues.

How to Install Voicebox on Your PC

Step 1: Download Voicebox

Go to the Voicebox website.
Click the download button.
Select your operating system (Windows, macOS, or Linux).

Step 2: Run the Installer

Open the installer you downloaded.
Follow the on-screen steps to complete the installation.

After installation, launch Voicebox. You’ll land on the main interface, and everything runs locally, with no cloud setup or online account required.

Create Your First Cloned Voice

Voicebox lets you create a voice profile from three types of input (this guide walks through the first two):

Recording your own voice
Uploading an audio file
Capturing system audio

Option 1: Record Your Own Voice

1. Open the Voices section

On the left-hand side of the app, click the voices icon.
In the center, click the large Create voice button.

2. Record a sample

Select the option to record your voice.
Press record and speak clearly into your microphone.
You can record up to 30 seconds. Aim for 20–30 seconds for best quality.

3. Transcribe the audio

When you’re done, click Transcribe.
On first use, Voicebox will download a transcription model.
After that, you’ll see a text transcript of what you said.

4. Name and create the voice profile

On the right-hand side, give your voice a name (for example, "Kevin").
Click Create profile to save it.

Option 2: Upload an Existing Audio File

If you already have a recording you want to clone, you can upload it instead of recording live.

Click Create voice again.
Choose the Upload audio option.
Click Choose file and select your audio file.
Click Transcribe to process the audio and generate text.
Give the voice a name (for example, "Narrator").
Click Create profile to add it to your voice list.

Now you’ll see multiple voices available at the top of the interface, and you can switch between them anytime.

Generate AI Speech from Any Text

Once your voice profile is ready, you can turn any text into speech in just a few seconds.

1. Select your voice

At the top of the text-to-speech area, choose the voice you want to use (for example, "Kevin" or "Narrator").

2. Enter your text

Type the sentence or paragraph you want spoken. For example: "The Kevin Cookie Company has the best cookies in the world."

3. Choose the AI model

Select a TTS model. The Qwen 3 TTS model is recommended because it’s a newer, high-quality option.

4. Generate and preview

Click Generate.
After a short wait, you can play back the audio to hear the result.

If you’re not happy with how it sounds, you can:

Regenerate the audio with the same text.
Slightly tweak your input text (for example, change punctuation or wording).

5. Export your audio

Click the three dots in the top-right of the generated clip.
Choose Export to save the audio file to your computer.

This makes it easy to drop your cloned voice into videos, podcasts, or music projects. If you’re also interested in singing with an AI version of your own voice, check out this guide to making AI songs in your own voice with Suno.

Build Multi-Speaker Stories and Dialogues

Voicebox isn’t just a simple text-to-speech tool. It also includes a "Stories" feature that lets you create multi-line scripts with different speakers, then export everything as one audio file.

Create a New Story

On the left-hand side, click the Stories icon.
At the top, click Add new story.
Give your story a name (for example, "The Cookie Question").
Click Create.

Add Lines and Assign Speakers

In the bottom-left area, you can now build your script line by line:

Type the first line of dialogue or narration (for example, "Who makes the best cookies?").
Choose the speaker for that line (for example, your "Kevin" voice).
Click Generate to create the audio for that line.

Then add the next line:

Type the response (for example, "Why, of course, it’s the Kevin Cookie Company.").
Select a different speaker (for example, your "Narrator" voice).
Click Generate again.

At the bottom, you can play back the entire conversation and hear how the voices interact.

Edit the Timeline and Export

Voicebox gives you a simple timeline view for your story:

Drag items to move them around and adjust the order.
Fine-tune the timing between lines.
Shift the position of the entire story if needed.

When you’re happy with it:

Go to the top-right corner and click Export.
Voicebox will combine all lines and speakers into one continuous audio file.
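
To make the idea of a combined export concrete, here is a minimal Python sketch that joins separately exported WAV clips into one file with a short pause between lines. It illustrates the general technique only and is not how Voicebox itself is implemented. The function name, the half-second default gap, and the requirement that every clip is 16-bit (or wider) PCM in the same format are assumptions made for this example.

```python
import wave


def join_wav_clips(clip_paths, output_path, gap_seconds=0.5):
    """Join WAV clips in order, inserting silence between them.

    Hypothetical helper for illustration. All clips must share the same
    channel count, sample width, and sample rate.
    Returns the duration of the written file in seconds.
    """
    if not clip_paths:
        raise ValueError("at least one clip is required")

    audio_format = None
    chunks = []
    for path in clip_paths:
        with wave.open(path, "rb") as clip:
            clip_format = (
                clip.getnchannels(),
                clip.getsampwidth(),
                clip.getframerate(),
            )
            if audio_format is None:
                audio_format = clip_format
            elif clip_format != audio_format:
                raise ValueError(f"{path} does not match the first clip's format")
            chunks.append(clip.readframes(clip.getnframes()))

    channels, sample_width, frame_rate = audio_format
    if sample_width < 2:
        # 8-bit WAV is unsigned and centred on 128, so zero bytes are not silence.
        raise ValueError("this sketch only handles 16-bit or wider PCM")

    # Zero-valued samples are silence for signed PCM audio.
    bytes_per_frame = channels * sample_width
    gap_frames = int(gap_seconds * frame_rate)
    silence = b"\x00" * (gap_frames * bytes_per_frame)
    joined = silence.join(chunks)

    with wave.open(output_path, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(sample_width)
        out.setframerate(frame_rate)
        out.writeframes(joined)

    return len(joined) / bytes_per_frame / frame_rate
```

Calling join_wav_clips(["line1.wav", "line2.wav"], "story.wav") with two clips you exported yourself (the file names here are placeholders) would write them back to back with half a second of silence in between.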

This is especially useful for skits, audiobooks, tutorials, or character-based content. If you’re building a fully local creative workflow, you might also like this guide on running a local AI video generator on your own PC.

Effects, Devices, and Advanced Settings

Voice Effects

Voicebox includes a set of audio effects you can apply to change the character of a voice:

Robotic – for a synthetic, machine-like tone.
Radio – to mimic a broadcast or walkie-talkie style.
Echo chamber – adds space and echo.
Deep voice – lowers the tone for a more dramatic sound.

You can also create your own custom effects if you want a more unique sound profile.

Audio Devices

In the settings area, you can choose which hardware Voicebox uses:

Select your preferred microphone for recording samples.
Pick which speakers or headphones you want for playback.

Models and Performance

Voicebox lets you see and manage which AI models are used for generation. You can also tweak performance settings:

Go to Advanced settings.
Under processing options, switch from CPU to GPU if you have a dedicated graphics card.

Using a GPU can significantly speed up generation times and make the whole experience smoother, especially for longer scripts or more complex stories.
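
If you're not sure whether your machine has a dedicated NVIDIA card, a short Python sketch like the one below can check. It assumes the NVIDIA driver's nvidia-smi tool is installed and on your PATH, it says nothing about AMD, Intel, or Apple GPUs, and the function name is hypothetical. Whether Voicebox can use a particular card is up to the app, so treat an empty result as inconclusive rather than as proof that only CPU mode is available.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return the names of NVIDIA GPUs reported by nvidia-smi, or [] if none are found."""
    executable = shutil.which("nvidia-smi")
    if executable is None:
        # The driver tools are not installed, or there is no NVIDIA card.
        return []
    try:
        result = subprocess.run(
            [executable, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
            timeout=10,
        )
    except (subprocess.SubprocessError, OSError):
        # nvidia-smi exists but failed or timed out; report nothing found.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = list_nvidia_gpus()
if gpus:
    print("NVIDIA GPU(s) found:", ", ".join(gpus))
else:
    print("No NVIDIA GPU detected by nvidia-smi.")
```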

Tips for the Best Voice Cloning Results

To get the most natural and accurate cloned voice, keep these best practices in mind:

Use clean audio: Record in a quiet environment with minimal background noise.
Speak clearly: Avoid mumbling and keep a steady pace.
Go longer when possible: Aim for 20–30 seconds of audio. Shorter clips can work, but longer samples usually produce better results.
Regenerate if needed: If the output sounds off, try generating again or slightly adjusting your text.
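
If your sample is an uncompressed WAV file, the length guideline above can be checked before you upload it. The sketch below uses only Python's standard wave module. The function names are illustrative, the 20 and 30 second thresholds are taken from the tips in this guide, and none of this is part of Voicebox itself.

```python
import wave

# Guideline from this article: aim for 20-30 seconds, with 30 seconds as the cap.
MIN_SECONDS = 20.0
MAX_SECONDS = 30.0


def wav_duration_seconds(path):
    """Return the length of an uncompressed PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def describe_sample(path):
    """Classify a sample as 'short', 'ok', or 'long' against the guideline."""
    duration = wav_duration_seconds(path)
    if duration < MIN_SECONDS:
        return "short"
    if duration > MAX_SECONDS:
        return "long"
    return "ok"
```

For example, describe_sample("my_voice.wav") (a placeholder file name) returns "ok" for a 25-second recording and "short" for a 10-second one. Other formats such as MP3 would need converting to WAV first, since the wave module cannot read them.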

Most importantly, always make sure you have permission to use and clone any voice you work with. Stick to your own voice or voices you’re explicitly allowed to use.

With Voicebox set up on your PC, you now have a powerful, free AI voice generator that runs locally, gives you full control, and lets you create everything from simple voiceovers to multi-character audio stories.

Tags: Voice Cloning, Local AI, Text to Speech

Comments

Amanda Taylor May 18, 2026

The multi-speaker story feature is genius. I created a short play with three characters, all cloned from different voices I recorded. The timeline editing is a bit basic but gets the job done. I'd love to see more advanced audio editing in future updates, like the ability to adjust pitch or speed per line.