Frequently Asked Questions

Everything you need to know about ClonyVoice.

General

ClonyVoice is a desktop application for Windows that lets you clone voices, design new ones from text descriptions, and generate speech in 10 languages. It runs 100% locally on your computer: audio processing needs no internet connection, and no data is sent to the cloud.

You provide a short audio sample (as little as 3 seconds) and ClonyVoice analyzes the voice characteristics — tone, pitch, accent, and timbre. The AI then creates a voice model that can speak any text in that voice. Two modes are available: Quick (instant results) and Precise (studio-grade quality with transcription).

ClonyVoice works offline. After installation, voice cloning and speech generation run 100% locally: no internet is needed for audio processing, and your data never leaves your computer. A brief internet connection is required periodically, for license validation only.

10 languages are built in: English, French, German, Spanish, Italian, Portuguese, Russian, Japanese, Korean, and Chinese. The cloned voice can speak in any of these languages, regardless of the original sample language.

Your data stays private. Everything runs locally on your machine, and no audio samples, voice models, or generated speech are ever uploaded to any server.

Voice Cloning

In Quick mode, cloning is nearly instant (a few seconds). In Precise mode, it takes 30-60 seconds depending on the sample length and your hardware. The result is a reusable voice model you can use unlimited times.

Clear speech without background noise gives the best results. A quiet room recording with a decent microphone is ideal. The sample should contain only the voice you want to clone — avoid music or multiple speakers.

You can combine up to 5 audio samples for a single voice clone. More samples generally mean higher fidelity, as the AI has more data from which to learn the voice characteristics.

Quick mode creates a voice clone instantly from the raw audio. Precise mode uses transcription to align the audio with text, producing a more accurate and natural-sounding clone. Use Quick for fast previews, Precise for production-quality results.

Cloned voices work across languages. A voice cloned from an English sample can speak French, German, Japanese, or any other of the 10 supported languages. The AI preserves the voice characteristics while adapting to the target language.

Features

Voice Design lets you create entirely new voices by describing them in text. For example: "A warm, deep male voice with a slight British accent." The AI generates a unique voice matching your description — no audio sample needed.

The Voice Store is a community marketplace where users can browse, download, and share voice models. You can find pre-made voices for various use cases or share your own creations with the community.

ClonyVoice supports emotional presets including: Happy, Sad, Angry, Fearful, Disgusted, Surprised, and Whisper. You can apply these to any voice to add expressiveness to your generated speech.

You can export your voice models to share or back up, and import models from other users. Compatible formats include standard voice model files.

Technical

ClonyVoice requires Windows 10 or 11 (64-bit) with at least 16 GB of RAM and approximately 5 GB of storage for the application. For best performance, an NVIDIA GPU with CUDA support is recommended (RTX series or GTX 1060+). CPU-only mode is available but slower.

An NVIDIA GPU is not required, but it's recommended. With an NVIDIA GPU (CUDA), voice generation is 5-10x faster. CPU-only mode works on any modern Intel or AMD processor but takes longer per generation. AMD and Intel GPUs are not supported for acceleration.
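
If you are unsure whether your machine has an NVIDIA GPU that the driver can see, a small script can check for you. The sketch below is not part of ClonyVoice; it assumes Python is installed and relies on the nvidia-smi tool that ships with the NVIDIA driver. The function name nvidia_gpu_names is hypothetical, chosen for illustration. An empty result only means nvidia-smi could not be run, so treat it as a hint rather than a definitive answer.

```python
import shutil
import subprocess


def nvidia_gpu_names(smi_path=None):
    """Return the GPU names reported by nvidia-smi, or [] if it cannot be run.

    smi_path is an optional explicit path to nvidia-smi; by default the
    tool is looked up on PATH.
    """
    exe = smi_path or shutil.which("nvidia-smi")
    if exe is None:
        # No NVIDIA driver tooling found on PATH.
        return []
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Tool missing, not executable, timed out, or returned an error.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = nvidia_gpu_names()
    if gpus:
        print("NVIDIA GPU(s) detected:", ", ".join(gpus))
    else:
        print("No NVIDIA GPU detected; expect CPU-only speeds.")
```

If the script lists an RTX-series card or a GTX 1060 or newer, your machine matches the recommended GPU hardware described above.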

After purchasing, you receive a license key and a download link. Run the installer, enter your license key on first launch, and you're ready to go. The installer handles all dependencies automatically.

ClonyVoice checks for updates on launch (when internet is available). Updates are free forever with your lifetime license. You can also download the latest version from your account dashboard.

Licensing & Commercial Use

Commercial use is included with your license. You own full rights to any audio you generate. Just make sure you have permission for any real person's voice you clone.

Each license allows activation on up to 2 machines simultaneously. You can deactivate a machine from your account dashboard to free up a slot.

Your lifetime license includes all future updates at no extra cost. New features, new languages, and improvements are all included.

We offer a 14-day money-back guarantee on all purchases. If you are not satisfied for any reason, simply log in to your account and contact our support team for a full refund. No questions asked.

Troubleshooting

If a cloned voice does not sound right, try using Precise mode with a longer, cleaner audio sample (10-30 seconds). Make sure the original sample has minimal background noise. Using multiple samples also improves quality significantly.

If you're using CPU-only mode, generation is naturally slower. For faster results, use an NVIDIA GPU with CUDA support. Also, close other resource-intensive applications while generating audio.

If the application does not start or run correctly, make sure you meet the minimum system requirements (Windows 10/11, 16 GB RAM). Try running the application as administrator. If the issue persists, reinstall the application or contact support.

Still have questions?

Contact our team and we'll get back to you as soon as possible.
