Documentation

Learn how to use every feature of ClonyVoice.

Getting Started

Installation

To install ClonyVoice, download the installer using the link provided after purchase and run it. The installer automatically sets up all required dependencies, including the AI models.

  1. Download the installer from your account or the email you received after purchase
  2. Run the .exe installer and follow the setup wizard
  3. Wait for the installation to complete (this may take a few minutes as AI models are set up)
  4. Launch ClonyVoice from your desktop or Start menu

License Activation

On first launch, you'll be asked to enter your license key. You can find it in your account dashboard or in the confirmation email. Enter the key and click Activate. Your license supports up to 2 machines simultaneously.

Interface Overview

The main interface is divided into several areas: the voice selector on the left, the text input area in the center, and the controls panel on the right. The top bar provides access to settings, voice management, and the Voice Store.

Voice Cloning

How Voice Cloning Works

Voice cloning creates a digital model of a real voice from audio samples. Once created, the voice model can speak any text in any of the 10 supported languages.

Quick Mode

Quick mode creates an instant voice clone from your audio sample. It is well suited to testing and previewing voices.

  1. Click "Clone Voice" in the main interface
  2. Select "Quick Mode"
  3. Upload or record an audio sample (3-60 seconds)
  4. Give your voice a name and click "Clone"
  5. Your cloned voice appears in the voice selector, ready to use

Precise Mode

Precise mode transcribes the audio sample and aligns the audio with its text, producing a higher-quality voice clone. It is recommended for production use.

  1. Click "Clone Voice" and select "Precise Mode"
  2. Upload or record an audio sample (10-60 seconds recommended)
  3. The transcription is generated automatically; review it and correct any errors
  4. Click "Clone" and wait for processing (30-60 seconds)
  5. Your high-fidelity voice model is ready

Multi-Sample Cloning

For the best voice fidelity, combine up to 5 different audio samples of the same voice. The AI will learn from all samples to create a more accurate voice model.

Tips for Best Results

  • Use a quiet environment with minimal background noise
  • Record at a consistent volume and distance from the microphone
  • Include varied intonation — don't read in a monotone
  • Longer samples (10-30 seconds) give better results than very short ones
  • Avoid samples with music, other speakers, or sound effects
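
The duration ranges given for Quick mode (3-60 seconds) and Precise mode (10-60 seconds) can be checked before you upload anything. The following is a minimal sketch, not part of ClonyVoice: it assumes your samples are uncompressed WAV files and reads them with Python's standard wave module. The names LIMITS, MAX_SAMPLES, wav_duration, and check_samples are hypothetical helpers introduced for illustration, and applying the 5-sample limit to both modes is an assumption.

```python
import wave

# Duration ranges in seconds, copied from the Quick Mode and Precise Mode steps.
# This table is a hypothetical helper, not a ClonyVoice setting.
LIMITS = {"quick": (3.0, 60.0), "precise": (10.0, 60.0)}
MAX_SAMPLES = 5  # multi-sample cloning combines up to 5 samples


def wav_duration(path):
    """Return the length of a WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def check_samples(paths, mode="precise"):
    """Return a list of readable problems; an empty list means all checks passed."""
    low, high = LIMITS[mode]
    problems = []
    if len(paths) > MAX_SAMPLES:
        problems.append(f"{len(paths)} samples given, at most {MAX_SAMPLES} are supported")
    for path in paths:
        seconds = wav_duration(path)
        if not low <= seconds <= high:
            problems.append(f"{path}: {seconds:.1f} s is outside {low:.0f}-{high:.0f} s")
    return problems
```

Calling check_samples(["take1.wav", "take2.wav"], mode="precise") on your own files returns an empty list when every sample is within range. The check covers length and count only; it cannot detect background noise, music, or additional speakers.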

Voice Design

Creating Voices from Text

Voice Design lets you create entirely new voices by describing them in natural language. No audio sample is needed.

  1. Click "Design Voice" in the main interface
  2. Enter a description of the voice you want (e.g., "A warm, deep male voice with a calm tone")
  3. Click "Generate" to create the voice
  4. Preview the generated voice and regenerate if needed
  5. Save the voice to your library when satisfied

Description Tips

  • Describe age, gender, pitch, and tone
  • Mention accents or speaking styles if desired
  • Be specific: "energetic young woman" works better than "nice voice"
  • Generate multiple variations and pick your favorite

Text-to-Speech

Generating Speech

Once you have a voice (cloned, designed, or from the Voice Store), you can generate speech from any text.

  1. Select a voice from the voice selector
  2. Type or paste your text in the input area
  3. Choose the output language
  4. Optionally select an emotion preset
  5. Click "Generate" and listen to the result
  6. Save the audio file (WAV format) to your computer

Emotion Presets

Apply emotion presets to make the generated speech more expressive. Available emotions: Happy, Sad, Angry, Fearful, Disgusted, Surprised, Whisper. Each preset adjusts the pitch, speed, and intonation of the voice.

Multi-language Output

Any voice can speak in any of the 10 supported languages. Simply select the target language before generating. The voice characteristics are preserved while adapting pronunciation to the target language.

Voice Store

Browsing the Voice Store

The Voice Store is an online marketplace of voice models shared by the community. Browse, preview, and download voices for your projects.

  1. Click "Voice Store" in the top navigation bar
  2. Browse voices by category or use the search function
  3. Preview any voice by clicking the play button
  4. Click "Download" to add it to your local voice library

Sharing Your Voices

Share your voice creations with the community. Export a voice model and upload it to the Voice Store. Other users can then download and use your voice.

Import & Export

Exporting Voice Models

Export your voice models for backup or sharing.

  1. Right-click a voice in the voice selector
  2. Select "Export Voice"
  3. Choose a save location and click "Export"

Importing Voice Models

Import voice models from files, including models exported by other users.

  1. Click "Import Voice" in the voice management menu
  2. Select the voice model file
  3. The voice appears in your voice selector, ready to use

Settings

Application Settings

Access settings from the gear icon in the top bar. Available settings include:

  • GPU/CPU mode: Choose between NVIDIA GPU acceleration or CPU-only processing
  • Output format: Configure the default audio output settings
  • Interface language: Change the application language
  • Model management: Download or update AI models

Troubleshooting

Slow generation

If generation is slow, make sure you're using GPU mode (requires an NVIDIA GPU with CUDA support). Close other GPU-intensive applications. In CPU-only mode, generation is inherently slower.
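
Because GPU mode requires an NVIDIA GPU, it can help to confirm that the NVIDIA driver tools can see one. The sketch below is independent of ClonyVoice and assumes the nvidia-smi utility that ships with the NVIDIA driver is on your PATH; list_nvidia_gpus is a hypothetical helper name.

```python
import shutil
import subprocess


def list_nvidia_gpus():
    """Return GPU names reported by nvidia-smi, or [] if none can be found."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []  # driver tools are not installed or not on PATH
    try:
        result = subprocess.run(
            [exe, "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True, text=True, timeout=10, check=True,
        )
    except (subprocess.SubprocessError, OSError):
        return []  # nvidia-smi failed, timed out, or could not be started
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


if __name__ == "__main__":
    gpus = list_nvidia_gpus()
    print("NVIDIA GPUs:", ", ".join(gpus) if gpus else "none detected")
```

An empty result only means nvidia-smi was not found or reported nothing; it does not show which mode ClonyVoice is currently set to, so still check the GPU/CPU mode option in the application settings.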

Poor voice quality

Use Precise mode with clean audio samples (10-30 seconds). Minimize background noise. Multiple samples improve fidelity. Avoid samples with music or multiple speakers.

Application crashes or won't start

Verify that your system meets the requirements (Windows 10/11, 16 GB RAM). Try running the application as administrator. Ensure your antivirus isn't blocking it. If the issue persists, reinstall or contact support.
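
The requirements quoted above (Windows 10/11, 16 GB RAM) can be checked with a short script. This is a rough sketch rather than an official diagnostic: it reads total memory through the Win32 GlobalMemoryStatusEx call via ctypes, and the helper names, the MIN_RAM_GB constant, and the 1 GB tolerance (Windows typically reports slightly less than the nominal installed amount) are assumptions made for illustration.

```python
import ctypes
import platform

MIN_RAM_GB = 16  # from the system requirements quoted above


class _MemoryStatusEx(ctypes.Structure):
    """Layout of the Win32 MEMORYSTATUSEX structure."""
    _fields_ = [
        ("dwLength", ctypes.c_ulong),
        ("dwMemoryLoad", ctypes.c_ulong),
        ("ullTotalPhys", ctypes.c_ulonglong),
        ("ullAvailPhys", ctypes.c_ulonglong),
        ("ullTotalPageFile", ctypes.c_ulonglong),
        ("ullAvailPageFile", ctypes.c_ulonglong),
        ("ullTotalVirtual", ctypes.c_ulonglong),
        ("ullAvailVirtual", ctypes.c_ulonglong),
        ("ullAvailExtendedVirtual", ctypes.c_ulonglong),
    ]


def total_ram_gb():
    """Return physical memory in GiB, or None when it cannot be read."""
    if platform.system() != "Windows":
        return None  # this sketch only knows the Windows API
    status = _MemoryStatusEx()
    status.dwLength = ctypes.sizeof(_MemoryStatusEx)
    if not ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(status)):
        return None
    return status.ullTotalPhys / 1024**3


def requirement_problems(system, release, ram_gb):
    """Compare the given values with the documented requirements."""
    problems = []
    if system != "Windows" or release not in ("10", "11"):
        problems.append(f"unsupported OS: {system} {release}")
    if ram_gb is None:
        problems.append("installed RAM could not be determined")
    elif ram_gb + 1 < MIN_RAM_GB:  # 1 GiB tolerance, an assumption
        problems.append(f"only {ram_gb:.1f} GB RAM, {MIN_RAM_GB} GB required")
    return problems


if __name__ == "__main__":
    found = requirement_problems(platform.system(), platform.release(), total_ram_gb())
    print("\n".join(found) if found else "System meets the documented requirements")
```

Some Python versions report Windows 11 as release "10", which is why both values are accepted. Passing the check does not rule out the other causes listed above, such as antivirus interference.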

License activation issues

Make sure you're entering the correct license key from your account dashboard. An internet connection is required only for activation. If you've reached the machine limit, deactivate a machine from your dashboard.