API Documentation
ClonyVoice provides a local REST API that lets you integrate text-to-speech, voice cloning, and audio processing into your own applications. The API runs on your machine whenever ClonyVoice is open.
Base URL
The API is available locally at:
Authentication
All API requests require an API key passed via the X-API-Key HTTP header. Create keys in the ClonyVoice desktop app under the API tab.
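As a minimal sketch of the header requirement, the snippet below builds an authenticated request with Python's standard library. The base URL, port, and key value are placeholders and assumptions, not values taken from this page; substitute the address and key shown in your ClonyVoice app. `build_request` is a hypothetical helper name.

```python
import urllib.request


def build_request(base_url: str, api_key: str, path: str,
                  method: str = "GET") -> urllib.request.Request:
    """Build a request that carries the API key in the X-API-Key header."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers={"X-API-Key": api_key}, method=method)


# Placeholder values (assumptions): replace with your real base URL and key.
req = build_request("http://127.0.0.1:8000", "YOUR_API_KEY", "/api/voices")
print(req.get_method(), req.full_url)

# To actually send it while ClonyVoice is running:
#   with urllib.request.urlopen(req) as resp:
#       print(resp.status, resp.read())
```

The request is only constructed here, not sent, so the snippet runs even when the app is closed.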
Endpoints
Voices
| Method | Path | Description | Scope |
|---|---|---|---|
| GET | /api/voices | List all voices | voices:read |
| GET | /api/voices/{id} | Get voice details | voices:read |
| PUT | /api/voices/{id} | Update voice metadata | voices:write |
| DELETE | /api/voices/{id} | Delete voice | voices:write |
| GET | /api/categories | List categories | voices:read |
| POST | /api/categories | Create a category | voices:write |
| DELETE | /api/categories | Delete a category | voices:write |
| GET | /api/favorites | Get favorite voices | voices:read |
| POST | /api/favorites/{id} | Toggle favorite | voices:write |
| POST | /api/voices/export | Export voices (.clonyvoice) | voices:read |
| POST | /api/voices/import/preview | Preview import file | voices:write |
| POST | /api/voices/import | Import voices | voices:write |
Text-to-Speech (TTS)
| Method | Path | Description | Scope |
|---|---|---|---|
| POST | /api/generate/clone | Generate speech (cloned voice) | tts:generate |
| POST | /api/generate/preset | Generate speech (preset voice) | tts:generate |
| POST | /api/generate/clone-chunked | Chunked generation (per sentence) | tts:generate |
| POST | /api/generate/clone-chunked-multivoice | Multi-voice chunked generation | tts:generate |
| POST | /api/generate/clone-chunked-multilang | Multi-language chunked generation | tts:generate |
| POST | /api/generate/regenerate-chunk | Regenerate a single chunk | tts:generate |
| POST | /api/generate/merge-chunks | Merge chunks into final audio | tts:generate |
| POST | /api/generation/cancel | Cancel ongoing generation | tts:generate |
| POST | /api/text/split | Split text into sentences | audio:process |
Voice Cloning & Design
| Method | Path | Description | Scope |
|---|---|---|---|
| POST | /api/clone/create-clips | Create voice clone (multi-clip with regions) | clone:create |
| POST | /api/clone/create | Create voice clone (single sample) | clone:create |
| POST | /api/clone/create-multi | Create voice clone (multi-sample) | clone:create |
| POST | /api/clone/cancel | Cancel clone creation | clone:create |
| POST | /api/design/create | Create voice from description | clone:create |
| POST | /api/design/cancel | Cancel design creation | clone:create |
| POST | /api/voices/{id}/generate-preview | Generate voice preview | clone:create |
Audio Processing
| Method | Path | Description | Scope |
|---|---|---|---|
| POST | /api/transcribe | Transcribe audio (Whisper) | audio:process |
| POST | /api/translate | Translate text between languages | audio:process |
| POST | /api/audio-duration | Get audio file duration | audio:process |
| GET | /api/audio/chunk/{task_id}/{index} | Retrieve generated audio chunk | tts:generate |
| POST | /api/download-video | Download audio from video URL | audio:process |
Timeline
| Method | Path | Description | Scope |
|---|---|---|---|
| GET | /api/timeline/{task_id} | Get timeline layout | tts:timeline |
| POST | /api/timeline/{task_id}/layout | Save block positions | tts:timeline |
| POST | /api/timeline/{task_id}/import-track | Import external audio track | tts:timeline |
| POST | /api/timeline/{task_id}/import-video | Import video/image for timeline | tts:timeline |
| POST | /api/generate/merge-timeline | Merge timeline into single file | tts:timeline |
Generations
| Method | Path | Description | Scope |
|---|---|---|---|
| GET | /api/generations | List generation history | voices:read |
| GET | /api/generations/{id} | Get generation details | voices:read |
| DELETE | /api/generations/{id} | Delete a generation | voices:write |
Montages
| Method | Path | Description | Scope |
|---|---|---|---|
| GET | /api/montages | List all montages | voices:read |
| POST | /api/montages | Create a new montage | voices:write |
| GET | /api/montages/{id} | Get montage with generations | voices:read |
| DELETE | /api/montages/{id} | Delete a montage | voices:write |
System
| Method | Path | Description | Scope |
|---|---|---|---|
| GET | /api/system/stats | CPU, RAM, GPU stats | system:read |
| GET | /api/system/status | Detailed system status | system:read |
| GET | /api/system/gpu | Detailed GPU information | system:read |
| GET | /api/system/info | Hardware info | system:read |
| GET | /api/queue/status | GPU queue status | system:read |
| POST | /api/job/cancel | Cancel a queued job | system:read |
| POST | /api/api-keys | Create API key | system:read |
| GET | /api/api-keys | List API keys | system:read |
| DELETE | /api/api-keys/{key_id} | Delete an API key | system:read |
| POST | /api/api-keys/{key_id}/revoke | Revoke an API key | system:read |
WebSocket
| Method | Path | Description | Scope |
|---|---|---|---|
| WS | /ws | Real-time progress updates | ws:connect |
Code Examples
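Many paths in the endpoint tables contain placeholders such as `{id}` or `{task_id}`. The following is a small sketch of one way to fill them safely; `build_path` is a hypothetical helper, and the sample values are made up for illustration.

```python
from urllib.parse import quote


def build_path(template: str, **params: object) -> str:
    """Fill {name} placeholders in an endpoint path, URL-encoding each value."""
    encoded = {name: quote(str(value), safe="") for name, value in params.items()}
    # Raises KeyError if the template needs a placeholder that was not supplied.
    return template.format(**encoded)


print(build_path("/api/voices/{id}", id="my voice"))
print(build_path("/api/audio/chunk/{task_id}/{index}", task_id="abc123", index=0))
```

Encoding each value with `safe=""` keeps characters such as `/` or spaces inside an identifier from changing the shape of the path.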
Rate Limiting
Each API key has its own per-scope rate limits, measured in requests per minute. When a limit is exceeded, the API returns HTTP 429. Limits can be changed per key; the defaults are listed under Scopes.
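One way a client might react to HTTP 429 is to retry with exponential backoff. This page does not say whether the API sends a retry hint, so the sketch below assumes none and simply waits longer after each rejected attempt. `call_with_retry` and the `send` callable are hypothetical names; `send` stands in for whatever function performs your request and returns a `(status, body)` pair.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, bytes]


def call_with_retry(send: Callable[[], Response],
                    max_attempts: int = 4,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns something other than 429 or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** attempt)
    return status, body


# Demo with a fake sender: two rate-limited replies, then success.
replies = iter([(429, b""), (429, b""), (200, b"ok")])
waits = []
result = call_with_retry(lambda: next(replies), sleep=waits.append)
print(result, waits)
```

Passing `sleep` as a parameter lets the demo record the delays instead of actually waiting; in real use the default `time.sleep` applies.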
Scopes
Each endpoint requires a specific scope. Keys are granted all scopes by default.
| Scope | Description | Default limit/min |
|---|---|---|
| tts:generate | Generate speech (TTS) | 10 |
| tts:timeline | Timeline editor access | 30 |
| voices:read | Read voices & categories | 60 |
| voices:write | Modify / delete voices | 20 |
| clone:create | Create voice clones | 2 |
| audio:process | Process audio files | 5 |
| system:read | Read system info | 60 |
| ws:connect | WebSocket connection | 5 |
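The default limits above translate directly into a minimum spacing between requests if a client wants to stay under the limit by pacing itself evenly. This is a rough sketch: how the server measures the one-minute window is not described here, so even spacing is a conservative assumption, and `min_interval_seconds` is a hypothetical helper.

```python
# Default per-scope limits (requests per minute), copied from the table above.
DEFAULT_LIMITS_PER_MINUTE = {
    "tts:generate": 10,
    "tts:timeline": 30,
    "voices:read": 60,
    "voices:write": 20,
    "clone:create": 2,
    "audio:process": 5,
    "system:read": 60,
    "ws:connect": 5,
}


def min_interval_seconds(scope: str) -> float:
    """Seconds to wait between requests so the default limit is never exceeded."""
    return 60.0 / DEFAULT_LIMITS_PER_MINUTE[scope]


print(min_interval_seconds("tts:generate"))  # 6.0
print(min_interval_seconds("clone:create"))  # 30.0
```

If a key's limits have been changed from the defaults, the dictionary would need to be updated to match.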