API Documentation
ClonyVoice provides a local REST API for integrating text-to-speech, voice cloning, and audio processing into your own applications. The API runs on your machine whenever ClonyVoice is open.
Base URL
The API is available locally at:
Authentication
Every API request requires an API key sent in the X-API-Key HTTP header. Create keys in the API tab of the ClonyVoice desktop app.
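As a minimal sketch of the header-based authentication described above, the Python snippet below builds (but does not send) a request that carries the key. The base URL and key are placeholders, not real values: substitute the local address of your ClonyVoice installation and a key created in the API tab. The helper name `build_request` is illustrative and not part of ClonyVoice.

```python
import urllib.request


def build_request(base_url: str, path: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build an HTTP request that carries the API key in the X-API-Key header."""
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, method=method, headers={"X-API-Key": api_key})


# Placeholder values (assumptions): replace with your local address and key.
request = build_request("http://localhost:9999", "/api/voices", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Note that `urllib` stores the header name as `X-api-key`. HTTP header names are case-insensitive, so this is equivalent on the wire.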
Endpoints
Voices
| Method | Path | Description | Scope |
|---|---|---|---|
| GET | /api/voices | List all voices | voices:read |
| GET | /api/voices/{id} | Get voice details | voices:read |
| PUT | /api/voices/{id} | Update voice metadata | voices:write |
| DELETE | /api/voices/{id} | Delete voice | voices:write |
| GET | /api/categories | List categories | voices:read |
| POST | /api/categories | Create a category | voices:write |
| DELETE | /api/categories | Delete a category | voices:write |
| GET | /api/favorites | Get favorite voices | voices:read |
| POST | /api/favorites/{id} | Toggle favorite | voices:write |
| POST | /api/voices/export | Export voices (.clonyvoice) | voices:read |
| POST | /api/voices/import/preview | Preview import file | voices:write |
| POST | /api/voices/import | Import voices | voices:write |
Text-to-Speech (TTS)
| Method | Path | Description | Scope |
|---|---|---|---|
| POST | /api/generate/clone | Generate speech (cloned voice) | tts:generate |
| POST | /api/generate/preset | Generate speech (preset voice) | tts:generate |
| POST | /api/generate/clone-chunked | Chunked generation (per sentence) | tts:generate |
| POST | /api/generate/clone-chunked-multivoice | Multi-voice chunked generation | tts:generate |
| POST | /api/generate/clone-chunked-multilang | Multi-language chunked generation | tts:generate |
| POST | /api/generate/regenerate-chunk | Regenerate a single chunk | tts:generate |
| POST | /api/generate/merge-chunks | Merge chunks into final audio | tts:generate |
| POST | /api/generation/cancel | Cancel ongoing generation | tts:generate |
| POST | /api/text/split | Split text into sentences | audio:process |
Voice Cloning & Design
| Method | Path | Description | Scope |
|---|---|---|---|
| POST | /api/clone/create-clips | Create voice clone (multi-clip with regions) | clone:create |
| POST | /api/clone/create | Create voice clone (single sample) | clone:create |
| POST | /api/clone/create-multi | Create voice clone (multi-sample) | clone:create |
| POST | /api/clone/cancel | Cancel clone creation | clone:create |
| POST | /api/design/create | Create voice from description | clone:create |
| POST | /api/design/cancel | Cancel design creation | clone:create |
| POST | /api/voices/{id}/generate-preview | Generate voice preview | clone:create |
Audio Processing
| Method | Path | Description | Scope |
|---|---|---|---|
| POST | /api/transcribe | Transcribe audio (Whisper) | audio:process |
| POST | /api/translate | Translate text between languages | audio:process |
| POST | /api/audio-duration | Get audio file duration | audio:process |
| GET | /api/audio/chunk/{task_id}/{index} | Retrieve generated audio chunk | tts:generate |
| POST | /api/download-video | Download audio from video URL | audio:process |
Timeline
| Method | Path | Description | Scope |
|---|---|---|---|
| GET | /api/timeline/{task_id} | Get timeline layout | tts:timeline |
| POST | /api/timeline/{task_id}/layout | Save block positions | tts:timeline |
| POST | /api/timeline/{task_id}/import-track | Import external audio track | tts:timeline |
| POST | /api/timeline/{task_id}/import-video | Import video/image for timeline | tts:timeline |
| POST | /api/generate/merge-timeline | Merge timeline into single file | tts:timeline |
Generations
| Method | Path | Description | Scope |
|---|---|---|---|
| GET | /api/generations | List generation history | voices:read |
| GET | /api/generations/{id} | Get generation details | voices:read |
| DELETE | /api/generations/{id} | Delete a generation | voices:write |
Montages
| Method | Path | Description | Scope |
|---|---|---|---|
| GET | /api/montages | List all montages | voices:read |
| POST | /api/montages | Create a new montage | voices:write |
| GET | /api/montages/{id} | Get montage with generations | voices:read |
| DELETE | /api/montages/{id} | Delete a montage | voices:write |
System
| Method | Path | Description | Scope |
|---|---|---|---|
| GET | /api/system/stats | CPU, RAM, GPU stats | system:read |
| GET | /api/system/status | Detailed system status | system:read |
| GET | /api/system/gpu | Detailed GPU information | system:read |
| GET | /api/system/info | Hardware info | system:read |
| GET | /api/queue/status | GPU queue status | system:read |
| POST | /api/job/cancel | Cancel a queued job | system:read |
| POST | /api/api-keys | Create API key | system:read |
| GET | /api/api-keys | List API keys | system:read |
| DELETE | /api/api-keys/{key_id} | Delete an API key | system:read |
| POST | /api/api-keys/{key_id}/revoke | Revoke an API key | system:read |
WebSocket
| Method | Path | Description | Scope |
|---|---|---|---|
| WS | /ws | Real-time progress updates | ws:connect |
Code Examples
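The following is a hedged sketch of a small Python client for the endpoints listed in the tables above, using only the standard library. It is not an official SDK. The class name `ClonyVoiceClient`, the base URL, the key, and the `VOICE_ID` value are all placeholders, and the sketch assumes that request and response bodies are JSON, which this page does not state.

```python
import json
import urllib.request
from typing import Any, Optional


class ClonyVoiceClient:
    """Illustrative client; names and JSON handling are assumptions."""

    def __init__(self, base_url: str, api_key: str) -> None:
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key

    def build(self, method: str, path: str, body: Optional[dict] = None) -> urllib.request.Request:
        """Prepare a request with the X-API-Key header and an optional JSON body."""
        headers = {"X-API-Key": self.api_key}
        data = None
        if body is not None:
            data = json.dumps(body).encode("utf-8")
            headers["Content-Type"] = "application/json"
        return urllib.request.Request(self.base_url + path, data=data, method=method, headers=headers)

    def call(self, method: str, path: str, body: Optional[dict] = None) -> Any:
        """Send the request and decode the response, assuming it is JSON."""
        with urllib.request.urlopen(self.build(method, path, body), timeout=30) as response:
            return json.loads(response.read().decode("utf-8"))


# Placeholder values (assumptions): replace with your local address and key.
client = ClonyVoiceClient("http://localhost:9999", "YOUR_API_KEY")
voice_id = "VOICE_ID"  # hypothetical identifier

list_voices = client.build("GET", "/api/voices")
get_voice = client.build("GET", f"/api/voices/{voice_id}")
toggle_favorite = client.build("POST", f"/api/favorites/{voice_id}")

# With ClonyVoice running, send a request like this:
# voices = client.call("GET", "/api/voices")
```

Separating `build` from `call` keeps request construction easy to inspect before anything is sent to the local server.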
Rate Limiting
Each API key has per-scope rate limits, measured in requests per minute. When a limit is exceeded, the API returns HTTP 429. The default limits can be configured per key.
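One way a client might react to HTTP 429 is to wait and retry. The sketch below wraps any zero-argument callable that raises `urllib.error.HTTPError` and retries only on status 429 with exponential backoff. The backoff schedule is an assumption: this page does not say whether the API returns a `Retry-After` header or how long a client should wait. The function name `call_with_backoff` is illustrative.

```python
import time
import urllib.error
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    send: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(); on HTTP 429, wait base_delay * 2**attempt seconds and retry."""
    for attempt in range(max_attempts):
        try:
            return send()
        except urllib.error.HTTPError as error:
            # Retry only rate-limit responses; give up on the last attempt.
            if error.code != 429 or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")


# Demonstration with a simulated sender that is rate-limited twice.
attempts = {"count": 0}
recorded_delays = []


def fake_send() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise urllib.error.HTTPError("http://localhost", 429, "Too Many Requests", None, None)
    return "ok"


# recorded_delays.append stands in for time.sleep so the demo does not wait.
result = call_with_backoff(fake_send, sleep=recorded_delays.append)
print(result, recorded_delays)
```

In real use, `send` would be a function that performs one API request, for example a lambda around `urllib.request.urlopen`.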
Scopes
Each endpoint requires a specific scope. Keys are granted all scopes by default.
| Scope | Description | Default limit/min |
|---|---|---|
| tts:generate | Generate speech (TTS) | 10 |
| tts:timeline | Timeline editor access | 30 |
| voices:read | Read voices & categories | 60 |
| voices:write | Modify / delete voices | 20 |
| clone:create | Create voice clones | 2 |
| audio:process | Process audio files | 5 |
| system:read | Read system info | 60 |
| ws:connect | WebSocket connection | 5 |
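To make the default limits concrete, the sketch below copies them from the table above and converts each into a minimum spacing between requests. This is a client-side pacing heuristic only: how the server counts its one-minute window is not documented here, and the limits for a given key may have been changed from these defaults. The names `DEFAULT_LIMITS_PER_MINUTE` and `min_interval_seconds` are illustrative.

```python
# Default per-scope limits (requests per minute), copied from the table above.
DEFAULT_LIMITS_PER_MINUTE = {
    "tts:generate": 10,
    "tts:timeline": 30,
    "voices:read": 60,
    "voices:write": 20,
    "clone:create": 2,
    "audio:process": 5,
    "system:read": 60,
    "ws:connect": 5,
}


def min_interval_seconds(scope: str) -> float:
    """Seconds to wait between requests to spread them evenly over one minute."""
    return 60.0 / DEFAULT_LIMITS_PER_MINUTE[scope]


print(min_interval_seconds("tts:generate"))  # 6.0
print(min_interval_seconds("clone:create"))  # 30.0
```

For example, with the default limits a batch job calling endpoints under `tts:generate` would space its requests about six seconds apart, while voice clone creation would be spaced about thirty seconds apart.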