EachLabs Voice & Audio
Text-to-speech, speech-to-text transcription, voice conversion, and audio utilities via the EachLabs Predictions API.
Authentication
Header: X-API-Key: <your-api-key>
Set the EACHLABS_API_KEY environment variable. Get your key at eachlabs.ai.
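A script can check that the key is present before it makes any request. The following is a minimal sketch in POSIX sh; `require_api_key` and `auth_header` are hypothetical helper names chosen for illustration, not part of any EachLabs tooling.

```shell
# Hypothetical helper: fail early with a clear message when the key is missing.
require_api_key() {
  if [ -z "${EACHLABS_API_KEY:-}" ]; then
    echo "EACHLABS_API_KEY is not set" >&2
    return 1
  fi
}

# Hypothetical helper: print the header line that curl expects after -H.
auth_header() {
  require_api_key || return 1
  printf 'X-API-Key: %s\n' "$EACHLABS_API_KEY"
}

# Example usage (makes a real API call):
#   curl -H "$(auth_header)" "https://api.eachlabs.ai/v1/model?slug=whisper"
```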
Available Models
Text-to-Speech
| Model | Slug | Best For |
|---|---|---|
| ElevenLabs TTS | elevenlabs-text-to-speech | High quality TTS |
| ElevenLabs TTS w/ Timestamps | elevenlabs-text-to-speech-with-timestamp | TTS with word timing |
| ElevenLabs Text to Dialogue | elevenlabs-text-to-dialogue | Multi-speaker dialogue |
| ElevenLabs Sound Effects | elevenlabs-sound-effects | Sound effect generation |
| ElevenLabs Voice Design v2 | elevenlabs-voice-design-v2 | Custom voice design |
| Kling V1 TTS | kling-v1-tts | Kling text-to-speech |
| Kokoro 82M | kokoro-82m | Lightweight TTS |
| Play AI Dialog | play-ai-text-to-speech-dialog | Dialog TTS |
| Stable Audio 2.5 | stable-audio-2-5-text-to-audio | Text to audio |
Speech-to-Text
| Model | Slug | Best For |
|---|---|---|
| ElevenLabs Scribe v2 | elevenlabs-speech-to-text-scribe-v2 | Best quality transcription |
| ElevenLabs STT | elevenlabs-speech-to-text | Standard transcription |
| Wizper with Timestamp | wizper-with-timestamp | Timestamped transcription |
| Wizper | wizper | Basic transcription |
| Whisper | whisper | Open-source transcription |
| Whisper Diarization | whisper-diarization | Speaker identification |
| Incredibly Fast Whisper | incredibly-fast-whisper | Fastest transcription |
Voice Conversion & Cloning
| Model | Slug | Best For |
|---|---|---|
| RVC v2 | rvc-v2 | Voice conversion |
| Train RVC | train-rvc | Train custom voice model |
| ElevenLabs Voice Clone | elevenlabs-voice-clone | Voice cloning |
| ElevenLabs Voice Changer | elevenlabs-voice-changer | Voice transformation |
| ElevenLabs Voice Design v3 | elevenlabs-voice-design-v3 | Advanced voice design |
| ElevenLabs Dubbing | elevenlabs-dubbing | Video dubbing |
| Chatterbox S2S | chatterbox-speech-to-speech | Speech to speech |
| Open Voice | openvoice | Open-source voice clone |
| XTTS v2 | xtts-v2 | Multi-language voice clone |
| Stable Audio 2.5 Inpaint | stable-audio-2-5-inpaint | Audio inpainting |
| Stable Audio 2.5 A2A | stable-audio-2-5-audio-to-audio | Audio transformation |
| Audio Trimmer | audio-trimmer-with-fade | Audio trimming with fade |
Audio Utilities
| Model | Slug | Best For |
|---|---|---|
| FFmpeg Merge Audio Video | ffmpeg-api-merge-audio-video | Merge audio with video |
| Toolkit Video Convert | toolkit | Video/audio conversion |
Prediction Flow
- Check model: `GET https://api.eachlabs.ai/v1/model?slug=<slug>` validates that the model exists and returns the `request_schema` with the exact input parameters. Always do this before creating a prediction to ensure correct inputs.
- POST: `https://api.eachlabs.ai/v1/prediction` with the model slug, version `"0.0.1"`, and input matching the schema.
- Poll: `GET https://api.eachlabs.ai/v1/prediction/{id}` until status is `"success"` or `"failed"`.
- Extract the output from the response.
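The steps above can be sketched as a handful of POSIX sh functions around curl. This is an illustration under stated assumptions, not the official client: the function names are hypothetical, the `predictionID` field in the create response is an assumption (this document only names the `{id}` path segment), and the response is assumed to carry a top-level `status` string. The sed-based field extraction is a deliberate simplification; a JSON parser such as jq is the safer choice in real scripts.

```shell
# Sketch of the check -> create -> poll flow. Endpoints come from the list
# above; response field names are assumptions (see the lead-in).
API_BASE="https://api.eachlabs.ai/v1"

# GET a path under the API base, sending the X-API-Key header.
api_get() {
  curl -fsS -H "X-API-Key: $EACHLABS_API_KEY" "$API_BASE/$1"
}

# POST a JSON body ($2) to a path ($1) under the API base.
api_post() {
  curl -fsS -X POST \
    -H "Content-Type: application/json" \
    -H "X-API-Key: $EACHLABS_API_KEY" \
    -d "$2" "$API_BASE/$1"
}

# Extract one string field ($1) from a JSON document ($2).
# Simplification: a regex match that only suits flat, simple responses.
json_string_field() {
  printf '%s\n' "$2" |
    sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    head -n 1
}

# Step 1: fetch the model record, which should include request_schema.
check_model() {
  api_get "model?slug=$1"
}

# Step 2: create a prediction. $1 = model slug, $2 = JSON object for "input".
create_prediction() {
  api_post "prediction" "{\"model\": \"$1\", \"version\": \"0.0.1\", \"input\": $2}"
}

# Step 3: poll until the status is "success" or "failed".
# $1 = prediction id, $2 = seconds between polls (default 2),
# $3 = maximum attempts (default 60).
# Returns 0 and prints the response on success, 1 on failure, 2 on timeout.
poll_prediction() {
  id=$1
  interval=${2:-2}
  max_attempts=${3:-60}
  attempt=0
  while [ "$attempt" -lt "$max_attempts" ]; do
    body=$(api_get "prediction/$id") || return 1
    status=$(json_string_field status "$body")
    case "$status" in
      success) printf '%s\n' "$body"; return 0 ;;
      failed)  printf '%s\n' "$body" >&2; return 1 ;;
    esac
    attempt=$((attempt + 1))
    sleep "$interval"
  done
  echo "prediction $id did not finish after $max_attempts attempts" >&2
  return 2
}

# Example usage (makes real API calls):
#   check_model elevenlabs-text-to-speech
#   resp=$(create_prediction elevenlabs-text-to-speech \
#     '{"text": "Hello", "voice_id": "EXAVITQu4vr4xnSDxMaL"}')
#   id=$(json_string_field predictionID "$resp")   # assumed field name
#   poll_prediction "$id"
```

Capping the number of attempts keeps a stuck prediction from blocking a script forever; the defaults of 2 seconds and 60 attempts are arbitrary and worth tuning per model.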
Examples
Text-to-Speech with ElevenLabs
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-text-to-speech",
    "version": "0.0.1",
    "input": {
      "text": "Welcome to our product demo. Today we will walk through the key features.",
      "voice_id": "EXAVITQu4vr4xnSDxMaL",
      "model_id": "eleven_v3",
      "stability": 0.5,
      "similarity_boost": 0.7
    }
  }'
Transcription with ElevenLabs Scribe
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "elevenlabs-speech-to-text-scribe-v2",
    "version": "0.0.1",
    "input": {
      "media_url": "https://example.com/recording.mp3",
      "diarize": true,
      "timestamps_granularity": "word"
    }
  }'
Transcription with Wizper (Whisper)
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "wizper-with-timestamp",
    "version": "0.0.1",
    "input": {
      "audio_url": "https://example.com/audio.mp3",
      "language": "en",
      "task": "transcribe",
      "chunk_level": "segment"
    }
  }'
Speaker Diarization with Whisper
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "whisper-diarization",
    "version": "0.0.1",
    "input": {
      "file_url": "https://example.com/meeting.mp3",
      "num_speakers": 3,
      "language": "en",
      "group_segments": true
    }
  }'
Voice Conversion with RVC v2
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "rvc-v2",
    "version": "0.0.1",
    "input": {
      "input_audio": "https://example.com/vocals.wav",
      "rvc_model": "CUSTOM",
      "custom_rvc_model_download_url": "https://example.com/my-voice-model.zip",
      "pitch_change": 0,
      "output_format": "wav"
    }
  }'
Merge Audio with Video
curl -X POST https://api.eachlabs.ai/v1/prediction \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $EACHLABS_API_KEY" \
  -d '{
    "model": "ffmpeg-api-merge-audio-video",
    "version": "0.0.1",
    "input": {
      "video_url": "https://example.com/video.mp4",
      "audio_url": "https://example.com/narration.mp3",
      "start_offset": 0
    }
  }'
ElevenLabs Voice IDs
The elevenlabs-text-to-speech model supports these voice IDs. Pass the raw ID string:
| Voice ID | Notes |
|---|---|
| EXAVITQu4vr4xnSDxMaL | Default voice |
| 9BWtsMINqrJLrRacOk9x | - |
| CwhRBWXzGAHq8TQ4Fs17 | - |
| FGY2WhTYpPnrIDTdsKH5 | - |
| JBFqnCBsd6RMkjVDRZzb | - |
| N2lVS1w4EtoT3dr4eOWO | - |
| TX3LPaxmHKxFdv7VOQHJ | - |
| XB0fDUnXU5powFXDhCwa | - |
| onwK4e9ZLuTAKqWW03F9 | - |
| pFZP5JQG7iQjIQuC4Bku | - |
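Because the ID is passed as a raw string, a typo only surfaces once the request has been sent. One possible guard is to check the value against the table before building the request. The sketch below assumes the table above is the complete supported set; `KNOWN_VOICE_IDS` and `is_known_voice_id` are hypothetical names.

```shell
# Voice IDs copied from the table above (assumed to be the complete set).
KNOWN_VOICE_IDS="EXAVITQu4vr4xnSDxMaL 9BWtsMINqrJLrRacOk9x CwhRBWXzGAHq8TQ4Fs17
FGY2WhTYpPnrIDTdsKH5 JBFqnCBsd6RMkjVDRZzb N2lVS1w4EtoT3dr4eOWO
TX3LPaxmHKxFdv7VOQHJ XB0fDUnXU5powFXDhCwa onwK4e9ZLuTAKqWW03F9
pFZP5JQG7iQjIQuC4Bku"

# Hypothetical helper: succeed only if $1 exactly matches a listed voice ID.
is_known_voice_id() {
  for known in $KNOWN_VOICE_IDS; do
    if [ "$known" = "$1" ]; then
      return 0
    fi
  done
  return 1
}

# Example usage:
#   is_known_voice_id "$VOICE_ID" || echo "unknown voice ID: $VOICE_ID" >&2
```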
Parameter Reference
See references/MODELS.md for complete parameter details for each model.