Inworld TTS

Generate speech audio from text using Inworld.ai's TTS API.

Setup

  1. Go to https://platform.inworld.ai to get an API key
  2. Generate a key with the "Voices: Read" permission
  3. Copy the "Basic (Base64)" key
  4. Set the environment variable:
export INWORLD_API_KEY="your-base64-key-here"

To make the key persistent, add it to ~/.bashrc or ~/.clawdbot/.env.
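
Before the first run, it can help to confirm that the variable is set and looks like Base64. The following is a minimal sketch and not part of the skill: check_inworld_key is a hypothetical helper name, and it only checks the key's format, not whether Inworld accepts it.

```shell
# Hypothetical helper: verify INWORLD_API_KEY is set and decodes as Base64.
# Returns 0 if it looks usable, 1 if unset or empty, 2 if not valid Base64.
check_inworld_key() {
  key="${INWORLD_API_KEY:-}"
  if [ -z "$key" ]; then
    echo "INWORLD_API_KEY not set" >&2
    return 1
  fi
  # Discard the decoded bytes; only the decoder's exit status matters here.
  if ! printf '%s' "$key" | base64 --decode >/dev/null 2>&1; then
    echo "INWORLD_API_KEY is not valid Base64" >&2
    return 2
  fi
  return 0
}
```

A wrapper script could call `check_inworld_key || exit 1` before doing any other work.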

Installation

# Copy skill to your skills directory
cp -r inworld-tts /path/to/your/skills/

# Make script executable
chmod +x /path/to/your/skills/inworld-tts/scripts/tts.sh

# Optional: symlink for global access
ln -sf /path/to/your/skills/inworld-tts/scripts/tts.sh /usr/local/bin/inworld-tts

Usage

# Basic
./scripts/tts.sh "Hello world" output.mp3

# With options
./scripts/tts.sh "Hello world" output.mp3 --voice Dennis --rate 1.2

# Streaming (for text >4000 chars)
./scripts/tts.sh "Very long text..." output.mp3 --stream
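
The calling convention above (text, output file, then optional flags) can be parsed along the following lines. This is a sketch under assumptions, not the actual source of tts.sh: parse_tts_args and the variable names are hypothetical, and the defaults are the ones listed under Options.

```shell
# Hypothetical sketch: parse "TEXT OUTPUT [options]" into shell variables.
parse_tts_args() {
  if [ $# -lt 2 ]; then
    echo "Usage: tts.sh TEXT OUTPUT [--voice ID] [--rate N] [--temp N] [--model ID] [--stream]" >&2
    return 1
  fi
  TEXT="$1"; OUTPUT="$2"; shift 2
  # Defaults as documented in the Options section.
  VOICE="Dennis"; RATE="1.0"; TEMP="1.1"; MODEL="inworld-tts-1.5-max"; STREAM="false"
  while [ $# -gt 0 ]; do
    case "$1" in
      --stream) STREAM="true"; shift ;;
      --voice|--rate|--temp|--model)
        # Each of these flags needs a value after it.
        if [ $# -lt 2 ]; then
          echo "Missing value for $1" >&2
          return 1
        fi
        case "$1" in
          --voice) VOICE="$2" ;;
          --rate)  RATE="$2" ;;
          --temp)  TEMP="$2" ;;
          --model) MODEL="$2" ;;
        esac
        shift 2 ;;
      *) echo "Unknown option: $1" >&2; return 1 ;;
    esac
  done
}
```

Checking for a missing value before `shift 2` matters: a failed shift leaves the arguments in place, and the loop would never finish.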

Options

Option    Default               Description
--voice   Dennis                Voice ID
--rate    1.0                   Speaking rate (0.5-2.0)
--temp    1.1                   Temperature (0.1-2.0)
--model   inworld-tts-1.5-max   Model ID
--stream  false                 Use streaming endpoint
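
The documented ranges for --rate and --temp can be checked locally before a request is sent. The shell has no floating-point arithmetic, so this sketch delegates the comparison to awk. in_range is a hypothetical helper; whether tts.sh performs such a check itself is not stated here.

```shell
# Hypothetical helper: in_range VALUE MIN MAX
# Succeeds only if VALUE is a plain decimal number with MIN <= VALUE <= MAX.
in_range() {
  awk -v v="$1" -v lo="$2" -v hi="$3" \
    'BEGIN { exit !(v ~ /^[0-9]+(\.[0-9]+)?$/ && v + 0 >= lo + 0 && v + 0 <= hi + 0) }'
}

# Example checks using the ranges from the table above.
in_range "1.2" 0.5 2.0 || echo "--rate must be between 0.5 and 2.0" >&2
in_range "1.1" 0.1 2.0 || echo "--temp must be between 0.1 and 2.0" >&2
```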

API Reference

Endpoint                                         Use
POST https://api.inworld.ai/tts/v1/voice         Standard synthesis
POST https://api.inworld.ai/tts/v1/voice:stream  Streaming for long text
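
For orientation, a direct call to the standard endpoint might look like the sketch below. Only the two URLs come from this README. The JSON field names (text, voiceId, modelId, audioContent) and the Basic authorization header are assumptions that should be checked against Inworld's API documentation, and tts_endpoint and synthesize are hypothetical function names.

```shell
INWORLD_TTS_BASE="https://api.inworld.ai/tts/v1"

# Pick the endpoint: streaming when the argument is "true", standard otherwise.
tts_endpoint() {
  if [ "$1" = "true" ]; then
    printf '%s\n' "$INWORLD_TTS_BASE/voice:stream"
  else
    printf '%s\n' "$INWORLD_TTS_BASE/voice"
  fi
}

# Standard (non-streaming) synthesis: synthesize TEXT OUTPUT [VOICE] [MODEL]
# ASSUMPTION: request fields text/voiceId/modelId and response field
# audioContent (Base64 audio) are not confirmed by this README.
synthesize() {
  payload=$(jq -n --arg t "$1" --arg v "${3:-Dennis}" --arg m "${4:-inworld-tts-1.5-max}" \
    '{text: $t, voiceId: $v, modelId: $m}') || return 1
  curl -sS --fail -X POST "$(tts_endpoint false)" \
    -H "Authorization: Basic $INWORLD_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$payload" \
  | jq -r '.audioContent // empty' | base64 --decode > "$2"
}
```

Building the payload with `jq -n --arg` rather than string interpolation keeps quotes and newlines in the input text from breaking the JSON.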

Requirements

  • curl - HTTP requests
  • jq - JSON processing
  • base64 - Decode audio
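
A quick way to confirm these tools are installed is to look each one up on the PATH. The sketch below uses a hypothetical helper, missing_tools, which prints whichever of the named commands cannot be found.

```shell
# Hypothetical helper: print the names of any tools not found on PATH,
# separated by spaces. Prints an empty line when everything is present.
missing_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  printf '%s\n' "${missing# }"
}

# Check the three requirements listed above.
need=$(missing_tools curl jq base64)
[ -z "$need" ] || echo "Missing required tools: $need" >&2
```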

Examples

# Quick test
export INWORLD_API_KEY="aXM2..."
./scripts/tts.sh "Testing one two three" test.mp3
mpv test.mp3  # or any audio player

# Slower speaking rate
./scripts/tts.sh "Slow and steady" slow.mp3 --rate 0.8

# Fast-paced narration
./scripts/tts.sh "Breaking news!" fast.mp3 --rate 1.5
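
Since streaming is suggested for text over 4000 characters, a small wrapper can add --stream automatically. This is a sketch under assumptions: speak and TTS_SCRIPT are hypothetical names, and the 4000-character threshold is taken from the Usage section.

```shell
# Path to the script; override TTS_SCRIPT if it lives elsewhere.
TTS_SCRIPT="${TTS_SCRIPT:-./scripts/tts.sh}"

# Hypothetical wrapper: speak TEXT OUTPUT [options...]
# Appends --stream when TEXT is longer than 4000 characters.
speak() {
  text="$1"; out="$2"
  shift 2
  if [ "${#text}" -gt 4000 ]; then
    "$TTS_SCRIPT" "$text" "$out" "$@" --stream
  else
    "$TTS_SCRIPT" "$text" "$out" "$@"
  fi
}
```

For example, `speak "$(cat chapter.txt)" chapter.mp3 --rate 0.8` would pick the streaming path only when the file's text is long enough (chapter.txt is a placeholder file name).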

Troubleshooting

"INWORLD_API_KEY not set" - Export the environment variable before running.

Empty output file - Check that the API key is valid and has the "Voices: Read" permission.

Streaming issues - Ensure your jq build supports the --unbuffered flag.
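
The last two checks can be scripted. The helpers below are a hypothetical sketch, not part of the skill: check_audio_file reports a missing or zero-byte output file, and jq_supports_unbuffered assumes that a jq without the flag rejects it with a non-zero exit status.

```shell
# Hypothetical helper: fail if the output file is missing or empty.
check_audio_file() {
  if [ ! -s "$1" ]; then
    echo "Output file '$1' is missing or empty; check INWORLD_API_KEY and its \"Voices: Read\" permission" >&2
    return 1
  fi
}

# Hypothetical helper: succeed only if this jq accepts --unbuffered.
# ASSUMPTION: a jq that lacks the flag exits non-zero on it.
jq_supports_unbuffered() {
  jq --unbuffered -n 'empty' >/dev/null 2>&1
}
```

A typical use is `./scripts/tts.sh "Testing" test.mp3 && check_audio_file test.mp3`.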
