Source Code
Local Whisper
Transcribe voice messages for free on Telegram and WhatsApp. No API keys. No costs. Runs on your Mac.
The Problem
Voice transcription APIs cost money:
- OpenAI Whisper: $0.006/minute
- Groq: $0.001/minute
- AssemblyAI: $0.01/minute
If you transcribe a lot of Telegram voice messages, it adds up.
The Solution
This skill runs Whisper locally on your Mac. Same quality, zero cost.
- โ Free forever
- โ Private (audio never leaves your Mac)
- โ Fast (~1 second per message)
- โ Works offline
โ ๏ธ Important Notes
- First run downloads ~1.5GB model โ be patient, this only happens once
- First transcription is slow โ model loads into memory (~10-30 seconds), then it's instant
- Already using OpenAI API for transcription? Replace your existing
tools.media.audioconfig with the one below
Quick Start
1. Install dependencies
pip3 install -r requirements.txt
2. Start the daemon
python3 scripts/daemon.py
First run will download the Whisper model (~1.5GB). Wait for "Ready" message.
3. Add to OpenClaw config
Add this to your ~/.openclaw/openclaw.json:
{
"tools": {
"media": {
"audio": {
"enabled": true,
"models": [
{
"type": "cli",
"command": "~/.openclaw/workspace/skills/local-whisper/scripts/transcribe.sh",
"args": ["{{MediaPath}}"],
"timeoutSeconds": 60
}
]
}
}
}
}
4. Restart gateway
openclaw gateway restart
Now voice messages from Telegram, WhatsApp, etc. will be transcribed locally for free!
Manual test
./scripts/transcribe.sh voice_message.ogg
Use Case: Telegram Voice Messages
Instead of paying for OpenAI API to transcribe incoming voice messages, point OpenClaw to this local daemon. Free transcription forever.
Auto-Start on Login
cp com.local-whisper.plist ~/Library/LaunchAgents/
launchctl load ~/Library/LaunchAgents/com.local-whisper.plist
API
Daemon runs at localhost:8787:
curl -X POST http://localhost:8787/transcribe -F "file=@audio.ogg"
# {"text": "Hello world", "language": "en"}
Translation
Any language โ English:
./scripts/transcribe.sh spanish_audio.ogg --translate
Requirements
- macOS with Apple Silicon (M1/M2/M3/M4)
- Python 3.9+
License
MIT