โ† Back to Speech & Transcription

local-whisper

Local speech-to-text using OpenAI Whisper

Local Whisper STT

Local speech-to-text using OpenAI's Whisper. Fully offline after initial model download.

Usage

# Basic
~/.clawdbot/skills/local-whisper/scripts/local-whisper audio.wav

# Better model
~/.clawdbot/skills/local-whisper/scripts/local-whisper audio.wav --model turbo

# With timestamps
~/.clawdbot/skills/local-whisper/scripts/local-whisper audio.wav --timestamps --json
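To call the script from Python rather than from a shell, the commands above can be wrapped in a small helper. This is only a sketch: `build_command` and `transcribe` are hypothetical names, and it assumes that `--json` prints a single JSON document to stdout, which this page does not specify.

```python
import json
import subprocess
from pathlib import Path

# Script path taken from the usage examples above.
SCRIPT = Path.home() / ".clawdbot/skills/local-whisper/scripts/local-whisper"


def build_command(audio, model=None, language=None,
                  timestamps=False, as_json=False, quiet=False):
    """Build the argv list for one local-whisper call (hypothetical helper)."""
    cmd = [str(SCRIPT), str(audio)]
    if model:
        cmd += ["--model", model]
    if language:
        cmd += ["--language", language]
    if timestamps:
        cmd.append("--timestamps")
    if as_json:
        cmd.append("--json")
    if quiet:
        cmd.append("--quiet")
    return cmd


def transcribe(audio, **options):
    """Run the script and return stdout, parsed as JSON when as_json=True.

    Assumption: with --json the script prints one JSON document to stdout.
    """
    result = subprocess.run(
        build_command(audio, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout) if options.get("as_json") else result.stdout
```

For example, `transcribe("audio.wav", model="turbo", as_json=True, quiet=True)` would run the "Better model" command with JSON output and progress suppressed.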

Models

Model      Parameters   Notes
tiny       39M          Fastest
base       74M          Default
small      244M         Good balance
turbo      809M         Best speed/quality
large-v3   1550M        Maximum accuracy

Options

  • --model/-m: Model size (default: base)
  • --language/-l: Language code (auto-detect if omitted)
  • --timestamps/-t: Include word timestamps
  • --json/-j: JSON output
  • --quiet/-q: Suppress progress
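For orientation, here is a rough sketch of how these options could map onto the `openai-whisper` Python API. It is not the skill's actual source: `transcribe_kwargs` and `run` are hypothetical names, and tying `--quiet` to `verbose=None` is an assumption. `whisper.load_model` and `model.transcribe` are the library's own entry points.

```python
def transcribe_kwargs(language=None, timestamps=False, quiet=False):
    """Translate CLI-style options into keyword arguments for model.transcribe()."""
    return {
        "language": language,                 # None lets Whisper auto-detect the language
        "word_timestamps": timestamps,        # per-word timing inside each segment
        "verbose": None if quiet else False,  # False shows a progress bar, None prints nothing
        "fp16": False,                        # CPU-only torch build: decode in FP32
    }


def run(audio_path, model_name="base", **options):
    """Load a model (downloaded on first use) and transcribe one file."""
    import whisper  # imported lazily so the helper above works without the package

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path, **transcribe_kwargs(**options))
    return result  # dict with "text", "segments" and "language" keys
```

Keeping the option mapping in a plain function separates it from the model call, so it can be checked without loading any weights.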

Setup

The skill runs in a uv-managed venv at .venv/. To reinstall:

cd ~/.clawdbot/skills/local-whisper
uv venv .venv --python 3.12
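# CPU-only torch from the PyTorch index
uv pip install --python .venv/bin/python torch --index-url https://download.pytorch.org/whl/cpu
# Remaining packages from PyPI
uv pip install --python .venv/bin/python click openai-whisper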
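After reinstalling, a quick sanity check can catch the most common problems before the first transcription. The sketch below is a hypothetical helper, not part of the skill. It assumes the venv layout described above, and it also checks for `ffmpeg`, which openai-whisper calls to decode audio files.

```python
import shutil
from pathlib import Path


def check_setup(skill_dir):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    python = Path(skill_dir) / ".venv" / "bin" / "python"
    if not python.exists():
        problems.append(f"missing interpreter: {python}")
    # openai-whisper shells out to ffmpeg to decode audio files.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")
    return problems
```

Calling `check_setup(Path.home() / ".clawdbot/skills/local-whisper")` returns an empty list when both checks pass.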