Speech.sh

March 19, 2026

A text-to-speech CLI and MCP server using the Groq TTS API (OpenAI-compatible).

Features

  • Convert text to speech with a simple command
  • Multiple voice options (troy, austin, hannah, autumn)
  • Adjustable speech speed
  • Hash-based caching to avoid duplicate API calls (24h auto-cleanup)
  • Retry with exponential backoff
  • Audio playback via ffplay, mplayer, or VLC
  • MCP server for integration with AI assistants (Claude Desktop, Claude Code)
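
The caching and retry features above can be illustrated with a minimal Python sketch. It is not the script's actual implementation: the choice of SHA-256, the fields folded into the key, and the helper names `cache_key` and `backoff_delays` are all assumptions made for illustration.

```python
import hashlib


def cache_key(text: str, voice: str = "troy", speed: float = 1.0,
              model: str = "canopylabs/orpheus-v1-english") -> str:
    """Derive a deterministic key from everything that affects the audio."""
    # Assumption: the key covers model, voice, speed, and text.
    # The real script may hash a different set of fields.
    payload = "\x1f".join([model, voice, f"{speed:.2f}", text])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def backoff_delays(retries: int = 3, base: float = 1.0,
                   factor: float = 2.0) -> list[float]:
    """Return the wait in seconds before each retry: base, base*factor, ..."""
    return [base * factor ** attempt for attempt in range(retries)]
```

Identical requests map to the same key, so a cached file can be replayed instead of calling the API again, while changing the voice or speed produces a different key.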

Quick Start

git clone https://github.com/j3k0/speech.sh.git
cd speech.sh
export OPENAI_API_KEY="your-groq-api-key"
./speech.sh --text "Hello, world!"

Dependencies

  • curl, jq (for the shell version)
  • One audio player: ffplay (from ffmpeg), mplayer, or vlc
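
To check which of these players is installed, something like the following Python sketch could be used. The preference order and the `find_player` helper are assumptions; the script's own auto-detection may behave differently.

```python
import shutil
from typing import Callable, Optional

# Candidate players, in an assumed order of preference.
PLAYERS = ("ffplay", "mplayer", "vlc")


def find_player(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the name of the first player found on PATH, or None."""
    for name in PLAYERS:
        if which(name):
            return name
    return None
```

The `which` parameter exists only so the lookup can be replaced in tests; in normal use `find_player()` consults the real PATH.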

CLI Usage

# Basic
./speech.sh --text "Hello, world!"

# With options
./speech.sh --text "Hello!" --voice austin --speed 1.2 --verbose

Options

-t, --text TEXT       Text to convert to speech (required)
-v, --voice VOICE     Voice to use (default: troy)
-s, --speed SPEED     Speech speed (default: 1.0)
-o, --output FILE     Output file path (default: auto-generated)
-a, --api_key KEY     API key
-m, --model MODEL     TTS model (default: canopylabs/orpheus-v1-english)
-p, --player PLAYER   Audio player: auto, ffmpeg, mplayer, vlc (default: auto)
-r, --retries N       Retry attempts (default: 3)
-T, --timeout N       Timeout in seconds (default: 30)
    --verbose         Enable verbose logging
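
If the CLI needs to be driven from another program, the options above can be assembled into an argument list. This is a hypothetical Python wrapper, not part of the project; `build_command` and `speak` are illustrative names, and it assumes the script exits with a non-zero status on failure.

```python
import subprocess


def build_command(text, voice="troy", speed=1.0, retries=3, timeout=30,
                  script="./speech.sh"):
    """Assemble the argument list for speech.sh from the documented flags."""
    return [
        script,
        "--text", text,
        "--voice", voice,
        "--speed", str(speed),
        "--retries", str(retries),
        "--timeout", str(timeout),
    ]


def speak(text, **options):
    """Run speech.sh; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(build_command(text, **options), check=True)
```

Passing the arguments as a list avoids shell quoting problems when the text contains quotes or other special characters.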

API Key

Provide your Groq API key in one of three ways (in order of precedence):

  1. --api_key "your-key"
  2. export OPENAI_API_KEY="your-key"
  3. A file named API_KEY in the script's directory
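
The precedence order above can be expressed as a short Python sketch. The `resolve_api_key` function is hypothetical, and it assumes the API_KEY file contains only the key, possibly followed by a newline.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_api_key(cli_key: Optional[str] = None,
                    env: Optional[Mapping[str, str]] = None,
                    script_dir: Path = Path(".")) -> Optional[str]:
    """Return the first key found: CLI flag, then environment, then file."""
    env = os.environ if env is None else env
    if cli_key:
        return cli_key
    if env.get("OPENAI_API_KEY"):
        return env["OPENAI_API_KEY"]
    key_file = script_dir / "API_KEY"
    if key_file.is_file():
        # Assumption: the file holds just the key, with optional whitespace.
        return key_file.read_text(encoding="utf-8").strip() or None
    return None
```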

MCP Server

Two implementations are available: a Python server (server.py) and the original shell script (mcp.sh).

Python

Uses the FastMCP SDK. Requires Python 3.10+ and uv.

# Setup
uv venv --python python3 .venv
uv pip install --python .venv/bin/python "mcp[cli]" httpx

# Run
OPENAI_API_KEY="your-key" .venv/bin/python server.py

Claude Desktop / Claude Code configuration

{
  "mcpServers": {
    "speak": {
      "command": "/path/to/speech.sh/.venv/bin/python",
      "args": ["/path/to/speech.sh/server.py"],
      "env": {
        "OPENAI_API_KEY": "your-groq-api-key",
        "SPEECH_VOICE": "troy",
        "SPEECH_SPEED": "1.0",
        "SPEECH_MODEL": "canopylabs/orpheus-v1-english"
      }
    }
  }
}

Shell (legacy)

The original shell-based MCP server (mcp.sh). Works in environments without Python but may hit macOS sandboxing issues with Claude Desktop.

./mcp.sh

MCP Tool

The server exposes a single speak tool:

Parameter  Type    Required  Default  Description
text       string  yes       -        The text to speak
voice      string  no        troy     Voice to use
speed      number  no        1.0      Speech speed
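
For reference, an MCP client invokes a tool with a JSON-RPC 2.0 `tools/call` request. The Python sketch below builds such a message for the speak tool; the `speak_request` helper is hypothetical, and a real session would first complete the MCP initialization handshake, which is omitted here.

```python
import json


def speak_request(text, voice=None, speed=None, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the speak tool."""
    arguments = {"text": text}
    # Optional parameters are left out so the server applies its defaults.
    if voice is not None:
        arguments["voice"] = voice
    if speed is not None:
        arguments["speed"] = speed
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "speak", "arguments": arguments},
    }


print(json.dumps(speak_request("Hello, world!", voice="hannah"), indent=2))
```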

Environment Variables

Variable        Description            Default
OPENAI_API_KEY  Groq API key           (required)
SPEECH_VOICE    Default voice          troy
SPEECH_SPEED    Default speed          1.0
SPEECH_MODEL    TTS model              canopylabs/orpheus-v1-english
SPEECH_API_URL  API endpoint (Python)  https://api.groq.com/openai/v1/audio/speech
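
As an illustration of how these variables and their defaults fit together, here is a minimal Python sketch. `SpeechConfig` and `load_config` are hypothetical names, not taken from server.py, and the error raised for a missing key is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class SpeechConfig:
    api_key: str
    voice: str
    speed: float
    model: str
    api_url: str


def load_config(env: Optional[Mapping[str, str]] = None) -> SpeechConfig:
    """Read settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is required")
    return SpeechConfig(
        api_key=api_key,
        voice=env.get("SPEECH_VOICE", "troy"),
        speed=float(env.get("SPEECH_SPEED", "1.0")),
        model=env.get("SPEECH_MODEL", "canopylabs/orpheus-v1-english"),
        api_url=env.get(
            "SPEECH_API_URL", "https://api.groq.com/openai/v1/audio/speech"
        ),
    )
```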

Architecture

  • speech.sh - Shell-based TTS engine (API calls, caching, playback)
  • mcp.sh - Shell-based MCP wrapper over speech.sh (JSON-RPC 2.0 over stdio)
  • server.py - Python MCP server, self-contained replacement for both scripts above
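
The API-call step that both the shell engine and the Python server perform can be sketched as follows. This is an approximation, not the project's code: it assumes the endpoint accepts the OpenAI-style `model`, `input`, `voice`, and `speed` fields, and `build_tts_request` is an illustrative name.

```python
import json
import urllib.request


def build_tts_request(text, api_key, voice="troy", speed=1.0,
                      model="canopylabs/orpheus-v1-english",
                      url="https://api.groq.com/openai/v1/audio/speech"):
    """Prepare (but do not send) a POST request for the speech endpoint."""
    # Assumption: OpenAI-compatible request body fields.
    body = {"model": model, "input": text, "voice": voice, "speed": speed}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(request, timeout=30)` would then return the audio bytes to cache and hand to a player, assuming the call succeeds.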

License

GPL