README.md

June 2, 2026

Open source voice-to-text for Linux and macOS

OSTT is a terminal-native speech-to-text tool. Record from a hotkey, transcribe with local Whisper-compatible models or your chosen cloud provider, then send the result to your clipboard, a file, stdout, an AI prompt, or any shell command. Local transcription runs offline and supports GPU acceleration through Metal on macOS and CUDA or Vulkan on Linux.

OSTT is built for people who treat the terminal as a normal place for voice input to land. It does not assume one vendor, one subscription, or one app-specific workflow: use offline local models, bring your own API key for OpenAI, Deepgram, Groq, DeepInfra, AssemblyAI, Berget, ElevenLabs, or Mistral, and retry the same recording with another model when needed. Voice becomes text that can move through the same tools as everything else.
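As a minimal sketch of text moving through ordinary tools, the helper below appends whatever arrives on stdin to a notes file as a dated bullet. The function name `append_note` and the notes path are hypothetical and not part of OSTT. The sketch assumes that `ostt` writes only the transcript to stdout.

```shell
# Hypothetical helper (not part of ostt): append stdin to a notes file
# as a dated bullet. Empty input is skipped so no blank bullets are written.
append_note() {
  local file="$1" text
  text="$(cat)"
  [ -n "$text" ] || return 0
  printf -- '- %s %s\n' "$(date +%Y-%m-%d)" "$text" >> "$file"
}

# Usage, assuming ostt prints only the transcript to stdout:
#   ostt | append_note "$HOME/notes/voice.md"
```

The helper reads all of stdin before writing, so a failed or empty transcription leaves the notes file untouched.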

Tip

Bind Alt+Space to ostt launch -c for a global hotkey popup. Press once to start recording, press again to stop and transcribe. Use Alt+Ctrl+Space with ostt launch -c -p for a popup with an action picker.

Features

Linux-first voice input - Global hotkey setup for Omarchy/Hyprland, GNOME, KDE, and other Linux desktops, with macOS support too.
Provider choice - Bring your own API key and switch between OpenAI, Deepgram, Groq, DeepInfra, AssemblyAI, Berget, ElevenLabs, Mistral, and local Whisper-compatible models.
Model-scoped params - Tune supported provider/model request params persistently or per run with --param key=value.
Local transcription models - Download curated local models or add custom Hugging Face/direct model files for offline transcription.
Terminal-native workflow - Use stdout, clipboard, files, aliases, shell completions, logs, and pipes.
Scriptable post-processing - Transform transcripts with AI prompts or bash commands using ostt -p and ostt process.
Retry without re-recording - Save recordings locally, then re-transcribe them with a different provider or model.
File transcription and replay - Transcribe existing audio files and replay saved recordings from history.
Keywords and custom vocabulary - Improve recognition for names, technical terms, and project-specific language.
Open source, no subscription - Public code, local configuration, and no vendor lock-in beyond the providers you choose.
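File transcription lends itself to simple batch scripts. The sketch below loops over the `.mp3` files in a directory and writes each transcript next to its source file. The function name `transcribe_dir` is hypothetical, and the sketch assumes that `ostt transcribe <file>` prints the transcript to stdout, which this README does not state explicitly.

```shell
# Hypothetical helper (not part of ostt): transcribe every .mp3 in a
# directory into a sibling .txt file.
# Assumption: `ostt transcribe <file>` writes the transcript to stdout.
transcribe_dir() {
  local dir="$1" f
  for f in "$dir"/*.mp3; do
    [ -e "$f" ] || continue   # no matches: the glob stays literal, skip it
    ostt transcribe "$f" > "${f%.mp3}.txt"
  done
}

# Usage:
#   transcribe_dir "$HOME/recordings"
```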

Documentation

Full documentation is available at https://ostt.ai.

Install

curl -fsSL https://ostt.ai/install | bash

The installer detects your platform, installs supported runtime dependencies, downloads the latest release, verifies its checksum, and installs the ostt CLI.

If you prefer platform package managers, see the docs for Homebrew, AUR, .deb, and .rpm options.

Quick Start

ostt auth           # Save cloud provider credentials
ostt model          # Choose cloud or local transcription model
ostt                # Record, transcribe, print to stdout
ostt -c             # Record, transcribe, copy to clipboard
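ostt -m deepgram/nova-3 -c  # Use a specific cloud model for this run, copy to clipboard
ostt -m whisper/turbo --param language=sv -c  # Use a local model with a per-run param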
ostt launch -c      # Popup workflow for global hotkeys

By default, press Enter to stop and transcribe, Space to pause/resume, and Esc, q, or Ctrl+C to cancel.

Processing

Processing actions transform transcriptions after recording or from history.

ostt -p clean -c              # Record, transcribe, clean, copy
ostt launch -c -p clean       # Popup hotkey workflow with processing
ostt process                  # Process most recent history item, show picker
ostt process clean            # Process most recent history item with clean action
ostt process 3 clean -c       # Process history item #3 with clean action
ostt process list             # List configured actions

Actions are configured in ~/.config/ostt/ostt.toml and can run either bash commands or AI CLI tools. See Processing Actions for examples.
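To give a feel for the kind of bash command a processing step might run, here is a small text filter that drops standalone "um" and "uh" tokens. It is a sketch only: the function name `strip_fillers` is hypothetical, and how OSTT passes the transcript to an action is not described in this README, so the example simply reads stdin and can be used in a plain pipe after `ostt`.

```shell
# Hypothetical filter (not an ostt built-in): remove standalone "um"/"uh"
# tokens, including ones followed by a comma or period, and rejoin the rest
# with single spaces.
strip_fillers() {
  awk '{
    out = ""
    for (i = 1; i <= NF; i++) {
      w = tolower($i)
      gsub(/[,.]$/, "", w)
      if (w == "um" || w == "uh") continue
      out = out (out == "" ? "" : " ") $i
    }
    print out
  }'
}

# Usage, assuming ostt prints the transcript to stdout:
#   ostt | strip_fillers
```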

Common Commands

ostt                         # Record audio, print transcription
ostt -c                      # Record audio, copy transcription
ostt -o notes.txt            # Record audio, write transcription to file
ostt -m openai/whisper-1     # Override model for this run
ostt --param language=sv -c  # Override a transcription param for this run
ostt launch -c               # Open popup recorder
ostt transcribe file.mp3 -m deepinfra/openai/whisper-large-v3  # Transcribe an existing audio file
ostt retry 2 -m groq/whisper-large-v3 -c  # Re-transcribe a saved recording with another model, copy
ostt replay                  # Play most recent recording
ostt model                   # Choose cloud or local transcription model
ostt model params whisper/turbo  # List supported params for a model
ostt history                 # Browse transcription history
ostt keyword                 # Manage transcription keywords
ostt config                  # Open config file
ostt config list-devices     # List audio input devices
ostt logs                    # View recent logs
ostt completions zsh         # Generate shell completions
ostt completions install bash    # Install completions system-wide
ostt --version               # Show version
ostt --help                  # Show help

Common aliases: r for record, t for transcribe, l for launch, p for process, a for auth, h for history, k for keyword, c for config, and rp for replay.
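Because `ostt retry` accepts a model override, the same recording can be run through several models for comparison. The loop below is a sketch: `compare_models` is a hypothetical helper, the first argument is the same numeric argument used in `ostt retry 2` above, and the sketch assumes that `ostt retry` without `-c` prints the transcript to stdout.

```shell
# Hypothetical helper (not part of ostt): re-transcribe one saved recording
# with each model given, printing a header before every result.
# Assumption: `ostt retry <n> -m <model>` prints the transcript to stdout.
compare_models() {
  local index="$1" model
  shift
  for model in "$@"; do
    printf '== %s ==\n' "$model"
    ostt retry "$index" -m "$model"
  done
}

# Usage:
#   compare_models 2 groq/whisper-large-v3 openai/whisper-1
```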

Providers

OSTT is bring-your-own-API-key and currently supports OpenAI, Deepgram, DeepInfra, Groq, AssemblyAI, Berget, ElevenLabs, and Mistral transcription models.

Run ostt auth to select your provider/model and save credentials securely.

Run ostt model to switch between authenticated cloud models and local models. The local model screen can download curated models, activate downloaded models, delete local model files, and add custom models from Hugging Face model pages or direct .gguf / ggml-*.bin URLs.

Per-provider and per-model params are configured under [provider.params] and [provider."model".params], or passed per run with --param key=value. See Providers and Models and Configuration for supported params.

Deprecated config shapes such as [providers], [model_options], [audio].sample_rate, [[process.actions]], and provider = "local" are no longer accepted and fail with an explicit error. Update local model IDs from local/<model> to whisper/<model>.

Platform Setup

Suggested default keybindings:

| Hotkey | Command | Action |
| --- | --- | --- |
| `Alt+Space` | `ostt launch -c` | Popup recorder, clipboard output |
| `Alt+Ctrl+Space` | `ostt launch -c -p` | Popup with action picker |
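On Hyprland, one way to register these keybindings is to add `bind` lines to the compositor config. The sketch below is an assumption rather than the documented setup: `add_bind` is a hypothetical helper, and the config path and modifier spelling follow common Hyprland conventions, so check the platform docs for the recommended steps.

```shell
# Hypothetical helper (not part of ostt): append a line to a config file
# only if that exact line is not already present.
add_bind() {
  local conf="$1" line="$2"
  mkdir -p "$(dirname "$conf")"
  grep -qxF -- "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# Usage (path and bind syntax are assumptions based on Hyprland conventions):
#   add_bind "$HOME/.config/hypr/hyprland.conf" 'bind = ALT, space, exec, ostt launch -c'
#   add_bind "$HOME/.config/hypr/hyprland.conf" 'bind = ALT CTRL, space, exec, ostt launch -c -p'
```

Running the helper twice with the same line leaves a single entry, so it is safe to call from a setup script.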

Platform-specific setup notes are available in the docs at https://ostt.ai.

Development

git clone https://github.com/kristoferlund/ostt.git
cd ostt
cargo build
cargo test --all-targets --all-features
cargo clippy --all-targets --all-features

Release builds use the dist profile:

cargo build --profile dist --locked

Contributing

Contributions are welcome. Please open an issue or submit a pull request.

License

MIT