speech2keys
November 5, 2025
A fast, lightweight Linux tool that converts speech to text and types it into any window using OpenAI's Whisper API.
Features
- Fast startup: Optimized Rust binary starts in milliseconds
- Wayland native: Built-in virtual keyboard support with automatic compositor detection
- Multi-compositor support: Works on Sway, Hyprland, KDE Plasma, and others via fallback mechanisms
- Smart stopping: Automatically stops after 8 seconds of silence
- Single instance: Toggle on/off by pressing your hotkey twice
- Visual feedback: Desktop notifications show recording status (KDE Plasma compatible)
Prerequisites
- Rust toolchain (for building):
  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
- Build dependencies (development libraries):
  # Fedora
  sudo dnf install alsa-lib-devel wayland-devel wayland-protocols-devel libxkbcommon-devel
  # Arch
  sudo pacman -S alsa-lib wayland wayland-protocols libxkbcommon
  # Ubuntu/Debian
  sudo apt install libasound2-dev libwayland-dev wayland-protocols libxkbcommon-dev
  # Using Homebrew (Linux)
  brew install alsa-lib wayland wayland-protocols libxkbcommon
- Runtime dependencies (KDE Plasma users only):
  KDE Plasma/KWin does not support the standard Wayland virtual keyboard protocol. You need one of these fallback tools:
  # Option 1: kwtype (recommended for KDE Plasma)
  sudo dnf install kwtype # Fedora
  # Or build from: https://github.com/Sporif/KWtype
  # Option 2: wtype (works on most compositors)
  sudo dnf install wtype # Fedora
  sudo pacman -S wtype # Arch
  sudo apt install wtype # Ubuntu/Debian
  Note: Compositors like Sway, Hyprland, and Cosmic don't need external tools; they work out of the box.
- OpenAI API key:
  - Sign up at https://platform.openai.com
  - Create an API key
  - Add to your shell profile (e.g., ~/.bashrc or ~/.zshrc):
    export OPENAI_API_KEY="sk-..."
Installation
- Clone and build:
  cd /path/to/speech2keys
  cargo build --release
  Note: If you installed dependencies via Homebrew, you need to set PKG_CONFIG_PATH and RUSTFLAGS:
  PKG_CONFIG_PATH="/home/linuxbrew/.linuxbrew/lib/pkgconfig:$PKG_CONFIG_PATH" \
  RUSTFLAGS="-L /home/linuxbrew/.linuxbrew/lib" \
  cargo build --release
- The binary will be at target/release/speech2keys (approximately 3.8MB)
- (Optional) Install to your PATH:
  sudo cp target/release/speech2keys /usr/local/bin/
Usage
Automatic Compositor Detection
speech2keys automatically detects the best text injection method for your compositor:
- First choice: Native Wayland virtual keyboard protocol (Sway, Hyprland, Cosmic, Niri, etc.)
- KDE Plasma fallback: kwtype command, if available
- Universal fallback: wtype command, if available
You'll see a log message on startup indicating which method was selected.
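The priority order above can be sketched as a small selection function. This is only an illustration of the documented fallback chain, not the project's actual code: the names `InjectionMethod`, `command_in_path`, and `choose_method` are hypothetical, and how the native protocol is probed is left to the caller.

```rust
use std::env;

/// Hypothetical names for the three injection methods described above.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum InjectionMethod {
    NativeVirtualKeyboard,
    Kwtype,
    Wtype,
}

/// Returns true if a file named `name` exists in any directory of `path_var`
/// (a PATH-style string). Only existence is checked, not the executable bit.
pub fn command_in_path(name: &str, path_var: &str) -> bool {
    env::split_paths(path_var).any(|dir| dir.join(name).is_file())
}

/// Picks a method in the documented order: native protocol first,
/// then kwtype, then wtype. Returns None when nothing is usable.
pub fn choose_method(
    native_ok: bool,
    has_kwtype: bool,
    has_wtype: bool,
) -> Option<InjectionMethod> {
    if native_ok {
        Some(InjectionMethod::NativeVirtualKeyboard)
    } else if has_kwtype {
        Some(InjectionMethod::Kwtype)
    } else if has_wtype {
        Some(InjectionMethod::Wtype)
    } else {
        None
    }
}
```

A caller would typically pass the result of its own protocol probe as `native_ok` and use `command_in_path("kwtype", &path)` and `command_in_path("wtype", &path)` for the two fallbacks, where `path` is the value of the PATH environment variable.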
Command Line
Simply run:
speech2keys
The program will:
- Auto-detect the best injection method for your compositor
- Start recording from your default microphone
- Show a notification that it's recording
- Transcribe speech and type it into the active window
- Stop after 8 seconds of silence
To stop early, run speech2keys again (it will signal the existing instance to stop).
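The README does not say how the second invocation finds the first one. One common way to build such a toggle is a PID file, sketched below under that assumption; the types and functions (`Invocation`, `decide`, `claim`, `release`) are hypothetical, and speech2keys may well use a different mechanism such as a socket or a lock.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// What a fresh invocation should do (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum Invocation {
    /// No other instance is recording: become the recorder.
    StartRecording,
    /// Another instance with this PID is recording: ask it to stop.
    StopExisting(u32),
}

/// Looks for a PID file left behind by a running instance.
pub fn decide(pid_file: &Path) -> io::Result<Invocation> {
    match fs::read_to_string(pid_file) {
        Ok(contents) => match contents.trim().parse::<u32>() {
            Ok(pid) => Ok(Invocation::StopExisting(pid)),
            // Unparseable contents: treat the file as stale and start fresh.
            Err(_) => Ok(Invocation::StartRecording),
        },
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(Invocation::StartRecording),
        Err(e) => Err(e),
    }
}

/// Records the current process as the active recorder.
pub fn claim(pid_file: &Path) -> io::Result<()> {
    fs::write(pid_file, std::process::id().to_string())
}

/// Removes the PID file when recording ends.
pub fn release(pid_file: &Path) -> io::Result<()> {
    fs::remove_file(pid_file)
}
```

Delivering the actual stop request (for example a Unix signal to the stored PID) is omitted here, as is checking whether the stored PID still belongs to a live process; a real implementation would need both.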
KDE Plasma Global Shortcut
- Open System Settings → Shortcuts → Custom Shortcuts
- Click Edit → New → Global Shortcut → Command/URL
- Set:
  - Trigger: Your preferred key combo (e.g., Meta+Shift+S)
  - Action: /path/to/speech2keys (or just speech2keys if in PATH)
- Click Apply
Now you can press your hotkey to start/stop recording from anywhere!
Example Workflow
- Press your hotkey (e.g., Meta+Shift+S)
- See notification: "Recording... Press the hotkey again to stop."
- Start speaking
- Watch as your words appear in the active window
- Stop speaking for 8 seconds, or press the hotkey again
Configuration
Change Language
Edit src/transcribe.rs and change the language parameter:
.language("en") // Change to "es", "fr", "de", etc.
Then rebuild:
cargo build --release
Adjust Silence Timeout
Edit src/transcribe.rs and change:
const SILENCE_TIMEOUT_SECS: u64 = 8; // Change to desired seconds
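To make the effect of this constant concrete, here is a minimal sketch of how a silence timeout can be tracked. It assumes silence is judged by the RMS level of each audio block; the threshold value and the `SilenceTracker` type are illustrative assumptions, not taken from src/transcribe.rs.

```rust
use std::time::Duration;

const SILENCE_TIMEOUT_SECS: u64 = 8; // value from the README
const SILENCE_RMS_THRESHOLD: f32 = 0.01; // assumed threshold, not from the project

/// Root-mean-square level of a block of samples in [-1.0, 1.0].
pub fn rms(samples: &[f32]) -> f32 {
    if samples.is_empty() {
        return 0.0;
    }
    let sum_sq: f32 = samples.iter().map(|s| s * s).sum();
    (sum_sq / samples.len() as f32).sqrt()
}

/// Tracks how long the input has been continuously quiet (hypothetical type).
pub struct SilenceTracker {
    quiet_for: Duration,
}

impl SilenceTracker {
    pub fn new() -> Self {
        Self { quiet_for: Duration::ZERO }
    }

    /// Feeds one block of audio lasting `block_len`. Returns true once the
    /// accumulated silence reaches the timeout; any loud block resets it.
    pub fn observe(&mut self, samples: &[f32], block_len: Duration) -> bool {
        if rms(samples) < SILENCE_RMS_THRESHOLD {
            self.quiet_for += block_len;
        } else {
            self.quiet_for = Duration::ZERO;
        }
        self.quiet_for >= Duration::from_secs(SILENCE_TIMEOUT_SECS)
    }
}
```

With this shape, raising SILENCE_TIMEOUT_SECS tolerates longer pauses mid-sentence, while lowering it ends the session sooner after you stop speaking.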
Adjust Transcription Chunk Size
Edit src/transcribe.rs and change:
const CHUNK_DURATION_SECS: u64 = 2; // Smaller = faster, larger = more accurate
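As a rough illustration of what the chunk duration controls, the sketch below converts it into a sample count and splits a buffer accordingly. The helper names and the idea of slicing one flat sample buffer are assumptions made for the example; the project's real buffering may differ.

```rust
const CHUNK_DURATION_SECS: u64 = 2; // value from the README

/// Number of interleaved samples that make up one chunk.
pub fn samples_per_chunk(sample_rate_hz: u32, channels: u16) -> usize {
    sample_rate_hz as usize * channels as usize * CHUNK_DURATION_SECS as usize
}

/// Splits a recording into chunk-sized slices; the last slice may be shorter.
pub fn split_into_chunks(samples: &[f32], sample_rate_hz: u32, channels: u16) -> Vec<&[f32]> {
    let size = samples_per_chunk(sample_rate_hz, channels);
    if size == 0 {
        // `chunks(0)` would panic, so treat a zero-sized chunk as "no chunks".
        return Vec::new();
    }
    samples.chunks(size).collect()
}
```

If each chunk is sent for transcription separately, a shorter duration means text appears sooner but each request carries less context, which matches the "smaller = faster, larger = more accurate" trade-off noted in the constant's comment.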
Troubleshooting
"OPENAI_API_KEY environment variable not set"
Make sure you've exported your API key in your shell profile and restarted your terminal/session.
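For reference, a check like the one behind this message can be sketched as follows. The function names are hypothetical and the project's own validation may differ; the point is only that the variable must be present, and non-empty, in the environment of the process that launches speech2keys.

```rust
use std::env;

/// Validates a raw value as read from the environment (hypothetical helper).
/// Missing or blank values produce the error message quoted above.
pub fn validate_api_key(raw: Option<String>) -> Result<String, String> {
    match raw {
        Some(key) if !key.trim().is_empty() => Ok(key.trim().to_string()),
        _ => Err("OPENAI_API_KEY environment variable not set".to_string()),
    }
}

/// Reads OPENAI_API_KEY from the process environment.
pub fn api_key_from_env() -> Result<String, String> {
    validate_api_key(env::var("OPENAI_API_KEY").ok())
}
```

Note that a global shortcut launched by the desktop environment does not necessarily inherit variables exported in ~/.bashrc or ~/.zshrc, which is one reason restarting the whole session can matter.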
"Failed to create Wayland virtual keyboard client"
- Make sure you're running on Wayland (check with: echo $XDG_SESSION_TYPE)
- Ensure you have the required runtime libraries installed (libxkbcommon, wayland-client)
- Check that your Wayland compositor supports the virtual keyboard protocol
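The session check from the first item can also be done programmatically. The sketch below is an assumption about how such a check might look, not the tool's own logic: it treats the session as Wayland if XDG_SESSION_TYPE is "wayland" or if WAYLAND_DISPLAY is set to a non-empty value.

```rust
use std::env;

/// Pure decision function so it can be tested without touching the environment.
pub fn is_wayland(session_type: Option<&str>, wayland_display: Option<&str>) -> bool {
    let by_type = session_type.map_or(false, |t| t.eq_ignore_ascii_case("wayland"));
    let by_display = wayland_display.map_or(false, |d| !d.is_empty());
    by_type || by_display
}

/// Reads the two variables from the current process environment.
pub fn running_on_wayland() -> bool {
    let session = env::var("XDG_SESSION_TYPE").ok();
    let display = env::var("WAYLAND_DISPLAY").ok();
    is_wayland(session.as_deref(), display.as_deref())
}
```

A result of false here usually means an X11 session, in which case the Wayland virtual keyboard protocol is not available at all.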
No audio input
- Check your default microphone in system settings
- Test with: pactl list sources short or pw-cli list-objects
Text not appearing
- Verify you're on Wayland (check with: echo $XDG_SESSION_TYPE)
- Check that the target window has focus
- Some applications may not accept virtual keyboard input
Cost Estimate
OpenAI Whisper API pricing (as of 2024):
- ~$0.006 per minute of audio
- Example: 5 minutes of daily use = ~$0.03/day = ~$0.90/month
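The arithmetic behind the example can be written as a one-line function, which makes it easy to plug in your own usage. The per-minute rate is the approximate figure quoted above and may have changed; the function name is made up for this sketch.

```rust
const USD_PER_MINUTE: f64 = 0.006; // approximate rate quoted in this README

/// Estimated cost in USD for `minutes_per_day` of audio over `days` days.
pub fn estimated_cost_usd(minutes_per_day: f64, days: u32) -> f64 {
    USD_PER_MINUTE * minutes_per_day * f64::from(days)
}
```

For the README's example, `estimated_cost_usd(5.0, 1)` is about 0.03 and `estimated_cost_usd(5.0, 30)` is about 0.90, up to floating-point rounding.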
Technical Details
- Language: Rust
- Audio capture: cpal (ALSA backend; PipeWire and PulseAudio are reached through their ALSA compatibility layers)
- Transcription: OpenAI Whisper API via async-openai
- Keystroke injection: wrtype library (Wayland virtual keyboard protocol)
- Notifications: notify-rust (D-Bus)
License
MIT
Contributing
Issues and pull requests welcome!