Deepgram Voice Keyboard for Linux
November 21, 2025 · View on GitHub

Status: Working prototype! Successfully types into applications on Wayland (KDE Plasma 25.10)
A system-level voice-to-text/dictation application for Ubuntu Linux using Deepgram's cloud STT API. Types wherever your cursor is - works across all applications on Wayland.
Installation
Option 1: Install from Debian Package (Recommended)
-
Download the latest
.debpackage from Releases -
Install:
sudo dpkg -i voice-keyboard_0.1.0_amd64.deb
sudo apt-get install -f # Fix dependencies if needed
-
Configure your Deepgram API key (choose one method):
Method 1 - Via config file (recommended):
# Edit the config file nano ~/.config/voice-keyboard/config.toml # Add your API key: # api_key = "your-deepgram-api-key-here"Method 2 - Via environment variable:
export DEEPGRAM_API_KEY="your-api-key-here" # Add to ~/.bashrc or ~/.zshrc for persistence -
Run:
# Test STT functionality
voice-keyboard-launcher --test-stt
# Or launch GUI (default)
voice-keyboard-launcher
Or launch from your application menu.
Option 2: Build from Source
- Set your Deepgram API key:
export DEEPGRAM_API_KEY="your-api-key-here"
- Run the application:
./run.sh --test-stt
The run script will automatically:
- Create a Python virtual environment (for future GUI components)
- Build the Rust application
- Run with proper privilege handling (sudo for keyboard creation, drops to user for audio)
Building a Debian Package
Use the unified package.sh script for all package management:
# Build only
./package.sh
# Build and install
./package.sh install
# Build and reinstall (removes old version first)
./package.sh reinstall
# Update version and build
./package.sh update 0.2.0
# Update version and install
./package.sh update 0.2.0 --install
# Show help
./package.sh --help
See PACKAGING.md for detailed packaging documentation.
Validated
✓ Text input into Kate text editor on Wayland (KDE Plasma 25.10) ✓ Cloud-based STT via Deepgram API ✓ Real-time transcription with on-the-fly punctuation
Project Structure
Voice-Typing-Ubuntu-App/
├── app/ # Main application code (Rust)
│ ├── src/ # Source files
│ │ ├── main.rs # Entry point with privilege handling
│ │ ├── audio_input.rs # Audio capture via CPAL
│ │ ├── stt_client.rs # Deepgram WebSocket client
│ │ ├── virtual_keyboard.rs # Virtual keyboard & typing logic
│ │ ├── input_event.rs # Linux input event definitions
│ │ ├── api_spend.rs # API cost tracking
│ │ └── gui/ # Qt GUI components
│ ├── Cargo.toml # Dependencies and build config
│ └── target/ # Build artifacts
├── package.sh # Unified package management (build/install/update)
├── run.sh # Development script to build and run
├── CLAUDE.md # AI assistant context
├── PACKAGING.md # Debian packaging documentation
└── README.md # This file
Roadmap
See PROJECT_NOTES.md for the complete feature roadmap and project context.
MVP Features (In Priority Order)
- Hotkey support - Configurable activation key (default: F13), tap-to-start/tap-to-stop, with PTT option
- Microphone handling - Device selection with persistent settings
- Model selection - Starting with Deepgram, future: multiple cloud providers
- GUI - Simple interface with system tray integration
- Type anywhere - ✓ Works on Wayland across all applications
- API spend monitoring - Track usage costs with hourly polling
Future Enhancements
- Mic level monitoring with dB display and alerts
- Post-processing for transcriptions
- Custom dictionary/replacements
- Long-form STT with chunking for extended recordings
Why This Project?
After a year of voice typing becoming my default input method, I needed an OS-level dictation solution that:
- Works with cloud STT APIs (not local inference)
- Types anywhere on Wayland (the critical technical hurdle)
- Supports quality real-time transcription with punctuation
- Provides cost visibility for API usage
This builds on Deepgram's voice-keyboard-linux starter which successfully solved the Wayland typing challenge.
Technical Details
- Language: Rust (core), Python (future GUI)
- STT Provider: Deepgram cloud API
- OS: Ubuntu 25.10 with KDE Plasma on Wayland
- Audio: PipeWire via CPAL
- Input: Virtual keyboard device via uinput
Credits
Based on Deepgram's voice-keyboard-linux starter project.