FreeSpeak
April 10, 2025
A simple voice dictation application for Linux that captures audio, transcribes it, corrects grammar, and types the result into the currently active window.
I built this as a simple alternative to Talon, which does not support Wayland.
Features
- DBus Control: Exposes a DBus interface (org.voice.Dictation) for potential external control (e.g., toggling recording). Includes a shell script to toggle recording that can be bound to a hotkey.
- Voice Activity Detection (VAD): Only processes audio when speech is detected.
- Grammar Correction: Leverages a local LanguageTool server to improve punctuation and grammar of the transcribed text.
- System-Wide Typing: Uses ydotool to simulate keyboard input, allowing dictation into any application.
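As a rough illustration of the grammar correction step, the sketch below applies suggested replacements to a transcribed string. It is not taken from this project's source: the function name apply_corrections is hypothetical, and the input is assumed to follow the shape of the matches array in LanguageTool's HTTP API response (offset, length, replacements), with offsets treated as plain character indices. The sample data is hand-written, not real server output.

```python
from typing import Any


def apply_corrections(text: str, matches: list[dict[str, Any]]) -> str:
    """Apply the first suggested replacement of each match to `text`.

    `matches` is assumed to look like the `matches` array of a LanguageTool
    check response: each entry has `offset`, `length` and `replacements`.
    """
    # Work from the end of the string so earlier offsets stay valid.
    for match in sorted(matches, key=lambda m: m["offset"], reverse=True):
        replacements = match.get("replacements") or []
        if not replacements:
            continue  # nothing suggested, leave this span untouched
        start = match["offset"]
        end = start + match["length"]
        text = text[:start] + replacements[0]["value"] + text[end:]
    return text


# Hand-written sample in the assumed response shape (not real server output).
sample_matches = [
    {"offset": 0, "length": 4, "replacements": [{"value": "This"}]},
    {"offset": 8, "length": 1, "replacements": [{"value": "an"}]},
]
corrected = apply_corrections("this is a example", sample_matches)
print(corrected)
```

Always taking the first suggestion keeps the sketch short; a real dictation pipeline may want to skip some rule categories or leave ambiguous matches alone.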
Installation
- Install Python dependencies:

      uv sync
      uv run src/main.py

- Install System Dependencies:

      # Fedora
      sudo dnf install ydotool

- Configure ydotool:
To simulate typing, the program needs access to your /dev/uinput device. By default, this requires root privileges every time you run ydotool, so you'd have to enter your password every time you run this application.
To avoid that, you can give the program permanent access to the input device by adding your username to the input user group on your system and giving the group write access to the uinput device.
To do that, we use a udev rule. Udev is the Linux subsystem that detects and reacts to devices being plugged in or unplugged. It also manages virtual devices such as the uinput device that ydotool writes to.
To add the current $USER to the input group, you can use gpasswd (sudo usermod -aG input $USER is an equivalent alternative):

    # Set permissions (might be needed depending on your setup)
    # sudo gpasswd -a $USER input  # Add user to 'input' group, then log out/in

    # Start and enable the user service
    systemctl --user enable ydotoold.service
    systemctl --user start ydotoold.service

    # Verify it's running
    systemctl --user status ydotoold.service

You then need to define a new udev rule that gives the input group permanent write access to the uinput device (and therefore gives ydotool write access too).
Solution by https://github.com/ReimuNotMoe/ydotool/issues/25#issuecomment-535842993
    echo '## Give ydotoold access to the uinput device
    KERNEL=="uinput", GROUP="input", MODE="0660", OPTIONS+="static_node=uinput"
    ' | sudo tee /etc/udev/rules.d/80-uinput.rules > /dev/null

You will need to restart your computer for the change to take effect.
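After rebooting, one way to confirm that the group membership and udev rule took effect is a small check like the one below. This is a minimal sketch, not part of the project: the helper names can_write and in_group are hypothetical, and the defaults assume the /dev/uinput device and input group described above.

```python
import grp
import os


def can_write(device_path: str = "/dev/uinput") -> bool:
    """Return True if the current process may write to `device_path`."""
    return os.path.exists(device_path) and os.access(device_path, os.W_OK)


def in_group(group_name: str = "input") -> bool:
    """Return True if the current process belongs to `group_name`."""
    try:
        gid = grp.getgrnam(group_name).gr_gid
    except KeyError:
        return False  # the group does not exist on this system
    # Check both the supplementary groups and the effective group id.
    return gid in os.getgroups() or gid == os.getegid()


if __name__ == "__main__":
    print("member of input group:", in_group())
    print("can write /dev/uinput:", can_write())
```

If the first check passes but the second fails, the udev rule has probably not been applied yet. If the first check fails, remember that group changes only take effect after logging out and back in.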
Finally, ydotool works with a daemon that you leave running in the background, ydotoold, for performance reasons. You need to run ydotoold before you start using ydotool.
    systemctl --user enable ydotoold.service
    systemctl --user start ydotoold.service

- Set up LanguageTool: A docker-compose.yml file is provided to start a LanguageTool server.

      docker-compose up -d

N-gram datasets
You will want to add the ngram dataset for your language to improve the grammar correction.
LanguageTool can make use of large n-gram data sets to detect errors with words that are often confused, like their and there.
Source: https://dev.languagetool.org/finding-errors-using-n-gram-data
Download the ngram dataset for your language and put it in the languagetool/ngrams directory.

    languagetool/
    ├─ ngrams/
    │  ├─ en/
    │  │  ├─ 1grams/
    │  │  ├─ 2grams/
    │  │  ├─ 3grams/
    │  ├─ es/
    │  │  ├─ 1grams/
    │  │  ├─ 2grams/
    │  │  ├─ 3grams/

Improving the spell checker
You can improve the spell checker without touching the dictionary. For single words (no spaces), you can add your words to one of these files:
- spelling.txt: words that the spell checker will ignore and use to generate corrections if someone types a similar word
- ignore.txt: words that the spell checker will ignore but not use to generate corrections
- prohibited.txt: words that should be considered incorrect even though the spell checker would accept them
Source: https://dev.languagetool.org/hunspell-support
These files are in the languagetool/ directory.
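Adding words by hand works fine, but if you collect custom vocabulary often, a small helper can keep these files tidy. The sketch below is an assumption about how one might do it, not something shipped with the project: add_words is a hypothetical name, and it only enforces the single-word rule mentioned above and skips entries that are already present.

```python
import tempfile
from pathlib import Path


def add_words(word_file: Path, words: list[str]) -> list[str]:
    """Append new single-word entries to a word list file.

    Returns the words that were actually added.
    """
    content = word_file.read_text(encoding="utf-8") if word_file.exists() else ""
    existing = {line.strip() for line in content.splitlines()}

    added = []
    for word in words:
        word = word.strip()
        # These files only take single words, so skip anything with spaces.
        if not word or " " in word or word in existing:
            continue
        added.append(word)
        existing.add(word)

    if added:
        with word_file.open("a", encoding="utf-8") as handle:
            # Make sure the first new word starts on its own line.
            if content and not content.endswith("\n"):
                handle.write("\n")
            for word in added:
                handle.write(word + "\n")
    return added


if __name__ == "__main__":
    # Demo against a throwaway file; in practice, point this at one of the
    # word list files in the languagetool/ directory.
    with tempfile.TemporaryDirectory() as tmp:
        demo_file = Path(tmp) / "spelling.txt"
        print(add_words(demo_file, ["FreeSpeak", "ydotool", "two words"]))
```

Whether the LanguageTool server needs a restart to pick up changes depends on how the files are mounted into the container, so check your docker-compose.yml.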