README.md

March 17, 2026 · View on GitHub

OwlWhisper icon

OwlWhisper

Local voice input for macOS — hold a hotkey, speak, release, text appears at your cursor.
Fully offline. No cloud APIs. Chinese-optimized with FireRedASR2.

Release Platform Architecture License

English | 中文


How It Works

  1. Hold your hotkey (default: Fn)
  2. Speak naturally
  3. Release — text is transcribed and pasted at your cursor

Everything runs locally on your Mac. No data leaves your device.

Features

  • Hold-to-speak — Global hotkey works in any app, fully configurable
  • Offline ASR — FireRedASR2 int8 via sherpa-onnx C API (2.89% CER on Chinese)
  • Auto punctuation — ct-transformer restores commas, periods, question marks
  • Auto paste — Transcribed text is pasted at cursor via simulated Cmd+V
  • VAD — Silero VAD trims silence, only speech is sent to ASR
  • Floating indicator — Waveform animation while recording, dots while transcribing
  • Auto model download — First launch downloads ~1.1GB of models automatically with resume support
  • i18n — English (default) + Chinese
  • Update checker — Checks GitHub Releases for new versions

Requirements

  • macOS 13.0+
  • Apple Silicon (arm64)

Install

  1. Download OwlWhisper-0.1.0.dmg from Releases
  2. Open DMG, drag OwlWhisper to /Applications
  3. Launch — models download automatically on first run (~1.1GB)

From Source

git clone https://github.com/sanvibyfish/OwlWhisper.git
cd OwlWhisper
scripts/setup.sh          # downloads native libs + models
open OwlWhisper/OwlWhisper.xcodeproj
# Build & Run (⌘R)

Permissions

On first launch, OwlWhisper will guide you through granting the required permissions:

PermissionPurpose
MicrophoneRecord your speech
AccessibilitySimulate Cmd+V paste and listen for global hotkeys

Architecture

OwlWhisper.app
├── ASRService.swift          # sherpa-onnx C API: VAD + ASR + punctuation
├── AudioRecorder.swift       # AVAudioEngine 16kHz mono capture
├── HotkeyManager.swift       # CGEventTap global hotkey detection
├── FloatingIndicator.swift   # Waveform + dots animation overlay
├── SettingsWindowController  # Auto Layout settings & onboarding
├── ModelDownloader.swift     # Chunked parallel download with resume
├── UpdateChecker.swift       # GitHub Release version check
├── MenubarController.swift   # Status bar icon + menu
├── PasteController.swift     # CGEvent Cmd+V simulation
├── AppDelegate.swift         # App lifecycle & orchestration
└── Frameworks/
    ├── libsherpa-onnx-c-api.dylib
    └── libonnxruntime.1.23.2.dylib

Models

Downloaded to ~/Library/Application Support/OwlWhisper/models/ on first launch:

ModelSizePurpose
FireRedASR2 int8~300MBChinese/English speech recognition
ct-transformer~270MBPunctuation restoration
Silero VAD~2MBVoice activity detection

Tech Stack

  • Swift 5.9, AppKit (no SwiftUI)
  • sherpa-onnx C API via bridging header
  • onnxruntime 1.23.2

License

GPL-3.0