May 20, 2026

Live Translator

Real-time system audio translation for macOS
Translate any audio playing on your Mac — YouTube, podcasts, meetings, movies — live on screen.

Why?

You're watching a YouTube video in Japanese. A podcast in Arabic. A meeting in German. You don't speak these languages — but you want to understand everything, right now.

Live Translator sits quietly in your menu bar, captures whatever audio your Mac is playing, and shows you a live, flowing translation on screen. No copy-pasting, no tab switching, no waiting. Just play audio and read.

It's not a word-by-word subtitler. It uses a live document model — the AI sees the full conversation, maintains context, and produces natural translations that read like a human wrote them.

System Audio       → Speech Recognition   → AI Translation → Live Overlay
(ScreenCaptureKit)   (SFSpeechRecognizer)   (OpenAI GPT)     (WebKit Panel)

Demo

  • 🇬🇧 English → 🇹🇷 Turkish
  • 🇯🇵 Japanese → 🇹🇷 Turkish
  • 🇸🇦 Arabic → 🇹🇷 Turkish
  • 🇸🇦 Arabic → multiple languages

Features

  • Real-time translation of any audio on your Mac
  • 11 source languages — English, German, French, Spanish, Italian, Japanese, Chinese, Korean, Russian, Arabic, Portuguese
  • 12 target languages — translate into any supported language
  • Context-aware AI — maintains full conversation context, never loses track
  • Live document model — translation grows like a live document, new parts highlighted
  • Text-to-Speech — hear translations read aloud (Piper offline, OpenAI or Gemini voices)
  • Multi-provider — choose OpenAI or Google Gemini; just add your API key
  • Latest AI models — GPT-5.5, GPT-5.4 mini/nano, Gemini 2.5 Flash, Gemini 3.5 Flash, and more
  • Floating overlay — dark-themed panel, draggable, always on top
  • In-app settings — provider, API keys, model, TTS, languages — all configurable from the UI
  • No audio drivers needed — uses ScreenCaptureKit (macOS 13+)
  • On-device STT (default) — speech recognition works without internet; audio never leaves your Mac
  • Optional realtime mode — lower latency via OpenAI's Realtime Translation model; audio is streamed to the provider
  • Auto-recovery — watchdog detects and recovers from stuck states
  • Menu bar app — runs quietly with 🌐 icon in menu bar

What's New in v0.1.0

  • Multi-provider architecture — pick OpenAI or Google Gemini for translation; switch any time from Settings.
  • Latest models — refreshed model list (OpenAI GPT-5.5 / 5.4 mini / 5.4 nano, Gemini 2.5 Flash / Flash-Lite, Gemini 3.5 Flash). The model dropdown updates to match the selected provider.
  • Optional realtime mode — stream audio to OpenAI's Realtime Translation model for lower latency, with a clear privacy warning. On-device STT stays the default.
  • Dynamic system-audio capture — audio sources opened after the app starts are now captured automatically.
  • Redesigned Settings — grouped sections, provider-aware fields, dual API keys, and clearer controls.
  • More robust — graceful fallback and clear messaging when a provider, key, or connection fails; "no audio detected" hint when the source is silent.

Install

Option 1: Download DMG
  1. Download → Open DMG → Drag to Applications
  2. Launch → Setup wizard guides you through everything
  3. Enter your OpenAI API key
  4. Grant Screen & System Audio Recording permission when prompted
  5. Play any audio → translations appear live

First launch: If macOS blocks the app, right-click → Open, or run xattr -cr /Applications/Live\ Translator.app

Option 2: Homebrew

brew tap umutcetinkaya/tap
brew install --cask live-translator

Option 3: From Source

git clone https://github.com/umutcetinkaya/live-translator.git
cd live-translator
make install    # Create venv + install deps
make models     # Download TTS voice models (~580MB)
make run        # Launch

Quick Start

  1. Launch → floating panel appears + 🌐 in menu bar
  2. Click ⚙ Settings → enter your OpenAI API key
  3. Select source language (what's being spoken) and target language (what you want to read)
  4. Play any audio → translations appear in real-time
  5. New translations are highlighted so you always know what just changed

Controls

| Control | Action |
| --- | --- |
| Source / Target dropdowns | Change languages |
| ⚙ Settings | API key, model, TTS provider, voice, speed |
| TTS Off / On | Toggle text-to-speech |
| Clear | Reset translation history |
| Quit | Quit the app |
| Menu bar 🌐 | Pause, Show/Hide, Quit |

Text-to-Speech

| Provider | Pros | Cons |
| --- | --- | --- |
| Piper (default) | Free, offline, fast | Robotic voice |
| OpenAI TTS | Natural voices (Nova, Shimmer, Alloy, Echo, Fable, Onyx) | Costs money |
| Gemini TTS | Natural voices via Gemini | Costs money |

TTS plays through the same process, so the app's own voice is automatically filtered from capture and is never re-transcribed. No feedback loops.

Supported Models

Pick a provider in Settings, paste its API key, and the model list updates automatically.

OpenAI

| Model | Speed | Cost | Best For |
| --- | --- | --- | --- |
| GPT-5.4 mini | ⚡ Fastest | ¢ | Daily use (default) |
| GPT-5.4 nano | ⚡ Fastest | ¢ | Budget |
| GPT-5.5 | 🚀 Fast | $$ | Latest + Quality |

Google Gemini

| Model | Speed | Cost | Best For |
| --- | --- | --- | --- |
| Gemini 2.5 Flash | ⚡ Fastest | ¢ | Daily use |
| Gemini 2.5 Flash-Lite | ⚡ Fastest | ¢ | Budget |
| Gemini 3.5 Flash | 🚀 Fast | $$ | Latest + Quality |

Recommendation: GPT-5.4 mini or GPT-5.4 nano for real-time translation (fast + cheap).

How It Works

Live Translator runs two independent agents in parallel:

  1. Listener — continuously captures system audio via ScreenCaptureKit and transcribes it on-device using SFSpeechRecognizer
  2. Translator — every ~3 seconds, takes the full accumulated transcript and sends it to your chosen provider (OpenAI or Gemini)
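
The two-agent layout above can be sketched with two threads sharing a transcript. This is a minimal illustration, not the code in pipeline.py: `SharedTranscript`, `listener`, `translator`, and `run_demo` are hypothetical names, the listener is fed from a plain list instead of ScreenCaptureKit and SFSpeechRecognizer, and the interval is shortened from ~3 seconds so the demo finishes quickly.

```python
import threading
from typing import Callable, Iterable, List


class SharedTranscript:
    """Thread-safe transcript shared by the two agents (hypothetical helper)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._parts: List[str] = []

    def append(self, text: str) -> None:
        with self._lock:
            self._parts.append(text)

    def snapshot(self) -> str:
        # Always returns the FULL transcript so far, not just the newest chunk.
        with self._lock:
            return " ".join(self._parts)


def listener(chunks: Iterable[str], transcript: SharedTranscript) -> None:
    # Stand-in for the real capture + on-device speech recognition.
    for chunk in chunks:
        transcript.append(chunk)


def translator(transcript: SharedTranscript, stop: threading.Event,
               interval: float, on_tick: Callable[[str], None]) -> None:
    # Event.wait returns False on timeout (keep ticking) and True once stop is set.
    while not stop.wait(interval):
        on_tick(transcript.snapshot())
    on_tick(transcript.snapshot())  # final pass so nothing is left untranslated


def run_demo(chunks: Iterable[str], interval: float = 0.01) -> List[str]:
    transcript = SharedTranscript()
    stop = threading.Event()
    seen: List[str] = []
    t_translate = threading.Thread(
        target=translator, args=(transcript, stop, interval, seen.append))
    t_listen = threading.Thread(target=listener, args=(chunks, transcript))
    t_translate.start()
    t_listen.start()
    t_listen.join()
    stop.set()
    t_translate.join()
    return seen
```

In a real run, `on_tick` would call the translation provider instead of collecting snapshots, and `interval` would be about 3 seconds.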

The AI is instructed to preserve its previous translation and only append or refine new content. The overlay highlights what changed, so you can always follow along.

This means:

  • Context is never lost
  • Incomplete sentences get refined next cycle
  • The translation reads as a coherent, flowing document
  • No disconnected fragments
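
One translator cycle of the live document model might look roughly like the sketch below. It is written under assumptions: the prompt wording is invented for illustration, `translate` is a hypothetical callable standing in for the OpenAI or Gemini request, and the "what changed" detection uses a simple common-prefix comparison, which may differ from how the overlay actually computes highlights.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LiveDocument:
    """Accumulated transcript plus the translation produced so far."""
    transcript: str = ""
    translation: str = ""

    def add_speech(self, text: str) -> None:
        # Listener side: append newly recognized text to the transcript.
        self.transcript = f"{self.transcript} {text}".strip()


def build_prompt(doc: LiveDocument, target_lang: str) -> str:
    """Ask the model to keep its earlier translation and extend it."""
    return (
        f"Translate the transcript into {target_lang}.\n"
        "Keep the previous translation where it is still correct; "
        "only append or refine new content.\n\n"
        f"Transcript:\n{doc.transcript}\n\n"
        f"Previous translation:\n{doc.translation}"
    )


def run_cycle(doc: LiveDocument, target_lang: str,
              translate: Callable[[str], str]) -> str:
    """Run one translator tick and return the part to highlight as new."""
    updated = translate(build_prompt(doc, target_lang))
    # The longest common prefix marks where old and new translations diverge.
    split = 0
    for old_ch, new_ch in zip(doc.translation, updated):
        if old_ch != new_ch:
            break
        split += 1
    doc.translation = updated
    return updated[split:]
```

If the model refines an earlier sentence instead of only appending, the returned text starts at the first changed character, so the refined portion is highlighted as well.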

Realtime mode (optional)

By default, speech recognition runs on-device: your audio never leaves the Mac, and only the transcribed text is sent to the translation provider. For lower latency you can enable Realtime mode in Settings (OpenAI provider), which streams audio directly to OpenAI's Realtime Translation model and returns the translation while the speaker is still talking. This trades privacy for speed, because the audio itself is sent to the provider. A warning is shown in Settings when you enable it. On-device mode remains the default.

Architecture

live-translator/
├── main.py                      # Entry point + menu bar
├── src/
│   ├── audio_capture.py         # ScreenCaptureKit system audio
│   ├── speech_recognizer.py     # SFSpeechRecognizer + watchdog + ring buffer
│   ├── translator.py            # OpenAI live document translation
│   ├── pipeline.py              # Listener + Translator orchestrator
│   ├── overlay.py               # WebKit floating panel (HTML/CSS/JS)
│   ├── tts.py                   # Piper (offline) / OpenAI TTS
│   └── config.py                # JSON settings
├── models/                      # Piper voice models (downloaded separately)
├── scripts/
│   ├── build_app.sh             # Build .app bundle
│   ├── build_dmg.sh             # Create DMG installer
│   ├── download_models.sh       # Download TTS models
│   ├── notarize.sh              # Apple notarization
│   ├── setup_wizard.swift       # Native macOS setup wizard
│   └── launcher.c               # Native .app launcher
├── assets/                      # App icon + demo media
├── Makefile
├── requirements.txt
├── CONTRIBUTING.md
├── SECURITY.md
└── LICENSE

Configuration

Settings are stored in ~/.live-translator.json and can be changed from the in-app Settings panel:

{
  "openai_api_key": "sk-...",
  "source_locale": "en-US",
  "target_lang": "tr",
  "model": "gpt-5.4-mini",
  "tts_provider": "piper",
  "tts_voice": "nova",
  "tts_speed": 1.0
}
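
For scripting against the settings file, a small loader along these lines could be used. It is a sketch only: `load_settings`, `save_settings`, and `DEFAULTS` are hypothetical names, the default values are simply copied from the example above, and the app's own config.py may handle missing or invalid files differently.

```python
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".live-translator.json"

# Assumed defaults, taken from the example settings file above.
DEFAULTS = {
    "openai_api_key": "",
    "source_locale": "en-US",
    "target_lang": "tr",
    "model": "gpt-5.4-mini",
    "tts_provider": "piper",
    "tts_voice": "nova",
    "tts_speed": 1.0,
}


def load_settings(path: Path = CONFIG_PATH) -> dict:
    """Return stored settings merged over the defaults."""
    settings = dict(DEFAULTS)
    try:
        stored = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        # Missing or corrupt file: fall back to defaults instead of crashing.
        return settings
    if isinstance(stored, dict):
        settings.update(stored)
    return settings


def save_settings(settings: dict, path: Path = CONFIG_PATH) -> None:
    """Write settings back as pretty-printed JSON."""
    Path(path).write_text(json.dumps(settings, indent=2), encoding="utf-8")
```

Merging over defaults means a file that only sets `"target_lang"` still yields a complete settings dictionary.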

Troubleshooting

| Problem | Solution |
| --- | --- |
| "Damaged and can't be opened" | Run xattr -cr /Applications/Live\ Translator.app |
| No audio detected | Grant Screen & System Audio Recording permission, restart app |
| STT stops working | Built-in watchdog auto-recovers within 10 seconds |
| Translation not appearing | Check your provider's API key in Settings (OpenAI or Gemini) |
| TTS not working | Check TTS provider in Settings |
| App not in dock | By design: it's a menu bar app (🌐) |
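
The watchdog recovery mentioned in the table can be pictured as a heartbeat timer. The sketch below is an assumption about the general technique, not the implementation in speech_recognizer.py: the `Watchdog` class, its method names, and the use of a 10-second timeout as the stall threshold are all illustrative.

```python
import time
from typing import Callable


class Watchdog:
    """Restart a recognizer that has shown no activity for `timeout` seconds."""

    def __init__(self, restart: Callable[[], None], timeout: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._restart = restart
        self._timeout = timeout
        self._clock = clock
        self._last_activity = clock()

    def beat(self) -> None:
        # Call whenever the recognizer delivers a result or receives audio.
        self._last_activity = self._clock()

    def check(self) -> bool:
        """Return True if a restart was triggered on this check."""
        if self._clock() - self._last_activity < self._timeout:
            return False
        self._restart()
        # Give the fresh recognizer session a full window before re-checking.
        self._last_activity = self._clock()
        return True
```

`check()` would be called periodically (for example once per second from a timer), and the injectable `clock` makes the stall logic testable without real waiting.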

Requirements

  • macOS 13 (Ventura) or later
  • Python 3.11+ (auto-installed via setup wizard if missing)
  • OpenAI API key

If Live Translator helps you, give it a ⭐ on GitHub — it helps others find it!

Support the Project

Buy Me a Coffee

License

MIT — see LICENSE.

Copyright © 2025 Umut Çetinkaya

Credits