PT - Audio Transcription Tool
January 7, 2026

An audio transcription tool built on Next.js and the OpenAI Whisper API. It transcribes audio files and generates AI summaries of the transcripts.
Features
- Support both file upload and URL input
- Support for Xiaoyuzhou podcast transcription
- High-quality audio transcription using OpenAI Whisper API
- AI-powered content summarization
- Modern UI design
- Download transcripts and summaries
- Built-in audio player
- CLI tool support (pt command)
- SRT subtitle format output
- Chunked processing for large audio files
- Parallel transcription for better performance
- Multiple output formats (text, JSON, markdown, SRT)
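The chunked and parallel processing mentioned in the feature list can be pictured roughly as follows. This is only a sketch under stated assumptions, not the tool's actual implementation: the names `Chunk`, `planChunks`, and `mapWithLimit` are hypothetical, and the fixed chunk length and concurrency limit are illustrative values rather than settings taken from pt.

```typescript
// Hypothetical types and helpers; none of these names come from the pt source.
interface Chunk {
  index: number;
  startSec: number;
  endSec: number;
}

// Split a total duration into fixed-length chunks; the last one may be shorter.
function planChunks(totalSec: number, chunkSec: number): Chunk[] {
  if (chunkSec <= 0) throw new Error("chunkSec must be positive");
  const chunks: Chunk[] = [];
  for (let start = 0, index = 0; start < totalSec; start += chunkSec, index++) {
    chunks.push({ index, startSec: start, endSec: Math.min(start + chunkSec, totalSec) });
  }
  return chunks;
}

// Run `worker` over all items with at most `limit` calls in flight,
// returning the results in input order.
async function mapWithLimit<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0;
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      results[i] = await worker(items[i]);
    }
  }
  const runnerCount = Math.max(1, Math.min(limit, items.length));
  const runners: Promise<void>[] = [];
  for (let r = 0; r < runnerCount; r++) runners.push(runner());
  await Promise.all(runners);
  return results;
}
```

In a real pipeline the `worker` would send one chunk to the transcription API, and the ordered results would be joined into the final transcript. Keeping results in input order matters because chunks may finish out of order.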
CLI Installation
Install via npm
npm install -g @winterfx/pt
Configure API Key
Choose one of the following methods:
Option 1: Environment Variable (Recommended)
# Add to ~/.zshrc or ~/.bashrc
export API_KEY="your-api-key"
export BASE_URL="https://api.openai.com/v1" # optional
Option 2: Config File (~/.pt/.env)
mkdir -p ~/.pt
cat > ~/.pt/.env << 'EOF'
API_KEY=your-api-key
BASE_URL=https://api.openai.com/v1
EOF
Option 3: Current Directory (.env)
cat > .env << 'EOF'
API_KEY=your-api-key
BASE_URL=https://api.openai.com/v1
EOF
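To make the three options concrete, here is a minimal sketch of how settings from several sources might be merged. The lookup order (earlier sources win) and the default `BASE_URL` are assumptions for illustration; this README does not state which source takes precedence, and `parseEnvFile`, `resolveConfig`, and `PtConfig` are hypothetical names.

```typescript
// Hypothetical helpers; the precedence and default below are assumptions.
type Env = Record<string, string | undefined>;

// Parse simple KEY=value lines, skipping blank lines and # comments.
function parseEnvFile(text: string): Env {
  const out: Env = {};
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value.charAt(0);
    const last = value.charAt(value.length - 1);
    if (value.length >= 2 && (first === '"' || first === "'") && last === first) {
      value = value.slice(1, -1);
    }
    out[key] = value;
  }
  return out;
}

interface PtConfig {
  apiKey: string;
  baseUrl: string;
}

// Merge sources so that earlier entries win (assumed order, not confirmed).
function resolveConfig(sources: Env[]): PtConfig {
  const pick = (name: string): string | undefined => {
    for (const source of sources) {
      const value = source[name];
      if (value) return value;
    }
    return undefined;
  };
  const apiKey = pick("API_KEY");
  if (!apiKey) throw new Error("API_KEY is not set in any config source");
  // Assumed default, matching the example URL shown in the options above.
  const baseUrl = pick("BASE_URL") || "https://api.openai.com/v1";
  return { apiKey, baseUrl };
}
```

A caller would pass the shell environment first, then the parsed contents of `.env` and `~/.pt/.env`, for example `resolveConfig([shellEnv, parseEnvFile(localText), parseEnvFile(homeText)])`, where the three arguments are placeholders for values read elsewhere.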
Web App Development
Prerequisites
- Node.js 18+
- OpenAI API Key
- FFmpeg (required for audio processing)
Installing FFmpeg
# macOS
brew install ffmpeg
# Linux (Ubuntu/Debian)
sudo apt-get install ffmpeg
# Windows
choco install ffmpeg
Installation
- Clone the repository:
git clone https://github.com/yourusername/podcast-transcription.git
cd podcast-transcription
- Install dependencies:
npm install
# or
yarn install
# or
pnpm install
- Configure environment variables:
Create a .env.local file and add:
API_KEY=your_openai_api_key
BASE_URL=your_endpoint
- Start the development server:
npm run dev
# or
yarn dev
# or
pnpm dev
Visit http://localhost:3000 to view the app.
Docker Deployment
- Build the Docker image:
docker build -t podcast-transcription .
- Run the container:
docker run -p 3000:3000 podcast-transcription
CLI Tool
The project includes a command-line tool pt for transcribing audio files directly from the terminal.
CLI Usage
pt <input> [options]
Arguments:
- <input> - Local file path or audio URL
Options:
- -s, --summary - Generate AI summary after transcription
- -l, --language <lang> - Language code: auto, en, zh, etc. (default: auto)
- -o, --output <file> - Output file path (default: stdout)
- --output-format <format> - Output format: text, json, markdown, srt (default: text)
- -q, --quiet - Suppress progress output
CLI Examples
# Transcribe a local audio file
pt /path/to/podcast.mp3
# Transcribe with AI summary
pt podcast.mp3 --summary
# Generate SRT subtitles
pt podcast.mp3 --output-format srt -o subtitles.srt
# JSON output with summary
pt podcast.mp3 --summary --output-format json -o result.json
# Transcribe from URL
pt https://example.com/audio.mp3 --summary
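For readers unfamiliar with the srt output format used in the examples above, the sketch below shows how timed segments map onto SubRip cues: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line between cues. The `Segment` shape and the function names are hypothetical and chosen for illustration; the README does not describe how pt represents segments internally.

```typescript
// Hypothetical segment shape; not taken from the pt source.
interface Segment {
  startSec: number;
  endSec: number;
  text: string;
}

// Left-pad a non-negative integer with zeros to at least `width` digits.
function zeroPad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm (comma before milliseconds).
function toSrtTimestamp(seconds: number): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${zeroPad(h, 2)}:${zeroPad(m, 2)}:${zeroPad(s, 2)},${zeroPad(ms, 3)}`;
}

// Render segments as numbered SRT cues separated by blank lines.
function toSrt(segments: Segment[]): string {
  return segments
    .map((seg, i) => {
      const range = `${toSrtTimestamp(seg.startSec)} --> ${toSrtTimestamp(seg.endSec)}`;
      return `${i + 1}\n${range}\n${seg.text.trim()}\n`;
    })
    .join("\n");
}
```

Rounding to whole milliseconds before splitting into hours, minutes, and seconds avoids timestamps such as `00:00:59,1000` that can appear when each field is rounded separately.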
Running the CLI
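# Via npm script (the -- passes the remaining arguments through to pt)
npm run pt -- <input> [options]
# Or link the local package so the pt command is available on your PATH
npm link
pt <input> [options]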
Contributing
Pull Requests and Issues are welcome!
License
MIT License - See LICENSE file for details.