Changelog

July 12, 2025 · View on GitHub

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[1.0.0] - 2025-07-04

LiveTalk-Unity Package: Complete Unity package for real-time talking head generation
Dual-Pipeline System: LivePortrait for facial animation + MuseTalk for lip synchronization
Character Creation System: Create, save, and load characters with multiple expressions and voices
Advanced Character Management: Support for both folder and macOS bundle formats
Complete API: LiveTalkAPI singleton and Character class with full documentation
Expression Support: 7 built-in expressions (talk-neutral, approve, disapprove, smile, sad, surprised, confused)
Integrated TTS: Built-in SparkTTS integration for voice generation
Model Download Links: Pre-exported ONNX models for both LiveTalk and SparkTTS
Cross-Platform Support: macOS (CPU/CoreML), Windows (Not tested)
Performance Optimizations: CoreML optimization for efficient on-device inference

Real-time Animation: Generate talking head videos from avatar images and audio
Character Persistence: Save and load character data with precomputed expressions
Voice Synthesis: Create character voices with configurable pitch, speed, and gender
Frame Streaming: Efficient frame-by-frame processing with coroutine support
Multiple Input Formats: Support for images, videos, and directory-based driving frames
Bundle Support: macOS package format for seamless character distribution
Memory Management: Optimized memory usage with unsafe code and parallel processing

Overall Performance: 10-11 FPS for speech with lip sync on Mac M4 Max
Character Creation: 10 minutes per character on Mac M4 Max
LivePortrait Pipeline: 4 FPS
- motion_extractor (FP32): 30-60ms
- warping_spade (FP16): 180-250ms
- landmark_runner (FP32): 2-3ms
MuseTalk Pipeline: 11 FPS
- vae_encoder (FP16): 20-30ms
- unet (FP16): 30-40ms
- vae_decoder (FP16): 40-50ms