December 9, 2025

🔥 Awesome Personalized Video Creation

If you like our project, please give us a star ⭐ on GitHub to get the latest updates.

This repository is dedicated to collecting, organizing, and tracking recent advancements in personalized video generation and editing. It serves as a centralized resource for papers, models, and benchmarks in this rapidly evolving field.

Table of Contents

📣 Update News
⚡ Contributing
📚 Preliminaries
🌐 Open-Domain Personalized Video Generation Models
🧑 Human-Domain Personalized Video Generation Models
💼 Commercial Personalized Video Generation Models
📈 Datasets and Benchmarks
👍 Acknowledgement

📣 Update News

[2024-07-18] We have initiated the repository.

⚡ Contributing

If you want to add your work to this list, please email jhuang90@ur.rochester.edu or open a pull request. Use the following Markdown format:

* | [**Paper Title**] | Venue | Date | [[paper]](link) [[code]](link) [[project]](link)|
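
A row in this format can also be generated programmatically. The sketch below is one hypothetical helper (`format_entry` is not part of this repository). It assumes the square brackets around the title in the template are placeholder markers, and that a missing code or project link is simply omitted.

```python
from typing import Optional


def format_entry(
    title: str,
    venue: str,
    date: str,
    paper: str,
    code: Optional[str] = None,
    project: Optional[str] = None,
) -> str:
    """Build one table row following the contribution template (hypothetical helper)."""
    # Escape pipes so a title cannot break the table layout.
    safe_title = title.replace("|", "\\|")
    # The paper link is always present; code and project links are optional.
    links = [f"[[paper]]({paper})"]
    if code:
        links.append(f"[[code]]({code})")
    if project:
        links.append(f"[[project]]({project})")
    return f"| **{safe_title}** | {venue} | {date} | {' '.join(links)} |"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(format_entry("Example Paper", "arXiv", "Jan 1 2025", "https://example.com/paper"))
```

The URL and title in the usage line are placeholders; replace them with the real paper details before opening a pull request.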

📚 Preliminaries

📽️ Video Generation Foundation Models

🌀 Diffusion Transformer

🌀 U-Net

🌀 Autoregressive

🕳️ Control Paradigms in Video Generation

📌 Structure-aware Control Modules

ControlNet
T2I-Adapter
AnyI2V: Animating Any Conditional Image with Motion Control, arXiv 2025, Paper

📌 Parameter-efficient Adaptation

📌 Localized Editing

Inpainting

🌐 Open-Domain Personalized Video Generation Models

🎨 Subject-Driven Video Generation Models

Test-time Fine-tuning

Title	Venue	Date	Links
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models	CVPR 2024	Dec 2023 (arXiv)	Paper – Project - Code
VideoBooth: Diffusion-based Video Generation with Image Prompts	CVPR 2024	Dec 2023 (arXiv)	Paper – Project – Code
CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects	arXiv	Jan 18 2024	Paper – Project
DisenStudio: Customized Multi-subject Text-to-Video Generation with Disentangled Spatial Control	ACMMM 2024	May 21 2024	Paper – Project - Code
Still-Moving: Customized Video Generation without Customized Video Data	TOG	Jul 11 2024	Paper – Project
CustomCrafter: Customized Video Generation with Preserving Motion and Concept Composition Abilities	AAAI 2025	Feb 2025	Paper – Code
Dynamic Concepts Personalization from Single Videos	SIGGRAPH 2025	Feb 20 2025	Paper – Page
BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation	arXiv	May 11 2025	Paper

Pretrained Adaptation

Title	Venue	Date	Links
Movie Gen: A Cast of Media Foundation Models	arXiv	Oct 17 2024	Paper – Project
SUGAR: Subject-Driven Video Customization in a Zero-Shot Manner	arXiv	Dec 13 2024	Paper – Project
VideoMaker: Zero-shot Customized Video Generation with the Inherent Force of Video Diffusion Models	arXiv	Dec 27 2024	Paper – Code
Multi-subject Open-set Personalization in Video Generation	CVPR 2025	Jan 2025 (arXiv)	Paper – Project – Code
ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning	arXiv	Jan 2025	Paper
AnyCharV: Bootstrap Controllable Character Video Generation with Fine-to-Coarse Guidance	arXiv	Feb 2025	Paper – Code
Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts	CVPR 2025	Feb 2025	Paper
Phantom: Subject-Consistent Video Generation via Cross-Modal Alignment	ICCV 2025	Feb 16 2025	Paper – Project – Code
SkyReels-A2: Compose Anything in Video Diffusion Transformers	arXiv	Apr 3 2025	Paper – Project – Code
CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance	arXiv	Mar 13 2025	Paper
MAGREF: Masked Guidance for Any-Reference Video Generation	arXiv	May 29 2025	Paper – Code
Tora2: Motion and Appearance Customized Diffusion Transformer for Multi-Entity Video Generation	arXiv	Jul 08 2025	Paper
BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration	arXiv	Oct 1 2025	Paper – Page
Kaleido: Open-Sourced Multi-Subject Reference Video Generation Model	arXiv	Oct 21 2025	Paper – Code
First Frame Is the Place to Go for Video Content Customization	arXiv	Nov 19 2025	Paper – Code

🎥 Motion-Driven Video Generation Models

Title	Venue	Date	Links
Structure and Content-Guided Video Synthesis with Diffusion Models	ICCV 2023	Feb 2023	Paper
VideoComposer: Compositional Video Synthesis with Motion Controllability	NeurIPS 2023	Jun 2023 (arXiv)	Paper – Project - Code
DreamVideo: Composing Your Dream Videos with Customized Subject and Motion	CVPR 2024	Dec 2023 (arXiv)	Paper – Project - Code
Customize-A-Video: One-Shot Motion Customization of Text-to-Video Diffusion Models	ECCV 2024	Feb 2024	Paper - Project - Code
MotionBooth: Motion-Aware Customized Text-to-Video Generation	NeurIPS 2024 (Spotlight)	Jun 2024	Paper - Project - Code
DreamVideo-2: Zero-Shot Subject-Driven Video Customization with Precise Motion Control	arXiv	Oct 17 2024	Paper – Page
MoTrans: Customized Motion Transfer with Text-driven Video Diffusion Models	ACMMM 2024	Dec 2 2024	Paper – Code
Subject-driven Video Generation via Disentangled Identity and Motion	arXiv	Apr 23 2025	Paper – Code
DualReal: Adaptive Joint Training for Lossless Identity-Motion Fusion in Video Customization	arXiv	Mar 4 2025	Paper – Project
VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models	CVPR 2025	Mar 13 2025	Paper – Project
DreamRunner: Fine-Grained Compositional Story-to-Video Generation with Retrieval-Augmented Motion Adaptation	arXiv	Mar 18 2025	Paper - Project - Code
JointTuner: Appearance-Motion Adaptive Joint Training for Customized Video Generation	arXiv	Mar 31 2025	Paper – Project
PolyVivid: Vivid Multi-Subject Video Generation with Cross-Modal Interaction and Enhancement	arXiv	Jun 9 2025	Paper
CoMo: Compositional Motion Customization for Text-to-Video Generation	arXiv	Oct 27 2025	Paper - Page
MotionStream: Real-Time Video Generation with Interactive Motion Controls	arXiv	Nov 03 2025	Paper - Page - Code (https://github.com/alex4727/motionstream)
MultiMotion: Multi Subject Video Motion Transfer via Video Diffusion Transformer	arXiv	Dec 08 2025	Paper

✂️ Personalized Video Editing Models

Title	Venue	Date	Links
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation	ICCV 2023	Dec 22 2022	Paper – Code
Dreamix: Video Diffusion Models are General Video Editors	arXiv	Feb 2023	Paper – Project
Make-A-Protagonist: Generic Video Editing with Visual and Textual Clues	arXiv	May 15 2023	Paper – Code
Towards Consistent Video Editing with Text-to-Image Diffusion Models	NeurIPS 2023	May 27 2023	Paper
Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance	TVCG 2024	Jun 2023	Paper – Code
MagicEdit: High-Fidelity and Temporally Coherent Video Editing	arXiv	Aug 28 2023	Paper – Code - Page
Cut-and-Paste: Subject-Driven Video Editing with Attention Control	arXiv	Nov 20 2023	Paper – Code
DragVideo: Interactive Drag-style Video Editing	ECCV 2024	Dec 3 2023	Paper - Code
AnyV2V: A Tuning-Free Framework For Any Video-to-Video Editing Tasks	TMLR 2024	Mar 21 2024	Paper – Project – Code
ReVideo: Remake a Video with Motion and Content Control	NeurIPS 2024	May 22 2024	Paper - Project - Code
DIVE: Taming DINO for Subject-Driven Video Editing	arXiv	Dec 4 2024	Paper – Project
DreamInsert: Zero-Shot Image-to-Video Object Insertion from A Single Image	arXiv	Mar 13 2025	Paper
Get In Video: Add Anything You Want to the Video	arXiv	May 2025	Project – Paper
Pix2Video: Video Editing using Image Diffusion	ICCV 2023	Mar 22 2023	Project – Paper
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control	arXiv	Mar 28 2025	Project – Paper
Lucy Edit: Open-Weight Text-Guided Video Editing	arXiv	Sep 18 2025	Paper - Github
OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models	arXiv	Sep 22 2025	Paper - Project - Code
ContextFlow: Training-Free Video Object Editing via Adaptive Context Enrichment	arXiv	Sep 22 2025	Paper - Project - Code
EditVerse: Unifying Image and Video Editing and Generation with In-Context Learning	arXiv	Sep 24 2025	Paper
IMAGEdit: Let Any Subject Transform	arXiv	Oct 01 2025	Paper - Project - Code
InstructX: Towards Unified Visual Editing with MLLM Guidance	arXiv	Oct 10 2025	Paper
In-Context Learning with Unpaired Clips for Instruction-based Video Editing	arXiv	Oct 16 2025	Paper - Code

🔥 Look-Driven Video Generation Models

Look: the unified visual baseline of a piece, covering style, color, lighting, texture/grade, and any VFX choices, that gives it a consistent on-screen feel.

Title	Venue	Date	Links
VFXMaster: Unlocking Dynamic Visual Effect Generation via In-Context Learning	arXiv	Oct 29 2025	Paper – Project – Code
Video-As-Prompt: Unified Semantic Control for Video Generation	arXiv	Oct 28 2025	Paper – Project – Code
Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation	arXiv	Aug 11 2025	Paper – Project – Code
VFX Creator: Animated Visual Effect Generation with Controllable Diffusion Transformer	arXiv	Feb 09 2025	Paper – Project
StyleMaster: Stylize Your Video with Artistic Generation and Translation	CVPR 2025	Dec 10 2024	Paper – Project – Code

🧑 Human-Domain Personalized Video Generation Models

🎨 Identity-Driven Video Generation Models

Test-time Fine-tuning

Title	Venue	Date	Links
Magic-Me: Identity-Specific Video Customized Diffusion	arXiv	Mar 20 2024	Paper – Project – Code
ID-Animator: Zero-Shot Identity-Preserving Human Video Generation	arXiv	Apr 23 2024	Paper – Project – Code
PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation	ICCV 2025	Mar 16 2025	Paper – Project – Code
MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization	arXiv	Mar 16 2025	Paper – Project – Code

Pretrained Adaptation

Title	Venue	Date	Links
ConsisID: Identity-Preserving Text-to-Video Generation by Frequency Decomposition	CVPR 2025	Nov 26 2024	Paper – Code
AnchorCrafter: Animate CyberAnchors Saling Your Products via Human-Object Interacting Video Generation	arXiv	Nov 26 2024	Paper – Code
Ingredients: Blending Custom Photos with Video Diffusion Transformers	arXiv	Jan 3 2025	Paper – Code
Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers	ICCV 2025	Jan 7 2025	Paper – Code
EchoVideo: Identity-Preserving Human Video Generation by Multimodal Feature Fusion	arXiv	Jan 23 2025	Paper – Code
SkyReels-A1: Expressive Portrait Animation in Video Diffusion Transformers	arXiv	Feb 15 2025	Paper – Page - Code
Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts	CVPR 2025	Feb 4 2025	Paper – Page
FantasyID: Face Knowledge Enhanced ID-Preserving Video Generation	arXiv	Feb 25 2025	Paper – Project – Code
Concat-ID: Towards Universal Identity-Preserving Video Synthesis	arXiv	Mar 18 2025	Paper – Code
Proteus-ID: ID-Consistent and Motion-Coherent Video Customization	arXiv	Jun 30 2025	Paper – Project
From Large Angles to Consistent Faces: Identity-Preserving Video Generation via Mixture of Facial Experts	arXiv	Aug 13 2025	Paper - Code
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning	arXiv	Sep 10 2025	Paper - Code - Page
Lynx: Towards High-Fidelity Personalized Video Generation	arXiv	Sep 19 2025	Paper - Project
Stand-In: A Lightweight and Plug-and-Play Identity Control for Video Generation	arXiv	Aug 12 2025	Paper - Page - Code
Identity-GRPO: Optimizing Multi-Human Identity-preserving Video Generation via Reinforcement Learning	arXiv	Oct 17 2025	Paper - Page - Code
ID-Composer: Multi-Subject Video Synthesis with Hierarchical Identity Preservation	arXiv	Nov 1 2025	Paper
ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation	arXiv	Dec 8 2025	Paper - Github

Training-free

Title	Venue	Date	Links
BachVid: Training-Free Video Generation with Consistent Background and Character	arXiv	Oct 24 2025	Paper – Code
Scaling Zero-Shot Reference-to-Video Generation	arXiv	Dec 7 2025	Paper - Code - Project

🎺 Audio-Driven Portrait Animation

Title	Venue	Date	Links
EMO: Emote Portrait Alive -- Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions	ECCV 2024	Feb 27 2024	Paper – Code – Page
EMO2: End-Effector Guided Audio-Driven Avatar Video Generation	arXiv	Jan 18 2025	Paper
FantasyTalking: Realistic Talking Portrait Generation via Coherent Motion Synthesis	ACMMM 2025	Apr 07 2025	Paper - Project - Code
Let Them Talk: Audio-Driven Multi-Person Conversational Video Generation	arXiv	May 28 2025	Paper – Project - Code
SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers	arXiv	Jun 11 2025	Paper – Project - Code
InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions	arXiv	Jun 11 2025	Paper – Project
OmniAvatar: Efficient Audio-Driven Avatar Video Generation with Adaptive Body Animation	arXiv	Jun 23 2025	Paper – Project - Code
MirrorMe: Towards Realtime and High Fidelity Audio-Driven Halfbody Animation	arXiv	Jun 27 2025	Paper – Project
Democratizing High-Fidelity Co-Speech Gesture Video Generation	ICCV 2025	Jul 09 2025	Paper – Project - Code
StableAvatar: Infinite-Length Audio-Driven Avatar Video Generation	arXiv	Aug 11 2025	Paper – Project - Code
FantasyTalking2: Timestep-Layer Adaptive Preference Optimization for Audio-Driven Portrait Animation	arXiv	Aug 15 2025	Paper - Project
Kling-Avatar: Grounding Multimodal Instructions for Cascaded Long-Duration Avatar Animation Synthesis	arXiv	Sep 11 2025	Paper - Project
Input-Aware Sparse Attention for Real-Time Co-Speech Video Generation	SIGGRAPH Asia	Oct 2 2025	Paper - Project - Code
Paper2Video: Automatic Video Generation from Scientific Papers	arXiv	Oct 6 2025	Paper - Project - Code
Lookahead Anchoring: Preserving Character Identity in Audio-Driven Human Animation	arXiv	Oct 27 2025	Paper - Project - Code
Playmate2: Training-Free Multi-Character Audio-Driven Animation via Diffusion Transformer with Reward Feedback	AAAI	Oct 14 2025	Paper - Project - Code

🕺 Pose-Driven Human Animation

Title	Venue	Date	Links
Follow Your Pose: Pose-Guided Text-to-Video Generation using Pose-Free Videos	AAAI 2024	Apr 3 2023	Paper – Code – Page
DreamPose: Fashion Image-to-Video Synthesis via Stable Diffusion	ICCV 2023	Apr 12 2023	Paper – Code – Page
DisCo: Disentangled Control for Realistic Human Dance Generation	CVPR 2024	Jun 30 2023	Paper – Code – Page
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion	ICML 2024	Nov 18 2023	Paper – Code – Page
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model	CVPR 2024	Nov 27 2023	Paper – Code – Page
Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation	CVPR 2024	Nov 28 2023	Paper – Code – Page
Follow-Your-Pose v2: Multiple-Condition Guided Character Image Animation for Stable Pose Control	arXiv	Jun 05 2024	Paper – Page
MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance	ICML 2025	Jun 28 2024	Paper – Code – Page
MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling	CVPR 2025	Sep 24 2024	Paper – Code – Page
StableAnimator: High-Quality Identity-Preserving Human Image Animation	CVPR 2025	Sep 24 2024	Paper – Code – Page
DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses	ICCV 2025	Nov 30 2024	Paper – Code – Page
DisPose: Disentangling Pose Guidance for Controllable Human Image Animation	ICLR 2025	Dec 12 2024	Paper – Code - Page
Consistent Human Image and Video Generation with Spatially Conditioned Diffusion	arXiv	Dec 19 2024	Paper – Code
DirectorLLM for Human-Centric Video Generation	arXiv	Dec 19 2024	Paper
X-Dyna: Expressive Dynamic Human Image Animation	CVPR 2025 (Highlight)	Jan 17 2025	Paper – Page - Code
HumanDiT: Pose-Guided Diffusion Transformer for Long-form Human Motion Video Generation	arXiv	Feb 7 2025	Paper – Page
Animate Anyone 2: High-Fidelity Character Image Animation with Environment Affordance	arXiv	Feb 10 2025	Paper – Page
DreamActor-M1: Holistic, Expressive and Robust Human Image Animation with Hybrid Guidance	arXiv	Apr 20 2025	Paper – Page
TokenMotion: Decoupled Motion Control via Token Disentanglement for Human-centric Video Generation	CVPR 2025	Apr 11 2025	Paper
DanceTogether! Identity-Preserving Multi-Person Interactive Video Generation	arXiv	May 23 2025	Paper – Page - Code
StableAnimator++: Overcoming Pose Misalignment and Face Distortion for Human Image Animation	arXiv	Jul 20 2025	Paper – Page
Wan-Animate: Unified Character Animation and Replacement with Holistic Replication	arXiv	Sep 17 2025	Paper – Page
SteadyDancer: Harmonized and Coherent Human Image Animation with First-Frame Preservation	arXiv	Nov 24 2025	Paper – Page - Code

🎨 Video-Driven Facial Reenactment

Title	Venue	Date	Links
Follow-Your-Emoji: Fine-Controllable and Expressive Freestyle Portrait Animation	SIGGRAPH Asia 2024	Jun 4 2024	Paper - Page - Code
Follow-Your-Emoji-Faster: Towards Efficient, Fine-Controllable, and Expressive Freestyle Portrait Animation	IJCV 2025	Sep 20 2025	Paper - Page - Code

💼 Commercial Personalized Video Generation Models

📈 Datasets and Benchmarks

🌟 Personalized Video Generation Benchmarks

Title / Benchmark	Venue	Date	Links
ConsisID-Bench – 150 identities & 90 prompts (human-domain)	CVPR 2025 (Highlight)	Nov 2024	Project – Data
MSRVTT-Personalization (Alchemist-Bench) – Multi-subject personalization benchmark	CVPR 2025	Jan 2025	Paper – Data/Code
VACE-Benchmark – VACE: All-in-One Video Creation and Editing	arXiv 2025	Mar 2025	Paper – Data/Code
FullBench – FullDiT: Multi-Task Video Generative Foundation Model with Full Attention	arXiv	Mar 25 2025	Paper – Data
A2 Bench – “Elements-to-Video” evaluation benchmark for arbitrary subjects	arXiv	Apr 2025	Paper – Data/Code
OpenS2V-Eval – Fine-grained S2V benchmark (180 prompts, real & synthetic)	arXiv	May 28 2025	Paper – Project – Code
Proteus-Bench	arXiv	Jun 30 2025	Paper – Project

📂 Personalized Video Generation Datasets

Subject-to-Video Datasets

Title / Dataset	Venue	Date	Links
Ditto: Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset	arXiv	Oct 2025	Paper – Project – Data
ConsisID-Data	CVPR 2025 (Highlight)	Oct 2024	Paper – Project – Data
Any2CapIns	arXiv	Mar 2025	Paper – Project – Data
OpenS2V-5M	arXiv	May 28 2025	Paper – Project – Data
Phantom-Data	arXiv	Jun 23 2025	Paper – Project – Data
SpeakerVid-5M: A Large-Scale High-Quality Dataset for Audio-Visual Dyadic Interactive Human Generation	arXiv	Jul 14 2025	Paper – Project – Data
TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation	arXiv	Oct 8 2025	Paper – Project – Data

ID-Driven Creation Datasets

Title / Dataset	Venue	Date	Links
TalkVid: A Large-Scale Diversified Dataset for Audio-Driven Talking Head Synthesis	arXiv 2025	Aug 2025	Paper – Project – Data
CustomConcept101	CVPR 2023	Dec 2022	Paper – Project – Data

Multi-Subject Disambiguation

Title / Dataset	Venue	Date	Links
Character Mixing for Video Generation	arXiv 2025	Oct 06 2025	Paper – Project – Code

📏 Key Evaluation Metrics

Visual Quality: Aesthetic, FID, FVD
Motion Amplitude: Optical Flow
Motion Smoothness: VBench, VMBench
Text Relevance: CLIP-Score, BLIP-Score, GmeScore
Subject Consistency: FaceSim, FaceSim-Cur, NexusScore
Subject Naturalness: NaturalScore
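
Several of the metrics above reduce to simple operations on model outputs. Similarity scores such as CLIP-Score or FaceSim are commonly computed as a cosine similarity between embeddings, and motion amplitude is commonly summarized as the average optical-flow magnitude. The following is a minimal sketch under those assumptions. The function names are hypothetical, the embeddings and flow vectors are assumed to come from external models that are not shown, and individual benchmarks may define their scores differently.

```python
import math
from typing import Iterable, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Treat a zero vector as having no similarity rather than dividing by zero.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_flow_magnitude(flow: Iterable[Tuple[float, float]]) -> float:
    """Average length of per-pixel (dx, dy) optical-flow vectors."""
    magnitudes = [math.hypot(dx, dy) for dx, dy in flow]
    return sum(magnitudes) / len(magnitudes) if magnitudes else 0.0


if __name__ == "__main__":
    # Toy inputs standing in for real embeddings and flow fields.
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
    print(mean_flow_magnitude([(3.0, 4.0), (0.0, 0.0)]))
```

In practice these would be averaged over frames or frame pairs of a generated video; the aggregation scheme is left out here because it varies between benchmarks.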