ICCV 2025 Papers and Open-Source Projects (Papers with Code)
July 6, 2025
ICCV 2025 Acceptance Rate of 24% = 2699 / 11239
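Note 1: You are welcome to open an issue to share ICCV 2025 papers and open-source projects!
Note 2: For papers from top CV conferences in previous years, plus other high-quality CV papers and roundups, see: https://github.com/amusi/daily-paper-computer-vision
Scan the QR code to join the CVer academic exchange group and get the latest work from ICCV 2025 and other venues. It is described as the largest computer vision AI community on Knowledge Planet (知识星球). It is updated daily and shares the newest learning materials on computer vision, AIGC, diffusion models, multimodal learning, deep learning, autonomous driving, medical imaging, remote sensing, and related topics.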

【ICCV 2025 Papers and Open-Source Code: Table of Contents】
- 3DGS(Gaussian Splatting)
- Agent
- Avatars
- Backbone
- CLIP
- Mamba
- Embodied AI
- GAN
- GNN
- Multimodal Large Language Models (MLLM)
- Large Language Models (LLM)
- World Model
- OCR
- NeRF
- DETR
- Diffusion Models
- ReID (Re-Identification)
- Long-Tail
- Vision Transformer
- Vision-Language
- Self-supervised Learning
- Data Augmentation
- Object Detection
- Anomaly Detection
- Visual Tracking
- Semantic Segmentation
- Instance Segmentation
- Panoptic Segmentation
- Medical Image
- Medical Image Segmentation
- Video Object Segmentation
- Video Instance Segmentation
- Referring Image Segmentation
- Image Matting
- Image Editing
- Low-level Vision
- Super-Resolution
- Denoising
- Deblur
- Autonomous Driving
- 3D Point Cloud
- 3D Object Detection
- 3D Semantic Segmentation
- 3D Object Tracking
- 3D Semantic Scene Completion
- 3D Registration
- 3D Human Pose Estimation
- 3D Human Mesh Estimation
- 3D Visual Grounding
- Image Generation
- Video Generation
- 3D Generation
- Video Understanding
- Action Detection
- Embodied AI
- Text Detection
- Knowledge Distillation
- Model Pruning
- Image Compression
- 3D Reconstruction
- Depth Estimation
- Trajectory Prediction
- Lane Detection
- Image Captioning
- Visual Question Answering
- Sign Language Recognition
- Video Prediction
- Novel View Synthesis
- Zero-Shot Learning
- Stereo Matching
- Feature Matching
- Low-light Image Enhancement
- Scene Graph Generation
- Style Transfer
- Implicit Neural Representations
- Image Quality Assessment
- Video Quality Assessment
- Compressive Sensing
- Datasets
- New Tasks
- Others
3DGS(Gaussian Splatting)
Agent
Avatars
Backbone
TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba
CLIP
Mamba
TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers
- Project: https://tiger-ai-lab.github.io/Vamba/
- Paper: https://arxiv.org/abs/2503.11579
- Code: https://github.com/TIGER-AI-Lab/Vamba
Embodied AI
GAN
OCR
NeRF
DETR
Prompt
Multimodal Large Language Models (MLLM)
FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers
- Paper: https://arxiv.org/abs/2501.16297
- Code: https://github.com/JiuTian-VL/JiuTian-FALCON
- Project: https://jiutian-vl.github.io/FALCON.github.io/
Large Language Models (LLM)
World Model
Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning
- Project: https://yijun-yang.github.io/MeWM/
- Paper: https://arxiv.org/abs/2506.02327
- Code: https://github.com/scott-yjyang/MeWM
ReID (Re-Identification)
Diffusion Models
From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers
Vision Transformer
Vision-Language
Object Detection
Anomaly Detection
Object Tracking
Medical Image
Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning
- Project: https://yijun-yang.github.io/MeWM/
- Paper: https://arxiv.org/abs/2506.02327
- Code: https://github.com/scott-yjyang/MeWM
Medical Image Segmentation
Autonomous Driving
Where, What, Why: Towards Explainable Driver Attention Prediction
- Paper: https://arxiv.org/abs/2506.23088
- Code: https://github.com/yuchen2199/Explainable-Driver-Attention-Prediction
- Project: https://github.com/yuchen2199/Explainable-Driver-Attention-Prediction
ROADWork Dataset: Learning to Recognize, Observe, Analyze and Drive Through Work Zones
- Paper: https://arxiv.org/abs/2406.07661
- Code: https://github.com/anuragxel/roadwork-dataset
- Project: https://www.cs.cmu.edu/~ILIM/roadwork_dataset/
DriveMM: All-in-One Large Multimodal Model for Autonomous Driving
- Project: https://zhijian11.github.io/DriveMM/
- Paper: https://arxiv.org/abs/2412.07689
- Code: https://github.com/zhijian11/DriveMM
3D Point Cloud
3D Object Detection
3D Semantic Segmentation
Low-level Vision
EAMamba: Efficient All-Around Vision State Space Model for Image Restoration
Super-Resolution
Denoising
Image Denoising
3D Human Pose Estimation
3D Visual Grounding
Image Generation
DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models
Video Generation
Image Editing
Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing
- Project: https://eff-edit.github.io
- Paper: https://arxiv.org/abs/2503.10270
- Code: https://github.com/yuriYanZeXuan/EEdit
Video Editing
3D Generation
3D Reconstruction
Human Motion Generation
Video Understanding
Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers
- Project: https://tiger-ai-lab.github.io/Vamba/
- Paper: https://arxiv.org/abs/2503.11579
- Code: https://github.com/TIGER-AI-Lab/Vamba
Embodied AI
Knowledge Distillation
Depth Estimation
Stereo Matching
Low-light Image Enhancement
Image Compression
Scene Graph Generation
Style Transfer
Image Quality Assessment
Video Quality Assessment
Compressive Sensing
Datasets
ROADWork Dataset: Learning to Recognize, Observe, Analyze and Drive Through Work Zones
- Paper: https://arxiv.org/abs/2406.07661
- Code: https://github.com/anuragxel/roadwork-dataset
- Project: https://www.cs.cmu.edu/~ILIM/roadwork_dataset/
Others
Music Grounding by Short Video
- Project: https://rucmm.github.io/VMMR/
- Paper: https://arxiv.org/abs/2408.16990
- Code: https://github.com/xxayt/MGSV