ECCV-2024-Papers

December 9, 2024

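Official website: https://eccv.ecva.net/

Main conference :bell:: September 29 (Sunday) to October 4

Click here for the categorized collection of survey papers from past years ↘️CV-Surveys (under construction)

Click here for the categorized 2025 paper collections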

↘️WACV-2025-Papers ↘️CVPR-2025-Papers

Click here for the categorized 2024 paper collections

↘️WACV-2024-Papers ↘️CVPR-2024-Papers ↘️ECCV-2024-Papers

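Click here for the categorized 2023 paper collections

Click here for the categorized 2022 paper collections

Click here for the categorized 2021 paper collections

Click here for the categorized 2020 paper collections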

💥💥💥All papers have been categorized

:thumbsup:ECCV 2024 awards announced: Columbia University takes the Best Paper Award

🏆Best Paper Award

Minimalist Vision with Freeform Pixels
:house:project

🏅Best Paper Honorable Mention

Contents

| :cat: | :dog: | :tiger: | :wolf: |
|---|---|---|---|
| 1.Other | 2.3D Visual | 3.Face | 4.Pose (pose estimation) |
| 5.OCR | 6.Object Tracking | 7.Object Detection | 8.Super-Resolution |
| 9.Image Processing (image/video processing) | 10.Image Classification | 11.Image Segmentation | 12.Image Retrieval |
| 13.Image/video Compression | 14.Image Captioning (image/video captioning) | 15.GAN/Image Synthesis | 16.Medical Image Processing |
| 17.Video | 18.Automated Driving | 19.UAV/Remote Sensing/Satellite Image | 20.Scene |
| 21.Vision-Language | 22.Few/Zero-Shot Learning/DG/A (few/zero-shot, domain generalization, domain adaptation) | 23.Machine Learning | 24.Vision Transformer |
| 25.MC/KD/Pruning (model compression/knowledge distillation/pruning) | 26.NAS | 27.GNN/GCN | 28.Novel Class Discovery |
| 29.Semi/self-supervised learning | 30.Anomaly Detection | 31.Point Clouds | 32.Person Re-Identification |
| 33.Motion Generation (human motion generation) | 34.Visual Question Answering | 35.Action Detection | 36.Gaze Estimation |
| 37.Style Transfer | 38.Human-Object Interaction | 39.Robots | 40.Object Pose Estimation |
| 41.Biometrics | 42.Optical Flow Estimation | 43.Sound | 44.Dataset/Benchmark |
| 45.Neural Radiance Fields | 46.Rendering | 47.Animal | 48.Computer Graphics |
| 49.Light-Field | 50.Sketches | 51.Feature Matching | 52.Visual Entity Recognition |
| 53.Keypoint Detection | 54.Deepfake Detection | 55.Information Security | 56.Dense Prediction |
| 57.Visual Relationship Detection | 58.All-in-One | | |

58.All-in-One

X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-modal Reasoning
:star:code

57.Visual Relationship Detection

56.Dense Prediction

55.Information Security

Copyright protection
- Rethinking Data Bias: Dataset Copyright Protection via Embedding Class-wise Hidden Bias
  :star:code (dataset copyright protection)
Image watermarking

54.Deepfake Detection

53.Keypoint Detection

52.Visual Entity Recognition

Grounding Language Models for Visual Entity Recognition (visual entity recognition)

51.Feature Matching

50.Sketches

Do Generalised Classifiers really work on Human Drawn Sketches?

49.Light-Field

48.Computer Graphics

High dynamic range imaging
- SAFNet: Selective Alignment Fusion Network for Efficient HDR Imaging
  :star:code

47.Animal

46.Rendering

45.Neural Radiance Fields

44.Dataset/Benchmark

43.Sound

Audio-Synchronized Visual Animation
:star:code
:house:project
Listen to Look into the Future: Audio-Visual Egocentric Gaze Anticipation
:house:project
Label-anticipated Event Disentanglement for Audio-Visual Video Parsing
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity
:star:code
Spherical World-Locking for Audio-Visual Localization in Egocentric Videos
Self-Supervised Audio-Visual Soundscape Stylization
:house:project
CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios
:star:code (audio-visual scenes)
Perceptual Evaluation of Audio-Visual Synchrony Grounded in Viewers’ Opinion Scores
Siamese Vision Transformers are Scalable Audio-visual Learners
:star:code (audio-visual learners)
Action2Sound: Ambient-Aware Generation of Action Sounds from Egocentric Videos
:house:project (ambient-aware action sound generation)
Audio-visual Generalized Zero-shot Learning the Easy Way
Audio-visual segmentation

42.Optical Flow Estimation

SEA-RAFT: Simple, Efficient, Accurate RAFT for Optical Flow
:star:code

41.Biometrics

Open-Set Biometrics: Beyond Good Closed-Set Models
:star:code

40.Object Pose Estimation

39.Robots

38.Human-Object Interaction

37.Style Transfer

36.Gaze Estimation

35.Action Detection

34.Visual Question Answering

33.Motion Generation (human motion generation)

Event-Based Motion Magnification
:star:code
Learning-based Axial Video Motion Magnification
:house:project
SMooDi: Stylized Motion Diffusion Model
:star:code
Length-Aware Motion Synthesis via Latent Diffusion
:star:code
HUMOS: Human Motion Model Conditioned on Body Shape
:star:code
HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
:star:code
Generating Physically Realistic and Directable Human Motions from Multi-Modal Inputs
:house:project
Generating Human Interaction Motions in Scenes with Text Control
:house:project (motion generation)
Motion Mamba: Efficient and Long Sequence Motion Generation
:star:code
:house:project
Large Motion Model for Unified Multi-Modal Motion Generation
:house:project
EMDM: Efficient Motion Diffusion Model for Fast and High-Quality Motion Generation
:star:code
:house:project
Bridging the Gap Between Human Motion and Action Semantics via Kinematics Phrases
:house:project (human motion)
TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
:house:project (human motion)
Nymeria: A Massive Collection of Egocentric Multi-modal Human Motion in the Wild (human motion)
FreeMotion: MoCap-Free Human Motion Synthesis with Multimodal Large Language Models
MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model
:star:code
Realistic Human Motion Generation with Cross-Diffusion Models
:house:project (human motion)
CoMo: Controllable Motion Generation through Language Guided Pose Code Editing
:house:project (controllable motion generation)
TLControl: Trajectory and Language Control for Human Motion Synthesis
:house:project (human motion synthesis)
Retrieval Robust to Object Motion Blur
:star:[code](https://github.com/Rong-Zou/Retrieval-Robust-to-Object-Motion-Blur)
3D human motion synthesis
- ReMoS: 3D Motion-Conditioned Reaction Synthesis for Two-Person Interactions
  :house:project
Text-to-motion synthesis
- FreeMotion: A Unified Framework for Number-free Text-to-Motion Synthesis
- Local Action-Guided Motion Diffusion Model for Text-to-Motion Generation
  :star:code
- Plan, Posture and Go: Towards Open-vocabulary Text-to-Motion Generation
  :house:project
- ParCo: Part-Coordinating Text-to-Motion Synthesis
  :star:code
Human motion forecasting
- Human Motion Forecasting in Dynamic Domain Shifts: A Homeostatic Continual Test-time Adaptation Framework (human motion forecasting)
- Scene-aware Human Motion Forecasting via Mutual Distance Prediction
Human motion estimation
- MANIKIN: Biomechanically Accurate Neural Inverse Kinematics for Human Motion Estimation
  :house:project
Motion estimation
- Motion-prior Contrast Maximization for Dense Continuous-Time Motion Estimation
  :star:code
- COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation
Dance generation
- Beat-It: Beat-Synchronized Multi-Condition 3D Dance Generation
  :house:project
Behavior generation
- DIM: Dyadic Interaction Modeling for Social Behavior Generation
  :star:code
Motion transfer
- Temporal Residual Jacobians for Rig-free Motion Transfer
  :house:project
  🤗huggingface
Motion forecasting
- Enhanced Motion Forecasting with Visual Relation Reasoning

32.Person Re-Identification

31.Point Clouds

30.Anomaly Detection

29.Semi/self-supervised learning

28.Novel Class Discovery

Self-Cooperation Knowledge Distillation for Novel Class Discovery

27.GNN/GCN

26.NAS

25.MC/KD/Pruning (model compression/knowledge distillation/pruning)

24.Vision Transformer

23.Machine Learning

22.Few/Zero-Shot Learning/DG/A (few/zero-shot, domain generalization, domain adaptation)

21.Vision-Language

20.Scene

19.UAV/Remote Sensing/Satellite Image

18.Automated Driving

17.Video

16.Medical Image Processing

15.GAN/Image Synthesis

14.Image Captioning (image/video captioning)

13.Image/video Compression

12.Image Retrieval

11.Image Segmentation

10.Image Classification

Labeled Data Selection for Category Discovery
Active Generation for Image Classification
:star:code
Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs
:house:project
Dyn-Adapter: Towards Disentangled Representation for Efficient Visual Recognition
Wavelet Convolutions for Large Receptive Fields
:star:code
Momentum Auxiliary Network for Supervised Local Learning
:star:code
An accurate detection is not all you need to combat label noise in web-noisy datasets
:star:code
Dual-stage Hyperspectral Image Classification Model with Spectral Supertoken
:star:code
DEPICT: Diffusion-Enabled Permutation Importance for Image Classification Tasks
NOVUM: Neural Object Volumes for Robust Object Classification
:star:code
EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification
:star:code
Distribution-Aware Robust Learning from Long-Tailed Data with Noisy Labels
:star:code
Discovering Unwritten Visual Classifiers with Large Language Models
Generalized category discovery
- SelEx: Self-Expertise in Fine-Grained Generalized Category Discovery
  :star:code
- Textual Knowledge Matters: Cross-Modality Co-Teaching for Generalized Visual Class Discovery
  :star:code (Generalized Category Discovery, GCD)
- Learning to Distinguish Samples for Generalized Category Discovery
  :star:code
- PromptCCD: Learning Gaussian Mixture Prompt Pool for Continual Category Discovery
  :star:code
- Online Continuous Generalized Category Discovery
  :star:code (generalized category discovery)
- Category Adaptation Meets Projected Distillation in Generalized Continual Category Discovery
  :star:code
Multi-label image classification
- Distributionally Robust Loss for Long-Tailed Multi-Label Image Classification
  :star:code
Few-shot classification
- Benchmarking Spurious Bias in Few-Shot Image Classifiers
  :star:code
- Learning to Obstruct Few-Shot Image Classification over Restricted Classes
  :star:code
- Semantic-guided Robustness Tuning for Few-Shot Transfer Across Extreme Domain Shift (few-shot classification)
Zero-shot classification
- Online Zero-Shot Classification with CLIP
  :star:code
- IG Captioner: Information Gain Captioners are Strong Zero-shot Classifiers
Multi-label recognition
- Modeling Label Correlations with Latent Context for Multi-Label Recognition
Long-tail recognition
- LTRL: Boosting Long-tail Recognition via Reflective Learning
- Echoes of the Past: Boosting Long-tail Recognition via Reflective Learning
  :star:code
Fine-grained
- On Learning Discriminative Features from Synthesized Data for Self-Supervised Fine-Grained Visual Recognition
- A Rotation-invariant Texture ViT for Fine-Grained Recognition of Esophageal Cancer Endoscopic Ultrasound Images
  :star:code
- Adapting Fine-Grained Cross-View Localization to Areas without Fine Ground Truth

9.Image Processing (image/video processing)

8.Super-Resolution

7.Object Detection

6.Object Tracking

5.OCR

Parrot Captions Teach CLIP to Spot Text
:star:code
WeCromCL: Weakly Supervised Cross-Modality Contrastive Learning for Transcription-only Supervised Text Spotting
FineMatch: Aspect-based Fine-grained Image and Text Mismatch Detection and Correction
:house:project
Bridging Synthetic and Real Worlds for Pre-training Scene Text Detectors
:star:code
Handwritten text detection
Handwritten text synthesis
- One-Shot Diffusion Mimicker for Handwritten Text Generation
  :star:code
- DiffusionPen: Towards Controlling the Style of Handwritten Text Generation
Scene text removal
- Leveraging Text Localization for Scene Text Removal via Text-aware Masked Image Modeling
  :star:code
Document understanding
- Textual Grounding for Open-vocabulary Visual Information Extraction in Layout-diversified Documents
  :thumbsup:Combines layout-aware context learning with a two-stage pre-training scheme tailored to documents, significantly improving the model's document understanding
- VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding
  🤗huggingface (dense document understanding)
Text segmentation
- WAS: Dataset and Methods for Artistic Text Segmentation
  :star:code
Text synthesis
- Visual Text Generation in the Wild
  :star:code
Text restoration
- LLMCO4MR: LLMs-aided Neural Combinatorial Optimization for Ancient Manuscript Restoration from Fragments with Case Studies on Dunhuang (restoring ancient manuscripts from fragments, using Dunhuang as a case study)

4.Pose (pose estimation)

3.Face

Task-adaptive Q-Face
Faceptor: A Generalist Model for Face Perception
:star:code
A Light Stage on Every Desk
:house:project
Face Adapter for Pre-Trained Diffusion Models with Fine-Grained ID and Attribute Control
:star:code
:house:project
ReSyncer: Rewiring Style-based Generator for Unified Audio-Visually Synced Facial Performer
:star:code
Facial Affective Behavior Analysis with Instruction Tuning
:star:code
:house:project
Arc2Face: A Foundation Model for ID-Consistent Human Faces
:star:code
:house:project
GAMMA-FACE: GAussian Mixture Models Amend Diffusion Models for Bias Mitigation in Face Images
GRAPE: Generalizable and Robust Multi-view Facial Capture
High-Quality Mesh Blendshape Generation from Face Videos via Neural Inverse Rendering
:star:code
Face swapping
- SelfSwapper: Self-Supervised Face Swapping via Shape Agnostic Masked AutoEncoder
  🤗huggingface
Face obfuscation
- Forbes: Face Obfuscation Rendering via Backpropagation Refinement Scheme
  :star:code
Face recognition
- Towards Certifiably Robust Face Recognition
- AdaDistill: Adaptive Knowledge Distillation for Deep Face Recognition
  :star:code
- ARoFace: Alignment Robustness to Improve Low-Quality Face Recognition
  :star:code
- Personalized Privacy Protection Mask Against Unauthorized Facial Recognition
- MST-KD: Multiple Specialized Teachers Knowledge Distillation for Fair Face Recognition
- AdversariaLeak: External Information Leakage Attack Using Adversarial Samples on Face Recognition Systems
Face clustering
- VideoClusterNet: Self-Supervised and Adaptive Face Clustering for Videos (face clustering)
Face reconstruction
- Face Reconstruction Transfer Attack as Out-of-Distribution Generalization
  :star:code
Facial expression
- Norface: Improving Facial Expression Analysis by Identity Normalization
  :star:code
- Generalizable Facial Expression Recognition
  :star:code
- How Video Meetings Change Your Expression
  :house:project (face)
- AnimateMe: 4D Facial Expressions via Diffusion Models
Face editing
- GroupDiff: Diffusion-based Group Portrait Editing
  :star:code
- Real-time 3D-aware Portrait Editing from a Single Image
  :star:code (portrait editing)
- Efficient 3D-Aware Facial Image Editing via Attribute-Specific Prompt Learning
  :star:code
Facial animation
- KMTalk: Speech-Driven 3D Facial Animation with Key Motion Embedding
  :star:code
- Fast Registration of Photorealistic Avatars for VR Facial Animation
  :house:project
- UniTalker: Scaling up Audio-Driven 3D Facial Animation through A Unified Model
  🤗huggingface
Talking head synthesis
- ScanTalk: 3D Talking Heads from Unregistered Scans
  :star:code
- EDTalk: Efficient Disentanglement for Emotional Talking Head Synthesis
  :house:project (head synthesis)
- EmoTalk3D: High-Fidelity Free-View Synthesis of Emotional 3D Talking Head
  :star:code
- All You Need is Your Voice: Emotional Face Representation with Audio Perspective for Emotional Talking Face Generation
  :star:code
- Audio-driven Talking Face Generation with Stabilized Synchronization Loss
- Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360°
  :star:code
- S^3D-NeRF: Single-Shot Speech-Driven Neural Radiance Field for High Fidelity Talking Head Synthesis
- Gaussian3Diff: 3D Gaussian Diffusion for 3D Full Head Synthesis and Editing
  :house:project (head synthesis)
- Tri2-plane: Thinking Head Avatar via Feature Pyramid
  :house:project
- Avatar Fingerprinting for Authorized Use of Synthetic Talking-Head Videos
  :house:project
- TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
  :star:code (3D talking head synthesis)
Animatable head avatars
- HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting
  :star:code (animatable head avatars)
- HeadGaS: Real-Time Animatable Head Avatars via 3D Gaussian Splatting
- 3D Gaussian Parametric Head Model
  :star:code (head)
Face super-resolution
- Kalman-Inspired Feature Propagation for Video Face Super-Resolution
  :house:project
Face anti-spoofing
- TF-FAS: Twofold-Element Fine-Grained Semantic Guidance for Generalizable Face Anti-Spoofing
  :star:code
  :thumbsup:Improves generalization through twofold-element fine-grained semantic guidance
- DiffFAS: Face Anti-Spoofing via Generative Diffusion Models
  :star:code
- Towards Unified Representation of Invariant-Specific Features in Missing Modality Face Anti-Spoofing
- Bottom-Up Domain Prompt Tuning for Generalized Face Anti-Spoofing
Head synthesis
- Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer
  :star:code
  :house:project
- Loc3Diff: Local Diffusion for 3D Human Head Synthesis and Editing
Emotion recognition
- Upper-body Hierarchical Graph for Skeleton Based Emotion Recognition in Assistive Driving
  :star:code (skeleton-based emotion recognition)
Facial action unit detection
- AUFormer: Vision Transformers are Parameter-Efficient Facial Action Unit Detectors
  :star:code
Face forgery detection
- Learning Natural Consistency Representation for Face Forgery Video Detection

2.3D Visual

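1.Other

Click here for the categorized 2020 paper collections

↘️CVPR-2020-Papers ↘️ECCV-2020-Papers

Click here for the categorized 2021 paper collections

↘️ICCV-2021-Papers ↘️CVPR-2021-Papers

Click here for the categorized 2022 paper collections

↘️CVPR-2022-Papers ↘️WACV-2022-Papers ↘️ECCV-2022-Papers

Click here for the categorized 2023 paper collections

↘️CVPR-2023-Papers ↘️WACV-2023-Papers ↘️ICCV-2023-Papers

Scan the QR code to add CV君 on WeChat (mention: CVPR) and join the WeChat discussion group: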
