open-world.md

January 30, 2025

Open World

(arXiv 2022.03) Open Set Recognition using Vision Transformer with an Additional Detection Head, [Paper]
(arXiv 2022.06) OOD Augmentation May Be at Odds with Open-Set Recognition, [Paper]
(arXiv 2022.07) Scaling Novel Object Detection with Weakly Supervised Detection Transformers, [Paper]
(arXiv 2022.09) Pre-training image-language transformers for open-vocabulary tasks, [Paper]
(arXiv 2022.10) Transformer-Based Speech Synthesizer Attribution in an Open Set Scenario, [Paper]
(arXiv 2022.12) PROB: Probabilistic Objectness for Open World Object Detection, [Paper], [Code]
(arXiv 2022.12) Open World DETR: Transformer based Open World Object Detection, [Paper]
(arXiv 2023.01) CAT: LoCalization and IdentificAtion Cascade Detection Transformer for Open-World Object Detection, [Paper], [Code]
(arXiv 2023.03) Prompt-Guided Transformers for End-to-End Open-Vocabulary Object Detection, [Paper]
(arXiv 2023.05) Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers, [Paper]
(arXiv 2023.08) SegPrompt: Boosting Open-World Segmentation via Category-level Prompt Learning, [Paper], [Code]
(arXiv 2023.09) Contrastive Feature Masking Open-Vocabulary Vision Transformer, [Paper]
(arXiv 2023.09) Diffusion Model is Secretly a Training-free Open Vocabulary Semantic Segmenter, [Paper]
(arXiv 2023.09) Unsupervised Open-Vocabulary Object Localization in Videos, [Paper], [Code]
(arXiv 2023.10) CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection, [Paper], [Code]
(arXiv 2023.11) Enhancing Novel Object Detection via Cooperative Foundational Models, [Paper], [Code]
(arXiv 2023.11) Language-conditioned Detection Transformer, [Paper], [Code]
(arXiv 2023.12) Learning Pseudo-Labeler beyond Noun Concepts for Open-Vocabulary Object Detection, [Paper]
(arXiv 2023.12) Boosting Segment Anything Model Towards Open-Vocabulary Learning, [Paper], [Code]
(arXiv 2023.12) Open World Object Detection in the Era of Foundation Models, [Paper], [Code]
(arXiv 2024.02) Semi-supervised Open-World Object Detection, [Paper], [Code]
(arXiv 2024.03) CLIP-VIS: Adapting CLIP for Open-Vocabulary Video Instance Segmentation, [Paper], [Code]
(arXiv 2024.03) OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation, [Paper]
(arXiv 2024.04) DetCLIPv3: Towards Versatile Generative Open-vocabulary Object Detection, [Paper]
(arXiv 2024.04) OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery, [Paper]
(arXiv 2024.05) OV-DQUO: Open-Vocabulary DETR with Denoising Text Query Training and Open-World Unknown Objects Supervision, [Paper]
(arXiv 2024.07) Unified Embedding Alignment for Open-Vocabulary Video Instance Segmentation, [Paper], [Code]
(arXiv 2024.09) FrozenSeg: Harmonizing Frozen Foundation Models for Open-Vocabulary Segmentation, [Paper], [Code]
(arXiv 2024.09) Mamba-YOLO-World: Marrying YOLO-World with Mamba for Open-Vocabulary Detection, [Paper], [Code]
(arXiv 2025.01) Modulating CNN Features with Pre-Trained ViT Representations for Open-Vocabulary Object Detection, [Paper]