May 10, 2023

Deep Image Matting: A Comprehensive Survey

This is the official repository of the paper Deep Image Matting: A Comprehensive Survey.

Jizhizi Li, Jing Zhang, and Dacheng Tao
The University of Sydney, Sydney, Australia

Introduction | Preliminary | Methods | Datasets | Benchmark | Statement

Introduction

Image matting refers to extracting a precise alpha matte from natural images, and it plays a critical role in various downstream applications, such as image editing. The emergence of deep learning has revolutionized the field of image matting and given birth to multiple new techniques, including automatic, interactive, and referring image matting. Here we present a comprehensive review of recent advancements in image matting in the era of deep learning by focusing on two fundamental sub-tasks: auxiliary input-based image matting and automatic image matting.

Preliminary

Image matting, which refers to the precise extraction of the soft matte of foreground objects in arbitrary images, has been extensively studied for several decades. The process can be described mathematically as

$$I_i = \alpha_i F_i + (1 - \alpha_i) B_i,$$

where $I$ represents the input image, $F$ the foreground image, $B$ the background image, and $i$ indexes pixels. The opacity of the foreground at pixel $i$ is denoted by $\alpha_i$, which ranges from 0 to 1. We also show the typical input image, ground truth alpha matte, and various auxiliary inputs such as trimap, background, coarse map, user clicks, scribbles, and a text description in the following figure. The text description for this image can be "the cute smiling brown dog in the middle of the image".
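To make the role of the alpha matte concrete, here is a minimal sketch of compositing in plain Python. It assumes the standard per-pixel relation I = αF + (1 − α)B with RGB values stored as floats in [0, 1]; the function names `composite_pixel` and `composite_image` are illustrative and are not part of this repository.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]


def composite_pixel(alpha: float, fg: Pixel, bg: Pixel) -> Pixel:
    """Blend one foreground and one background pixel: I = alpha*F + (1-alpha)*B."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


def composite_image(
    alpha: List[List[float]],
    fg: List[List[Pixel]],
    bg: List[List[Pixel]],
) -> List[List[Pixel]]:
    """Apply the compositing relation to every pixel of same-sized images."""
    return [
        [composite_pixel(a, f, b) for a, f, b in zip(a_row, f_row, b_row)]
        for a_row, f_row, b_row in zip(alpha, fg, bg)
    ]


# A 1x3 toy image: fully background, half transparent, fully foreground.
red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
image = composite_image([[0.0, 0.5, 1.0]], [[red, red, red]], [[blue, blue, blue]])
print(image)
```

Matting is the inverse problem: given only the composite (and possibly an auxiliary input such as a trimap), estimate the alpha value at each pixel. In practice an array library would vectorize the loop above.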

Image Matting Methods

We compile a timeline of the developments in deep learning-based image matting methods as follows.

We also list a summary of image matting methods organized according to the year of publication, the publication venue, input modality, automaticity, matting target, architecture, and availability of the code (with the link). The list of papers is chronologically ordered. Please note that [U] stands for the unofficial implementation of the code.

| Year | Method | Pub. | Input | Auto. | Target | Arch. | Code |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2016 | Deep automatic portrait matting (DAPM) | ECCV | RGB | | human | Sequential two-step CNN | - |
| | Natural image matting using deep convolutional neural networks (DCNN) | ECCV | RGB-Coarse | | object | One-stage CNN | - |
| 2017 | Deep image matting (DIM) | CVPR | RGB-Trimap | | object | One-stage CNN+Refine | Github[U] |
| | Fast deep matting for portrait animation on mobile phone (FDM) | MM | RGB | | human | Sequential two-step CNN | - |
| 2018 | Tom-Net: Learning transparent object matting from a single image (TOM-Net) | CVPR | RGB | | trans. | Sequential two-step CNN+Refine | Github |
| | Deep propagation based image matting (DMPN) | IJCAI | RGB-Trimap | | object | One-stage CNN | - |
| | Alphagan: Generative adversarial networks for natural image matting (AlphaGAN) | BMVC | RGB-Trimap | | object | One-stage GAN | Github[U] |
| | Semantic soft segmentation (SSS) | TOG | RGB | | object | Sequential two-stage | Github |
| | Semantic human matting (SHM) | MM | RGB | | human | Sequential two-step CNN | Github[U] |
| | Active matting (ActiveMatting) | NeurIPS | RGB-Click | | object | One-stage RNN | - |
| 2019 | A late fusion cnn for digital matting (LF) | CVPR | RGB | | object | Sequential two-stage CNN | Github |
| | Learning-based sampling for natural image matting (SampleNet) | CVPR | RGB-Trimap | | object | Parallel three-stream CNN | - |
| | Indices matter: Learning to index for deep image matting (IndexNet) | ICCV | RGB-Trimap | | object | One-stage CNN | Github |
| | Disentangled image matting (AdaMatting) | ICCV | RGB-Trimap | | object | Parallel two-stream CNN+refine | - |
| | Context-aware image matting for simultaneous foreground and alpha estimation (Context-Aware) | ICCV | RGB-Trimap | | object | Two-stream CNN | Github |
| 2020 | Natural image matting via guided contextual attention (GCA) | AAAI | RGB-Trimap | | object | One-stage CNN | Github |
| | Background matting: The world is your green screen (BM) | CVPR | RGB-Bg | | human | Parallel four-stream CNN | Github |
| | Hierarchical opacity propagation for image matting (HOP) | arXiv | RGB-Trimap | | object | Parallel two-stream CNN | Github |
| | Boosting semantic human matting with coarse annotations (SHMC) | CVPR | RGB | | human | Sequential two-stage CNN | - |
| | F, b, alpha matting (FBA) | arXiv | RGB-Trimap | | object | One-stage CNN | Github |
| | Attention-guided hierarchical structure aggregation for image matting (HAtt) | CVPR | RGB | | object | One-stage CNN | - |
| | High-resolution deep image matting (HDMatt) | AAAI | RGB-Trimap | | object | Parallel two-stream CNN | - |
| | Bridging composite and real: towards end-to-end deep image matting (GFM) | IJCV | RGB | | human, animal | Parallel two-stream CNN | Github |
| | Modnet: Real-time trimap-free portrait matting via objective decomposition (MODNet) | AAAI | RGB | | human | Parallel two-stream CNN | Github |
| | Learning affinity-aware upsampling for deep image matting (A2U) | CVPR | RGB-Trimap | | object | One-stage CNN | Github |
| | Mask guided matting via progressive refinement network (MGMatting) | CVPR | RGB-Coarse | | human | One-stage CNN | Github |
| | Improved image matting via real-time user clicks and uncertainty estimation (InteractiveMatting) | CVPR | RGB-Click | | object | Parallel two-stream CNN | - |
| | Smart scribbles for image matting (SmartScribbles) | TOMM | RGB-Scribble | | object | One-stage CNN | - |
| | Real-Time High-Resolution Background Matting (BMV2) | CVPR | RGB-Bg | | human | One-stage CNN+refine | Github |
| 2021 | Towards enhancing fine-grained details for image matting (FDMatting) | WACV | RGB-Trimap | | object | Two-stream CNN | - |
| | Semantic image matting (SIM) | CVPR | RGB-Trimap | | object | One-stage CNN | Github |
| | Privacy-preserving portrait matting (P3M-Net) | MM | RGB | | human | Parallel two-stream CNN | Github |
| | Cascade image matting with deformable graph refinement (CasDGR) | ICCV | RGB | | object | Parallel two-stream CNN | - |
| | Deep Automatic Natural Image Matting (AIM-Net) | IJCAI | RGB | | object | Parallel two-stream CNN | Github |
| | Long-range feature propagating for natural image matting (LFPNet) | MM | RGB-Trimap | | object | Parallel two-stream CNN | Github |
| | Virtual Multi-Modality Self-Supervised Foreground Matting for Human-Object Interaction (VMFM) | ICCV | RGB | | human-object | Sequential two-stage CNN | - |
| | Tripartite Information Mining and Integration for Image Matting (TIMI-Net) | ICCV | RGB-Trimap | | object | Parallel three-stream CNN | Github |
| | Deep Image Matting with Flexible Guidance Input (FGI) | BMVC | RGB-Flexible | | object | One-stage CNN | Github |
| | Highly efficient natural image matting (HEMatting) | BMVC | RGB | | object | Sequential two-stage CNN | - |
| 2022 | Boosting Robustness of Image Matting With Context Assembling and Strong Data Augmentation (Rmat) | CVPR | RGB-Trimap | | object | Parallel two-stream CNN/Transformer | - |
| | Deep interactive image matting with feature propagation (DIIM) | TIP | RGB-Click | | object | One-stage CNN | - |
| | User-Guided Deep Human Image Matting Using Arbitrary Trimaps (UGDMatting) | TIP | RGB-Flexible | | human | Parallel two-stream CNN | - |
| | Image matting with deep gaussian process (matting-GP) | TNNLS | RGB-Trimap | | object | One-stage CNN | - |
| | Rethinking portrait matting with privacy preserving (P3M-ViTAE) | IJCV | RGB | | human | Parallel two-stream CNN/Transformer | Github |
| | Situational Perception Guided Image Matting (SPG-IM) | MM | RGB | | object | Sequential two-stage CNN | - |
| | Human instance matting via mutual guidance and multi-instance refinement (HIM) | CVPR | RGB | | human | Sequential two-stage CNN | Github |
| | MatteFormer: Transformer-Based Image Matting via Prior-Tokens (MatteFormer) | CVPR | RGB-Trimap | | object | One-stage CNN/Transformer | Github |
| | Referring image matting (RIM) | CVPR | RGB-Language | | object | One-stage CNN | Github |
| | TransMatting: Enhancing Transparent Objects Matting with Transformers (TransMatting) | ECCV | RGB-Trimap | | trans. | One-stage CNN/Transformer | Github |

Image Matting Datasets

We list a summary of the image matting datasets, categorized as synthetic image-based benchmarks, natural image-based benchmarks, and test sets. The datasets are ordered by release date and are described in terms of publication venue, naturalness, matting target, resolution, number of training and test samples, and availability (along with their links). Note that the size of each dataset is counted as the number of distinct foregrounds, except for TOM and RefMatte, which have pre-defined composite rules.

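| Name | Pub. | Natural | Target | Resolution | #Train | #Test | Publicity |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DIM-481 | CVPR'17 | | object | 1298×1083 | 431 | 50 | Link |
| TOM | CVPR'18 | | transparent | - | 178,000 | 876 | Link |
| LF-257 | CVPR'19 | | human | 553×756 | 228 | 29 | Link |
| HATT-646 | CVPR'20 | | object | 1573×1731 | 596 | 60 | Link |
| PhotoMatte13k | CVPR'20 | | human | - | 13665 | - | - |
| SIM | CVPR'21 | | object | 2194×1950 | 348 | 50 | Link |
| Human-2k | ICCV'21 | | human | 2112×2075 | 2000 | 100 | Link |
| Trans-460 | ECCV'22 | | transparent | 3766×3820 | 410 | 50 | Link |
| HIM2k | CVPR'22 | | human | 1823×1424 | 1500 | 500 | Link |
| RefMatte | CVPR'23 | | object | 1543×1162 | 45000 | 2500 | Link |
| AlphaMatting | CVPR'09 | | object | 3056×2340 | 27 | 8 | Link |
| DAPM-2k | ECCV'16 | | human | 600×800 | 1700 | 300 | Link |
| SHM-35k | MM'18 | | human | - | 52511 | 1400 | - |
| SHMC-10k | CVPR'20 | | human | - | 9324 | 125 | - |
| P3M-10k | MM'21 | | human | 1349×1321 | 9421 | 1000 | Link |
| AM-2k | IJCV'22 | | animal | 1471×1195 | 1800 | 200 | Link |
| Multi-Object-1k | MM'22 | | human-object | - | 1000 | 200 | - |
| UGD-12k | TIP'22 | | human | 356×317 | 12066 | 700 | Link |
| PhotoMatte85 | CVPR'20 | | human | 2304×3456 | - | 85 | Link |
| AIM-500 | IJCAI'21 | | object | 1397×1260 | - | 500 | Link |
| RWP-636 | CVPR'21 | | human | 1038×1327 | - | 636 | Link |
| PPM-100 | AAAI'22 | | human | 2997×2875 | - | 100 | Link |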

Performance Benchmarking

We provide a comprehensive evaluation of representative matting methods in the paper. Here, we present some subjective results of auxiliary-based matting methods on alphamatting.com and automatic matting methods on P3M-500-NP.
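This README does not name the metrics behind the evaluation. As an illustration only, the sketch below computes two error measures that are widely used to compare a predicted alpha matte against the ground truth: the sum of absolute differences (SAD) and the mean squared error (MSE). The choice of metrics and the function names `sad` and `mse` are assumptions here, not the paper's evaluation code.

```python
from typing import Iterator, List, Tuple

Matte = List[List[float]]  # alpha values in [0, 1], row-major


def _paired_values(pred: Matte, gt: Matte) -> Iterator[Tuple[float, float]]:
    """Yield (predicted, ground-truth) alpha pairs after checking the shapes match."""
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("predicted and ground-truth mattes must have the same shape")
    for p_row, g_row in zip(pred, gt):
        yield from zip(p_row, g_row)


def sad(pred: Matte, gt: Matte) -> float:
    """Sum of absolute differences over all pixels."""
    return sum(abs(p - g) for p, g in _paired_values(pred, gt))


def mse(pred: Matte, gt: Matte) -> float:
    """Mean squared error over all pixels."""
    squared = [(p - g) ** 2 for p, g in _paired_values(pred, gt)]
    if not squared:
        raise ValueError("mattes must contain at least one pixel")
    return sum(squared) / len(squared)


# Toy 2x2 mattes: two pixels agree, two differ by 0.5.
predicted = [[0.0, 0.5], [1.0, 1.0]]
ground_truth = [[0.0, 1.0], [1.0, 0.5]]
print(sad(predicted, ground_truth), mse(predicted, ground_truth))
```

Lower values indicate a predicted matte that is closer to the ground truth.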

Statement

If you are interested in our work, please consider citing the following:

@article{li2023deep,
  title={Deep Image Matting: A Comprehensive Survey},
  author={Jizhizi Li and Jing Zhang and Dacheng Tao},
  journal={ArXiv},
  year={2023},
  volume={abs/2304.04672}
}

This project is under the MIT license. For further questions, please contact Jizhizi Li at jili8515@uni.sydney.edu.au.

Relevant Projects

[1] Deep Automatic Natural Image Matting, IJCAI, 2021 | Paper | Github
     Jizhizi Li, Jing Zhang, and Dacheng Tao

[2] Privacy-preserving Portrait Matting, ACM MM, 2021 | Paper | Github
     Jizhizi Li, Sihan Ma, Jing Zhang, Dacheng Tao

[3] Bridging Composite and Real: Towards End-to-end Deep Image Matting, IJCV, 2022 | Paper | Github
     Jizhizi Li, Jing Zhang, Stephen J. Maybank, Dacheng Tao

[4] Referring Image Matting, CVPR, 2023 | Paper | Github
     Jizhizi Li, Jing Zhang, and Dacheng Tao

[5] Rethinking Portrait Matting with Privacy Preserving, IJCV, 2023 | Paper | Github
     Sihan Ma, Jizhizi Li, Jing Zhang, He Zhang, Dacheng Tao