| Video-to-Video Synthesis | NIPS | code | 4749 |
| Deep Image Prior | CVPR | code | 3451 |
| StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation | CVPR | code | 3104 |
| Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network | ECCV | code | 2109 |
| Learning to See in the Dark | CVPR | code | 2033 |
| Glow: Generative Flow with Invertible 1x1 Convolutions | NIPS | code | 1862 |
| Squeeze-and-Excitation Networks | CVPR | code | 1263 |
| Efficient Neural Architecture Search via Parameters Sharing | ICML | code | 1189 |
| Multimodal Unsupervised Image-to-image Translation | ECCV | code | 1183 |
| Non-Local Neural Networks | CVPR | code | 859 |
| Image Generation From Scene Graphs | CVPR | code | 772 |
| Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet? | CVPR | code | 690 |
| Single-Shot Refinement Neural Network for Object Detection | CVPR | code | 668 |
| GANimation: Anatomically-aware Facial Animation from a Single Image | ECCV | code | 628 |
| Detect-and-Track: Efficient Pose Estimation in Videos | CVPR | code | 549 |
| Relation Networks for Object Detection | CVPR | code | 532 |
| PointCNN | NIPS | code | 506 |
| Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples | ICML | code | 491 |
| Simple Baselines for Human Pose Estimation and Tracking | ECCV | code | 488 |
| Taskonomy: Disentangling Task Transfer Learning | CVPR | code | 453 |
| Which Training Methods for GANs do actually Converge? | ICML | code | 453 |
| Cascaded Pyramid Network for Multi-Person Pose Estimation | CVPR | code | 447 |
| Pelee: A Real-Time Object Detection System on Mobile Devices | NIPS | code | 441 |
| Generative Image Inpainting With Contextual Attention | CVPR | code | 441 |
| Neural 3D Mesh Renderer | CVPR | code | 436 |
| Look at Boundary: A Boundary-Aware Face Alignment Algorithm | CVPR | code | 416 |
| Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs | CVPR | code | 412 |
| End-to-End Recovery of Human Shape and Pose | CVPR | code | 388 |
| In-Place Activated BatchNorm for Memory-Optimized Training of DNNs | CVPR | code | 388 |
| ICNet for Real-Time Semantic Segmentation on High-Resolution Images | ECCV | code | 372 |
| The Unreasonable Effectiveness of Deep Features as a Perceptual Metric | CVPR | code | 360 |
| Distractor-aware Siamese Networks for Visual Object Tracking | ECCV | code | 350 |
| Frustum PointNets for 3D Object Detection From RGB-D Data | CVPR | code | 346 |
| Efficient Interactive Annotation of Segmentation Datasets With Polygon-RNN++ | CVPR | code | 339 |
| Gibson Env: Real-World Perception for Embodied Agents | CVPR | code | 332 |
| Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning | CVPR | code | 309 |
| Soccer on Your Tabletop | CVPR | code | 308 |
| Noise2Noise: Learning Image Restoration without Clean Data | ICML | code | 304 |
| GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose | CVPR | code | 301 |
| GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation | CVPR | code | 301 |
| Neural Baby Talk | CVPR | code | 292 |
| Acquisition of Localization Confidence for Accurate Object Detection | ECCV | code | 285 |
| The Lovász-Softmax Loss: A Tractable Surrogate for the Optimization of the Intersection-Over-Union Measure in Neural Networks | CVPR | code | 283 |
| PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume | CVPR | code | 283 |
| Fast End-to-End Trainable Guided Filter | CVPR | code | 274 |
| Adversarially Regularized Autoencoders | ICML | code | 261 |
| License Plate Detection and Recognition in Unconstrained Scenarios | ECCV | code | 258 |
| Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors | CVPR | code | 257 |
| Supervising Unsupervised Learning | NIPS | code | 255 |
| Pyramid Stereo Matching Network | CVPR | code | 250 |
| Convolutional Neural Networks With Alternately Updated Clique | CVPR | code | 250 |
| Deep Photo Enhancer: Unpaired Learning for Image Enhancement From Photographs With GANs | CVPR | code | 241 |
| Neural Relational Inference for Interacting Systems | ICML | code | 240 |
| Learning to Adapt Structured Output Space for Semantic Segmentation | CVPR | code | 239 |
| An intriguing failing of convolutional neural networks and the CoordConv solution | NIPS | code | 230 |
| Learning to Segment Every Thing | CVPR | code | 227 |
| LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation | CVPR | code | 223 |
| End-to-End Learning of Motion Representation for Video Understanding | CVPR | code | 222 |
| Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images | ECCV | code | 219 |
| Bilinear Attention Networks | NIPS | code | 216 |
| Iterative Visual Reasoning Beyond Convolutions | CVPR | code | 213 |
| Semi-Parametric Image Synthesis | CVPR | code | 213 |
| A Style-Aware Content Loss for Real-time HD Style Transfer | ECCV | code | 201 |
| Style Aggregated Network for Facial Landmark Detection | CVPR | code | 192 |
| Pose-Robust Face Recognition via Deep Residual Equivariant Mapping | CVPR | code | 189 |
| GraphRNN: Generating Realistic Graphs with Deep Auto-regressive Models | ICML | code | 186 |
| Referring Relationships | CVPR | code | 185 |
| MoCoGAN: Decomposing Motion and Content for Video Generation | CVPR | code | 184 |
| Compressed Video Action Recognition | CVPR | code | 180 |
| LayoutNet: Reconstructing the 3D Room Layout From a Single RGB Image | CVPR | code | 178 |
| ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation | ECCV | code | 176 |
| Latent Alignment and Variational Attention | NIPS | code | 172 |
| Multi-Content GAN for Few-Shot Font Style Transfer | CVPR | code | 170 |
| SPLATNet: Sparse Lattice Networks for Point Cloud Processing | CVPR | code | 166 |
| Attentive Generative Adversarial Network for Raindrop Removal From a Single Image | CVPR | code | 158 |
| Single View Stereo Matching | CVPR | code | 158 |
| Unsupervised Feature Learning via Non-Parametric Instance Discrimination | CVPR | code | 156 |
| An End-to-End TextSpotter With Explicit Alignment and Attention | CVPR | code | 156 |
| Social GAN: Socially Acceptable Trajectories With Generative Adversarial Networks | CVPR | code | 154 |
| ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing | CVPR | code | 153 |
| Evolved Policy Gradients | NIPS | code | 151 |
| Optimizing Video Object Detection via a Scale-Time Lattice | CVPR | code | 150 |
| Large-Scale Point Cloud Semantic Segmentation With Superpoint Graphs | CVPR | code | 150 |
| Learning Category-Specific Mesh Reconstruction from Image Collections | ECCV | code | 146 |
| Group Normalization | ECCV | code | 145 |
| DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks | CVPR | code | 142 |
| MegaDepth: Learning Single-View Depth Prediction From Internet Photos | CVPR | code | 142 |
| ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices | CVPR | code | 142 |
| Deep Clustering for Unsupervised Learning of Visual Features | ECCV | code | 139 |
| BSN: Boundary Sensitive Network for Temporal Action Proposal Generation | ECCV | code | 139 |
| Learning a Single Convolutional Super-Resolution Network for Multiple Degradations | CVPR | code | 139 |
| Facelet-Bank for Fast Portrait Manipulation | CVPR | code | 138 |
| Image Super-Resolution Using Very Deep Residual Channel Attention Networks | ECCV | code | 137 |
| ECO: Efficient Convolutional Network for Online Video Understanding | ECCV | code | 137 |
| PlaneNet: Piece-Wise Planar Reconstruction From a Single RGB Image | CVPR | code | 137 |
| Self-Imitation Learning | ICML | code | 136 |
| Residual Dense Network for Image Super-Resolution | CVPR | code | 134 |
| Embodied Question Answering | CVPR | code | 132 |
| Unsupervised Cross-Dataset Person Re-Identification by Transfer Learning of Spatial-Temporal Patterns | CVPR | code | 131 |
| Two-Stream Convolutional Networks for Dynamic Texture Synthesis | CVPR | code | 131 |
| Densely Connected Pyramid Dehazing Network | CVPR | code | 130 |
| Camera Style Adaptation for Person Re-Identification | CVPR | code | 128 |
| Neural Motifs: Scene Graph Parsing With Global Context | CVPR | code | 127 |
| Weakly and Semi Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer | CVPR | code | 125 |
| Relational recurrent neural networks | NIPS | code | 124 |
| LSTM Pose Machines | CVPR | code | 124 |
| SO-Net: Self-Organizing Network for Point Cloud Analysis | CVPR | code | 123 |
| Image-Image Domain Adaptation With Preserved Self-Similarity and Domain-Dissimilarity for Person Re-Identification | CVPR | code | 121 |
| Context Embedding Networks | CVPR | code | 120 |
| Fast and Accurate Online Video Object Segmentation via Tracking Parts | CVPR | code | 119 |
| Cross-Domain Weakly-Supervised Object Detection Through Progressive Domain Adaptation | CVPR | code | 119 |
| Learning to Compare: Relation Network for Few-Shot Learning | CVPR | code | 118 |
| Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining | ECCV | code | 116 |
| Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships | CVPR | code | 116 |
| MVSNet: Depth Inference for Unstructured Multi-view Stereo | ECCV | code | 116 |
| Weakly Supervised Instance Segmentation Using Class Peak Response | CVPR | code | 116 |
| L4: Practical loss-based stepsize adaptation for deep learning | NIPS | code | 116 |
| A Closer Look at Spatiotemporal Convolutions for Action Recognition | CVPR | code | 115 |
| Unsupervised Learning of Monocular Depth Estimation and Visual Odometry With Deep Feature Reconstruction | CVPR | code | 114 |
| Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling | CVPR | code | 114 |
| MultiPoseNet: Fast Multi-Person Pose Estimation using Pose Residual Network | ECCV | code | 113 |
| Gated Path Planning Networks | ICML | code | 113 |
| PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning | CVPR | code | 110 |
| Decoupled Networks | CVPR | code | 109 |
| Video Based Reconstruction of 3D People Models | CVPR | code | 109 |
| CosFace: Large Margin Cosine Loss for Deep Face Recognition | CVPR | code | 109 |
| DeepMVS: Learning Multi-View Stereopsis | CVPR | code | 108 |
| Hierarchical Imitation and Reinforcement Learning | ICML | code | 107 |
| Real-Time Seamless Single Shot 6D Object Pose Prediction | CVPR | code | 107 |
| Adaptive Affinity Fields for Semantic Segmentation | ECCV | code | 107 |
| Long-term Tracking in the Wild: a Benchmark | ECCV | code | 106 |
| Realistic Evaluation of Deep Semi-Supervised Learning Algorithms | NIPS | code | 106 |
| Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics | CVPR | code | 104 |
| Deep Back-Projection Networks for Super-Resolution | CVPR | code | 104 |
| 3D-CODED: 3D Correspondences by Deep Deformation | ECCV | code | 102 |
| Recovering Realistic Texture in Image Super-Resolution by Deep Spatial Feature Transform | CVPR | code | 102 |
| Scale-Recurrent Network for Deep Image Deblurring | CVPR | code | 101 |
| PU-Net: Point Cloud Upsampling Network | CVPR | code | 101 |
| Noisy Natural Gradient as Variational Inference | ICML | code | 100 |
| Domain Adaptive Faster R-CNN for Object Detection in the Wild | CVPR | code | 99 |
| Rethinking Feature Distribution for Loss Functions in Image Classification | CVPR | code | 97 |
| DenseASPP for Semantic Segmentation in Street Scenes | CVPR | code | 97 |
| Quantized Densely Connected U-Nets for Efficient Landmark Localization | ECCV | code | 97 |
| Graph R-CNN for Scene Graph Generation | ECCV | code | 96 |
| Factoring Shape, Pose, and Layout From the 2D Image of a 3D Scene | CVPR | code | 94 |
| Density-Aware Single Image De-Raining Using a Multi-Stream Dense Network | CVPR | code | 93 |
| Deep Depth Completion of a Single RGB-D Image | CVPR | code | 93 |
| MAttNet: Modular Attention Network for Referring Expression Comprehension | CVPR | code | 92 |
| Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis | ICML | code | 91 |
| ELEGANT: Exchanging Latent Encodings with GAN for Transferring Multiple Face Attributes | ECCV | code | 89 |
| Neural Arithmetic Logic Units | NIPS | code | 87 |
| Perturbative Neural Networks | CVPR | code | 86 |
| Knowledge Aided Consistency for Weakly Supervised Phrase Grounding | CVPR | code | 86 |
| Repulsion Loss: Detecting Pedestrians in a Crowd | CVPR | code | 86 |
| End-to-End Weakly-Supervised Semantic Alignment | CVPR | code | 86 |
| Learning Blind Video Temporal Consistency | ECCV | code | 84 |
| PSANet: Point-wise Spatial Attention Network for Scene Parsing | ECCV | code | 84 |
| Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights | ECCV | code | 83 |
| Nonlinear 3D Face Morphable Model | CVPR | code | 81 |
| Deep Mutual Learning | CVPR | code | 80 |
| Image Inpainting for Irregular Holes Using Partial Convolutions | ECCV | code | 79 |
| BodyNet: Volumetric Inference of 3D Human Body Shapes | ECCV | code | 78 |
| Integral Human Pose Regression | ECCV | code | 77 |
| FSRNet: End-to-End Learning Face Super-Resolution With Facial Priors | CVPR | code | 77 |
| Attention-based Deep Multiple Instance Learning | ICML | code | 77 |
| LiDAR-Video Driving Dataset: Learning Driving Policies Effectively | CVPR | code | 77 |
| Multi-View Consistency as Supervisory Signal for Learning Shape and Pose Prediction | CVPR | code | 76 |
| Macro-Micro Adversarial Network for Human Parsing | ECCV | code | 76 |
| Multi-view to Novel view: Synthesizing novel views with Self-Learned Confidence | ECCV | code | 75 |
| LQ-Nets: Learned Quantization for Highly Accurate and Compact Deep Neural Networks | ECCV | code | 75 |
| Neural Kinematic Networks for Unsupervised Motion Retargetting | CVPR | code | 75 |
| Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking | CVPR | code | 75 |
| Synthesizing Images of Humans in Unseen Poses | CVPR | code | 74 |
| A PID Controller Approach for Stochastic Optimization of Deep Networks | CVPR | code | 74 |
| Tell Me Where to Look: Guided Attention Inference Network | CVPR | code | 74 |
| Multi-Scale Location-Aware Kernel Representation for Object Detection | CVPR | code | 73 |
| Recurrent Relational Networks | NIPS | code | 73 |
| VITON: An Image-Based Virtual Try-On Network | CVPR | code | 73 |
| VITAL: VIsual Tracking via Adversarial Learning | CVPR | code | 73 |
| Future Frame Prediction for Anomaly Detection – A New Baseline | CVPR | code | 72 |
| Recurrent Pixel Embedding for Instance Grouping | CVPR | code | 71 |
| Learning Human-Object Interactions by Graph Parsing Neural Networks | ECCV | code | 69 |
| Repeatability Is Not Enough: Learning Affine Regions via Discriminability | ECCV | code | 67 |
| Visual Feature Attribution Using Wasserstein GANs | CVPR | code | 67 |
| Avatar-Net: Multi-Scale Zero-Shot Style Transfer by Feature Decoration | CVPR | code | 66 |
| Learning SO(3) Equivariant Representations with Spherical CNNs | ECCV | code | 64 |
| Factorizable Net: An Efficient Subgraph-based Framework for Scene Graph Generation | ECCV | code | 64 |
| SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation | CVPR | code | 64 |
| ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans | CVPR | code | 64 |
| One-Shot Unsupervised Cross Domain Translation | NIPS | code | 62 |
| Pairwise Confusion for Fine-Grained Visual Classification | ECCV | code | 62 |
| Multi-Shot Pedestrian Re-Identification via Sequential Decision Making | CVPR | code | 62 |
| Generalizing A Person Retrieval Model Hetero- and Homogeneously | ECCV | code | 61 |
| Learning Depth From Monocular Videos Using Direct Methods | CVPR | code | 61 |
| Optimizing the Latent Space of Generative Networks | ICML | code | 60 |
| CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes | CVPR | code | 59 |
| “Zero-Shot” Super-Resolution Using Deep Internal Learning | CVPR | code | 59 |
| Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking | CVPR | code | 59 |
| PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition | CVPR | code | 58 |
| Progressive Neural Architecture Search | ECCV | code | 58 |
| Generative Neural Machine Translation | NIPS | code | 58 |
| Learning to Reweight Examples for Robust Deep Learning | ICML | code | 58 |
| Object Level Visual Reasoning in Videos | ECCV | code | 57 |
| Generate to Adapt: Aligning Domains Using Generative Adversarial Networks | CVPR | code | 57 |
| Improving Generalization via Scalable Neighborhood Component Analysis | ECCV | code | 57 |
| Geometry-Aware Learning of Maps for Camera Localization | CVPR | code | 57 |
| Path-Level Network Transformation for Efficient Architecture Search | ICML | code | 57 |
| Decorrelated Batch Normalization | CVPR | code | 57 |
| Ordinal Depth Supervision for 3D Human Pose Estimation | CVPR | code | 57 |
| Disentangled Person Image Generation | CVPR | code | 57 |
| Regularizing RNNs for Caption Generation by Reconstructing the Past With the Present | CVPR | code | 57 |
| Diverse Image-to-Image Translation via Disentangled Representations | ECCV | code | 56 |
| Pointwise Convolutional Neural Networks | CVPR | code | 56 |
| Neural Program Synthesis from Diverse Demonstration Videos | ICML | code | 56 |
| Learning Less Is More - 6D Camera Localization via 3D Surface Regression | CVPR | code | 55 |
| Unsupervised Domain Adaptation for 3D Keypoint Estimation via View Consistency | ECCV | code | 55 |
| Learning Latent Super-Events to Detect Multiple Activities in Videos | CVPR | code | 55 |
| Depth-aware CNN for RGB-D Segmentation | ECCV | code | 55 |
| Crafting a Toolchain for Image Restoration by Deep Reinforcement Learning | CVPR | code | 54 |
| Unsupervised Discovery of Object Landmarks as Structural Representations | CVPR | code | 54 |
| [ | | | |