EATEN: Entity-aware Attention for Single Shot Visual Text Extraction
December 29, 2019
Accepted to ICDAR 2019 arxiv
Authors: He Guo, Xiameng Qin, Jiaming Liu, Junyu Han, Jingtuo Liu and Errui Ding
Abstract
This repository is designed to provide an open-source dataset for Visual Text Extraction.
Samples
Train ticket
Real images

Synthetic images
Some clean images

Some hard images

Passport
Some images

Some hard images

Business card

Downloads
The dataset can be downloaded through the following link:
baiduyun, PASSWORD: e4z1
Some details:
| scenes | number | size | Google Drive link |
|---|---|---|---|
| train ticket | 300k synth + 1.9k real | 13G | dataset_trainticket.tar |
| passport | 100k synth | 5.8G | dataset_passport.tar |
| business card | 200k synth | 19G | dataset_business.tar.0 dataset_business.tar.1 dataset_business.tar.2 dataset_business.tar.3 |
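The business card set is listed as four files (`dataset_business.tar.0` to `dataset_business.tar.3`). The README does not say how they were produced; the sketch below assumes they are byte-wise splits of a single tar archive that must be concatenated in numeric order before extraction. If they turn out to be independent archives, skip the join step and extract each one directly. The helper names `join_parts` and `extract_tar` are hypothetical and used only for illustration.

```python
import shutil
import tarfile
from pathlib import Path


def join_parts(parts, out_path):
    """Concatenate archive parts, in the given order, into one file.

    Assumption: the parts are plain byte-wise splits of one archive.
    """
    out_path = Path(out_path)
    with out_path.open("wb") as out:
        for part in parts:
            with Path(part).open("rb") as src:
                # Stream in chunks so multi-gigabyte parts never sit in memory.
                shutil.copyfileobj(src, out)
    return out_path


def extract_tar(tar_path, dest):
    """Extract a tar archive into dest and return the member names."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


if __name__ == "__main__":
    # File names taken from the table above; the output names are arbitrary.
    parts = [f"dataset_business.tar.{i}" for i in range(4)]
    archive = join_parts(parts, "dataset_business.tar")
    extract_tar(archive, "dataset_business")
```

The single-file archives (`dataset_trainticket.tar`, `dataset_passport.tar`) would only need the `extract_tar` step. Joining writes a second full copy of the data, so the business card set needs roughly twice its listed size in free disk space until the parts are deleted.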
Limitations and TODO
- [A large amount of training data]
  Todo:
  - Use CycleGAN or domain adaptation to synthesize data for training EATEN.
  - Introduce STR (scene text recognition) datasets to EATEN.
- [Generalization to complex scenes]
  Todo:
  - Add bounding box annotations of ToIs (text of interest) to EATEN, as in the ICCV 2019 oral paper Towards Unconstrained End-to-End Text Spotting.
- [Engineering]
  - Merge the several decoders into one.
  - Parallel decoding.