(1) Synthetic Scene Text Dataset from 3D World

August 6, 2020

| Languages | Num of Images | Num of Text | Baidu Drive | Google Drive |
|---|---|---|---|---|
| English/Latin | 728K | ~20M | Link (password: 2h8d) | Link |
| Multilingual | 674K | ~18M | Link (password: tddl) | Link |

The multilingual version consists of the following 10 languages: Arabic, English, French, Chinese, German, Korean, Japanese, Italian, Bangla, and Hindi.

Both datasets are very large (~150GB). Therefore, I split them into "several" files (~130). They are organized as follows:

./
+---sub_0
    +---imgs
    |   0.jpg
    |   1.jpg
    |   ...
    |
    +---labels
    |   0.json
    |   1.json
    |   ...
    |
+---sub_1
+---sub_2
+---sub_3
...
+---sub_100
...
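The layout above can be walked with a few lines of Python. The following is a minimal sketch, not part of the released tooling: the function name `iter_samples` is hypothetical, and it assumes that `labels/N.json` describes `imgs/N.jpg` inside the same `sub_*` directory, which the tree suggests but does not state.

```python
from pathlib import Path
from typing import Iterator, Tuple


def iter_samples(root: str) -> Iterator[Tuple[Path, Path]]:
    """Yield (image_path, label_path) pairs from every sub_* directory under root.

    Assumption: labels/N.json belongs to imgs/N.jpg in the same sub_* directory.
    """
    root_dir = Path(root)
    # Sort numerically so that sub_2 comes before sub_10.
    sub_dirs = sorted(
        (d for d in root_dir.glob("sub_*") if d.is_dir() and d.name[4:].isdigit()),
        key=lambda d: int(d.name[4:]),
    )
    for sub_dir in sub_dirs:
        label_files = sorted(
            (p for p in (sub_dir / "labels").glob("*.json") if p.stem.isdigit()),
            key=lambda p: int(p.stem),
        )
        for label_path in label_files:
            image_path = sub_dir / "imgs" / f"{label_path.stem}.jpg"
            # Skip labels whose image is missing (e.g. a partially downloaded split).
            if image_path.exists():
                yield image_path, label_path
```

Because the function is a generator, it does not need to list all files up front, which may help given the size of the dataset.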

The labels are stored in the following format:

{
    "imgfile": str, path to the corresponding image file, e.g. "imgs/0.jpg",
    "bbox": List[
                word_i (8 floats): [x0, y0, x1, y1, x2, y2, x3, y3]
                (from upper left corner, clockwise),
            ],
    "cbox": List[
                char_i (8 floats): [x0, y0, x1, y1, x2, y2, x3, y3]
                (from upper left corner, clockwise),
            ],
    "text": List[str]
}
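As an illustration of the format above, a label file could be read into a small typed structure as sketched below. The names `Label`, `to_points`, and `load_label` are hypothetical, and the sketch assumes that `imgfile` is relative to the `sub_*` directory containing `labels/`, as the example value `"imgs/0.jpg"` suggests.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


@dataclass
class Label:
    image_path: Path
    word_boxes: List[List[Point]]  # one 4-point quadrilateral per word
    char_boxes: List[List[Point]]  # one 4-point quadrilateral per character
    texts: List[str]


def to_points(flat: Sequence[float]) -> List[Point]:
    """Turn 8 flat coordinates into 4 (x, y) corners, upper-left first, clockwise."""
    if len(flat) != 8:
        raise ValueError(f"expected 8 coordinates, got {len(flat)}")
    return [(float(flat[i]), float(flat[i + 1])) for i in range(0, 8, 2)]


def load_label(label_path: str) -> Label:
    path = Path(label_path)
    with path.open(encoding="utf-8") as f:
        raw = json.load(f)
    # Assumption: "imgfile" is relative to the sub_* directory holding labels/.
    sub_dir = path.parent.parent
    return Label(
        image_path=sub_dir / raw["imgfile"],
        word_boxes=[to_points(box) for box in raw["bbox"]],
        char_boxes=[to_points(box) for box in raw["cbox"]],
        texts=list(raw["text"]),
    )
```

Keeping the corners as `(x, y)` pairs rather than a flat list makes later geometry, such as measuring box height, easier to read.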

Note that a very small proportion of the labels may be wrong, caused by defects in some scene models. These wrong samples are characterized by very small sizes, so you can discard them by filtering out word boxes that are less than 10 pixels high.
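One possible way to apply the 10-pixel filter is sketched below. The helper names `box_height` and `filter_small_words` are hypothetical. Since "high" is not defined precisely for a quadrilateral, this sketch takes the height to be the mean length of the left and right edges, which is an assumption; it also assumes that `text[i]` belongs to `bbox[i]`. Character boxes are left untouched because the format does not say how characters map to words.

```python
import math
from typing import Any, Dict, Sequence


def box_height(box: Sequence[float]) -> float:
    """Approximate height of a word box given as [x0, y0, ..., x3, y3].

    Corners start at the upper left and go clockwise, so the left edge is
    corner 0 -> corner 3 and the right edge is corner 1 -> corner 2.
    Assumption: height = mean length of those two edges.
    """
    x0, y0, x1, y1, x2, y2, x3, y3 = box
    left = math.hypot(x3 - x0, y3 - y0)
    right = math.hypot(x2 - x1, y2 - y1)
    return (left + right) / 2.0


def filter_small_words(label: Dict[str, Any], min_height: float = 10.0) -> Dict[str, Any]:
    """Return a copy of the label without word boxes shorter than min_height."""
    kept = [
        (box, text)
        for box, text in zip(label["bbox"], label["text"])
        if box_height(box) >= min_height
    ]
    filtered = dict(label)  # shallow copy; "cbox" is carried over unchanged
    filtered["bbox"] = [box for box, _ in kept]
    filtered["text"] = [text for _, text in kept]
    return filtered
```

Using edge lengths instead of the axis-aligned extent keeps the measure meaningful for rotated or perspective-distorted text, though either choice would be a reasonable reading of the note above.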

(2) Demo UE Project(s)

| Scene Name | Baidu Drive | Google Drive |
|---|---|---|
| Realistic Rendering | Link (password: wgja) | Link |

How-to:

  1. download and uncompress the project
  2. in UE4.22, load the following file: Demo/Demo.uproject

(3) UnrealText resources

| Resources | Baidu Drive | Google Drive |
|---|---|---|
| background images | Link (password: 3x3r) | Link |
| fonts & corpus | Link (password: ip8w) | Link |

(4) Packaged Scene Executables

| Scenes | Baidu Drive | Google Drive |
|---|---|---|
| All 30 scene executables | Link (password: br31) | Link |

How-to:

  1. download and uncompress the project
  2. navigate to $Name/$Name/Binaries/Linux/ and double-click the executable Demo
  3. alternatively, launch it from a terminal: ./$Name/$Name/Binaries/Linux/Demo