Datasets
August 21, 2022 ยท View on GitHub
Text
The textual data of each dataset in our benchmark can be found here.
Images
The raw images can be downloaded from the original websites:
We are trying to define Terms of Use that will allow us to collect and re-distribute all the images used in IGLUE in a single site.
Image Features
We also provide access to our processed image data (i.e. image features):
As the size of lmdb directories increased significantly upon upload,
we release the H5 (36 boxes, ResNet-101) and (compressed) directories of npy files (10-100 boxes, ResNeXt-101).
After donwloading them, you can convert them into LMDB format by executing the h5_to_lmdb or npy_to_lmdb scripts.
You can find the scripts for features extraction under features_extraction/.