Dlib-ImageNet-Datasets

May 8, 2025

Preprocessed Stable ImageNet-1K datasets for efficient computer vision prototyping with Dlib.

📌 Overview

This repository provides ready-to-use ImageNet-1K datasets preprocessed in multiple resolutions (32×32 to 256×256) for the Dlib machine learning library. Designed for rapid experimentation, benchmarking, and model training, these datasets eliminate preprocessing overhead while ensuring consistency across experiments.

🚀 Immediately available:

  • A ready-to-use 32×32 resolution dataset (ideal for lightweight model prototyping) in the /dataset directory.

🛠️ Flexible generation - The included C++14 tool lets you create custom datasets in any resolution (e.g., 64×64, 128×128, 224×224) from raw ImageNet-1K sources. Perfect for:

  • Rapid experimentation
  • Resolution-impact benchmarking
  • Consistent model training pipelines
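
To get a feel for how resolution affects storage before generating a dataset, the raw pixel footprint can be estimated as image count × side² × 3 bytes for RGB. The helper below is a back-of-the-envelope sketch, not part of the repository: `raw_rgb_bytes` is a hypothetical name, and the size of an actual .dat file depends on how the tool serializes the data.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the repository): uncompressed size in
// bytes of `num_images` square RGB images with `side` pixels per edge.
// Each pixel is assumed to store 3 channels of 1 byte each.
constexpr std::uint64_t raw_rgb_bytes(std::uint64_t num_images,
                                      std::uint64_t side) {
    return num_images * side * side * 3;
}
```

For example, `raw_rgb_bytes(1000, 32)` gives 3,072,000 bytes, and moving from 32×32 to 224×224 multiplies the raw footprint by 49, since (224/32)² = 49.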

🔧 Usage - The raw "Stable ImageNet-1K" images can be downloaded from:
→ Kaggle Dataset

  1. Download and extract the compressed file locally
  2. Pass the extraction root directory to the processing tool:
./create_dataset path/to/extracted_folder output_dataset.dat 224  # Example for 224x224
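
The invocation above takes three positional arguments: the input root, the output file, and the target size. As a minimal sketch of how a tool with this interface might validate them (an assumption for illustration, not the actual code in create_dataset.cpp; `ToolArgs` and `parse_args` are hypothetical names):

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical container for the three positional arguments.
struct ToolArgs {
    std::string input_root;   // directory holding the extracted images
    std::string output_file;  // path of the .dat file to write
    int size;                 // target edge length in pixels
};

// Hypothetical parser: expects exactly <input_root> <output_file> <size>.
ToolArgs parse_args(int argc, const char* const argv[]) {
    if (argc != 4)
        throw std::invalid_argument(
            "usage: create_dataset <input_root> <output_file> <size>");

    ToolArgs args{argv[1], argv[2], 0};

    // std::stoi throws std::invalid_argument for non-numeric input and
    // std::out_of_range for values that do not fit in an int.
    const std::string size_text = argv[3];
    std::size_t consumed = 0;
    args.size = std::stoi(size_text, &consumed);

    // Reject trailing characters (e.g. "22x") and non-positive sizes.
    if (consumed != size_text.size() || args.size <= 0)
        throw std::invalid_argument("size must be a positive integer");

    return args;
}
```

Calling `parse_args(argc, argv)` at the top of `main` and catching the exception keeps a mistyped size from silently producing a dataset at the wrong resolution.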

Create Custom Datasets

Compile and run the included tool to process raw ImageNet-1K images:

g++ -std=c++14 src/create_dataset.cpp -o create_dataset -ldlib -lpthread
./create_dataset path/to/imagenet_root datasets/128x128/imagenet_128.dat 128
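
When generating several resolutions, it helps to keep output paths consistent. The sketch below builds a path following the `datasets/<N>x<N>/imagenet_<N>.dat` pattern used in the examples; `dataset_path` is a hypothetical helper, not something the repository provides.

```cpp
#include <string>

// Hypothetical helper: builds an output path that follows the
// datasets/<N>x<N>/imagenet_<N>.dat naming pattern for edge length `side`.
std::string dataset_path(int side) {
    const std::string n = std::to_string(side);
    return "datasets/" + n + "x" + n + "/imagenet_" + n + ".dat";
}
```

For instance, `dataset_path(128)` yields `datasets/128x128/imagenet_128.dat`. Whether the tool creates missing output directories is not stated here, so creating the target directory beforehand is a safe assumption.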

Evaluate Models

Load pre-split training/testing sets:

std::vector<dlib::matrix<dlib::rgb_pixel>> train_images, test_images;
std::vector<unsigned long> train_labels, test_labels;
dlib::load_stable_imagenet_1k("datasets/64x64/imagenet_64.dat", train_images, train_labels, test_images, test_labels);
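
Once the test split is loaded, a common first metric is top-1 accuracy: the fraction of test images whose predicted label matches `test_labels`. The function below is a minimal, library-independent sketch; `top1_accuracy` is a hypothetical name, and the predicted labels are assumed to come from whatever model is being evaluated.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: fraction of positions where the predicted label
// equals the ground-truth label. Returns 0.0 for an empty test set.
double top1_accuracy(const std::vector<unsigned long>& predicted,
                     const std::vector<unsigned long>& truth) {
    if (predicted.size() != truth.size())
        throw std::invalid_argument(
            "predicted and truth must have the same length");
    if (truth.empty())
        return 0.0;

    std::size_t correct = 0;
    for (std::size_t i = 0; i < truth.size(); ++i) {
        if (predicted[i] == truth[i])
            ++correct;
    }
    return static_cast<double>(correct) / static_cast<double>(truth.size());
}
```

With the vectors from the snippet above, this would be called as `top1_accuracy(predicted_labels, test_labels)`, where `predicted_labels` holds one prediction per entry of `test_images`.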