Data Preparation

April 9, 2025

In this study, we clean the pre-training dataset in two steps: filtering out corrupted images, then removing large black borders from the remaining images.
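The two cleaning steps can be sketched roughly as follows. This is a minimal illustration, not the repository's actual cleaning script: it assumes Pillow and NumPy are available, and the function names `is_readable` and `crop_black_borders`, as well as the darkness `threshold` of 10, are hypothetical choices made for this example.

```python
import numpy as np
from PIL import Image


def is_readable(path):
    """Return True if Pillow can fully decode the image at `path`."""
    try:
        with Image.open(path) as img:
            img.load()  # force a full decode; truncated files fail here
        return True
    except (OSError, SyntaxError, ValueError):
        # Unidentified, truncated, or otherwise undecodable file.
        return False


def crop_black_borders(pixels, threshold=10):
    """Crop outer rows/columns whose pixels are all <= `threshold`.

    `pixels` is an H x W (grayscale) or H x W x C array.
    `threshold` is an assumed cutoff, not a value taken from the study.
    """
    # Treat a pixel as "content" if any channel is brighter than the cutoff.
    gray = pixels if pixels.ndim == 2 else pixels.max(axis=2)
    mask = gray > threshold
    if not mask.any():
        return pixels  # entirely dark image; leave it unchanged
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    # Keep the tight bounding box around all content pixels.
    return pixels[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]


def clean_image(path):
    """Return the border-cropped pixel array, or None if the file is corrupted."""
    if not is_readable(path):
        return None
    with Image.open(path) as img:
        return crop_black_borders(np.asarray(img.convert("RGB")))
```

Calling `clean_image` on each file and skipping `None` results would yield the filtered, cropped set. A real pipeline may want a stricter notion of "large" border (for example, cropping only when the border exceeds some fraction of the image), which this sketch does not attempt.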

Once you have downloaded the dataset, please update the data roots in data_utils/data_path.py to point to its location.