Min-Max-Imagenet DiT
April 28, 2024
In a similar spirit to Keller Jordan's fastest CIFAR-10 training, I want to be the fastest diffusion trainer in the east. I'll track the progress here. Currently very much a WIP.
Featuring:
- DeepSpeed training of a Diffusion Transformer (DiT). Supports ZeRO stages 1, 2, and 3.
- CPU-offloaded, skipped EMA trick for Karras' post-hoc EMA analysis, where you update the EMA only once every N steps instead of every step. You have to adjust beta_1 and beta_2 so they properly account for the N-1 steps you skipped. Of course, the saving code is included.
- Streaming Dataset support, specifically my quantized imagenet.int8, for insanely lightweight ImageNet training.
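As a rough illustration of the beta adjustment mentioned in the EMA bullet, here is a minimal sketch, not the code from this repo. It assumes beta_1 and beta_2 are the per-step decay factors of the two EMA profiles, and that "adjusting" means matching the total decay the old average would have received over the skipped steps. For a constant decay that is beta to the power N; for a power-function schedule of the form beta_t = (1 - 1/t)^(gamma + 1), the product over N steps telescopes to ((t - N) / t)^(gamma + 1). All function names and the gamma value in the demo are hypothetical. The skipped EMA only approximates the per-step one, since the intermediate weights are never seen.

```python
from typing import List


def skipped_constant_beta(beta: float, n: int) -> float:
    """Decay for a constant-beta EMA that is updated once every n steps.

    Applying `beta` n times multiplies the old average by beta ** n.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return beta ** n


def skipped_power_beta(step: int, n: int, gamma: float) -> float:
    """Decay for a power-function EMA that is updated once every n steps.

    Assumes the per-step decay is beta_t = (1 - 1/t) ** (gamma + 1).
    The product over t = step-n+1 .. step telescopes to
    ((step - n) / step) ** (gamma + 1).
    """
    if step <= 0 or n <= 0:
        raise ValueError("step and n must be positive")
    return (max(step - n, 0) / step) ** (gamma + 1)


def ema_update(ema: List[float], params: List[float], beta: float) -> List[float]:
    """One EMA update on plain Python lists (stand-ins for CPU tensors)."""
    return [beta * e + (1.0 - beta) * p for e, p in zip(ema, params)]


if __name__ == "__main__":
    N = 8          # hypothetical: update the EMA once every 8 training steps
    GAMMA = 6.94   # illustrative value only
    params = [0.0, 0.0]
    ema = list(params)
    for step in range(1, 65):
        params = [p + 0.01 for p in params]  # stand-in for an optimizer step
        if step % N == 0:
            beta = skipped_power_beta(step, N, GAMMA)
            ema = ema_update(ema, params, beta)
    print(ema)
```

Running the update only on every N-th step is what makes CPU offloading cheap: the weights are copied to the host once per N steps rather than on every step.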
Dataset
Since this dataset is so small, you don't need to set up any heavy remote data infrastructure: just point local_dir at your local copy and set remote_dir to None.
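A local-only setup might look like the sketch below. It assumes "Streaming Dataset" refers to MosaicML's `streaming` library, whose `StreamingDataset` accepts `local` and `remote` keyword arguments, and that local_dir and remote_dir map onto those two arguments. The helper names, batch size, and example path are hypothetical, and any custom decoding the quantized dataset may need is not covered here.

```python
from typing import Any, Dict


def local_dataset_kwargs(local_dir: str, batch_size: int = 32) -> Dict[str, Any]:
    """Keyword arguments for a dataset that is read purely from local disk."""
    return {
        "local": local_dir,   # directory that already holds the dataset shards
        "remote": None,       # no remote store: nothing is downloaded
        "shuffle": True,
        "batch_size": batch_size,
    }


def build_dataset(local_dir: str, batch_size: int = 32):
    """Create the dataset; needs the `streaming` package to be installed."""
    # Imported lazily so the kwargs helper works without the dependency.
    from streaming import StreamingDataset

    return StreamingDataset(**local_dataset_kwargs(local_dir, batch_size))


if __name__ == "__main__":
    # "./imagenet_int8" is a hypothetical path; use wherever you stored the data.
    print(local_dataset_kwargs("./imagenet_int8"))
```

With `remote` set to None, the library reads the shards directly from the local directory, so no cloud credentials or cache settings are needed.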
Running
For a single-node setup, just run:
run.sh
What's the goal here?
My goal is to reach an FID score of 30 in under 20 hours of training. I'll keep updating this README as I make progress.