For DiTs

March 14, 2025

cd generation/DiT

Training

sh train_dit.sh

Note:

${model}: one of "DiT-XL/2-VAE-simple", "DiT-L/2-VAE-simple", or "DiT-B/2-VAE-simple".
During training, samples are generated every ${eval-every} steps and saved as an NPZ file for evaluation. You can also use the script below to sample from any saved fine-tuned weights.
The default ${global-batch-size} is 256.

Infer

sh sample_dit.sh

This script samples from the weights saved during fine-tuning and saves the generated images as an NPZ file, which can then be used to evaluate metrics.
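Before running the evaluation, it can help to sanity-check the NPZ file. The sketch below is not part of this repository. It assumes the samples are stored under the key `arr_0` as a `uint8` array of shape `(N, H, W, 3)`, which is the layout ADM's evaluation suite reads; the helper name `summarize_sample_npz` and the demo file are hypothetical. Adjust the key if sample_dit.sh writes a different layout.

```python
import os
import tempfile

import numpy as np


def summarize_sample_npz(path, key="arr_0"):
    """Load a sample NPZ file and return basic facts about its image array.

    Assumes (not confirmed by this README) that images live under `key`
    as a uint8 array of shape (N, H, W, 3).
    """
    with np.load(path) as data:
        if key not in data.files:
            raise KeyError(f"{key!r} not found; available keys: {data.files}")
        images = data[key]
    if images.ndim != 4 or images.shape[-1] != 3:
        raise ValueError(f"expected shape (N, H, W, 3), got {images.shape}")
    if images.dtype != np.uint8:
        raise ValueError(f"expected dtype uint8, got {images.dtype}")
    return {
        "num_images": images.shape[0],
        "height": images.shape[1],
        "width": images.shape[2],
    }


# Demo on a small synthetic file (hypothetical data, not real model samples).
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "demo_samples.npz")
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(8, 256, 256, 3), dtype=np.uint8)
np.savez(demo_path, arr_0=fake_images)

summary = summarize_sample_npz(demo_path)
print(summary)
```

To check a real run, pass the path of the NPZ file produced by the sampling script in place of `demo_path`.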

Eval

sh eval_dit.sh

Note:

As in DiT, we use ADM's TensorFlow evaluation suite to calculate FID, Inception Score, and other metrics.
The reference batch VIRTUAL_imagenet256_labeled.npz can be downloaded from ADM's TensorFlow evaluation suite.

For SiTs

cd generation/SiT

Training

sh train_sit.sh

Note:

${model}: either "SiT-XL/2-VAE-simple" or "SiT-B/2-VAE-simple".
During training, samples are generated every ${eval-every} steps and saved as an NPZ file for evaluation. You can also use the script below to sample from any saved fine-tuned weights.
The default ${global-batch-size} is 256.

Infer

sh sample_sit.sh

This script samples from the weights saved during fine-tuning and saves the generated images as an NPZ file, which can then be used to evaluate metrics.

Eval