Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis
October 2, 2025
ICCV 2025 | Official Code Release
This repository hosts the official implementation of our ICCV 2025 paper:
"Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis"
Up to 2× speedup for high-resolution image synthesis with minimal quality drop.
Abstract
Visual autoregressive modeling, based on the next-scale prediction paradigm, generates images by progressively refining resolution across multiple stages. However, the computational overhead in high-resolution stages remains a challenge due to the large number of tokens.
We introduce SparseVAR, a plug-and-play acceleration framework that dynamically excludes low-frequency tokens during inference without any additional training. On Infinity-2B, SparseVAR achieves up to 2× speedup with minimal quality degradation.
Method
Method Diagram: assets/method_exit.png

Highlights
- No retraining required
- Dynamic skipping of low-frequency tokens
- Compatible with Infinity and HART
- Up to 2× faster high-resolution inference
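As a rough illustration of what "dynamic skipping of low-frequency tokens" can look like, the sketch below scores each position of a 2D token map by how much it differs from its neighbours and keeps only the highest-scoring fraction. This is not the implementation in this repository: the neighbour-difference score, the `keep_ratio` parameter, and the function names `high_frequency_score` and `select_tokens` are all assumptions made for illustration.

```python
import numpy as np


def high_frequency_score(feature_map: np.ndarray) -> np.ndarray:
    """Score each position by its distance from the mean of its 4 neighbours.

    feature_map: array of shape (H, W, C).
    Returns an (H, W) array; larger values suggest more high-frequency content.
    (Illustrative score only, not the criterion used by SparseVAR.)
    """
    # Replicate the border so every position has four neighbours.
    padded = np.pad(feature_map, ((1, 1), (1, 1), (0, 0)), mode="edge")
    neighbour_mean = (
        padded[:-2, 1:-1]    # neighbour above
        + padded[2:, 1:-1]   # neighbour below
        + padded[1:-1, :-2]  # neighbour to the left
        + padded[1:-1, 2:]   # neighbour to the right
    ) / 4.0
    return np.linalg.norm(feature_map - neighbour_mean, axis=-1)


def select_tokens(feature_map: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Return a boolean (H, W) mask of the tokens that would still be processed."""
    scores = high_frequency_score(feature_map)
    num_keep = max(1, int(round(keep_ratio * scores.size)))
    # Indices of the num_keep highest-scoring positions (order not guaranteed).
    keep_idx = np.argpartition(scores.ravel(), -num_keep)[-num_keep:]
    mask = np.zeros(scores.size, dtype=bool)
    mask[keep_idx] = True
    return mask.reshape(scores.shape)


if __name__ == "__main__":
    # Toy map: flat left half, flat right half, one sharp vertical edge.
    tokens = np.zeros((8, 8, 4))
    tokens[:, 4:, :] = 1.0
    mask = select_tokens(tokens, keep_ratio=0.25)
    print(mask.astype(int))  # only the two columns next to the edge are kept
```

In this toy example the flat regions score zero, so only the tokens adjacent to the edge survive; the tokens that are dropped are the ones a skipping scheme would not recompute at that scale.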
Installation
git clone https://github.com/Caesarhhh/SparseVAR.git
cd SparseVAR
pip install -r requirements.txt
Main Repository Structure
SparseVAR/
├── infinity/                 # Infinity integration
│   ├── scripts/
│   │   ├── eval_sparsevar.sh
│   │   └── eval_baseline.sh
│   ├── weights/              # place Infinity weights here
│   ├── evaluation/           # evaluation configs & data
│   └── cus_datasets/
│
├── hart/                     # HART integration
│   ├── scripts/
│   │   ├── eval_sparsevar.sh
│   │   └── eval_baseline.sh
│   ├── weights/              # place HART weights here
│   ├── evaluation/           # evaluation configs & data
│   └── cus_datasets/
│
├── requirements.txt
└── assets/
    └── method_exit.png
Usage: enter the infinity/ or hart/ folder and run the evaluation scripts.
Model Weights Setup
Infinity
- Download from the Infinity repo:
  - infinity_2b_reg.pth
  - infinity_vae_d32_reg.pth
- Download Mask2Former:
  - mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco.pth
- Place the files at:
  - infinity/weights/infinity_2b_reg.pth
  - infinity/weights/infinity_vae_d32_reg.pth
  - infinity/weights/mask2former/mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco.pth
HART
- Download hart-0.7b-1024px → place into hart/weights/
- Download Mask2Former: mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco.pth → place into hart/weights/mask2former/
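Before launching an evaluation, it can help to confirm that the weights are where the scripts are expected to find them. The following is a minimal sketch, not part of the repository: the helper name `missing_weights` is hypothetical, and the entry `hart/weights/hart-0.7b-1024px` assumes the HART download keeps its original name inside hart/weights/.

```python
from pathlib import Path

# Paths relative to the repository root, taken from the setup lists above.
EXPECTED_WEIGHTS = {
    "infinity": [
        "infinity/weights/infinity_2b_reg.pth",
        "infinity/weights/infinity_vae_d32_reg.pth",
        "infinity/weights/mask2former/mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco.pth",
    ],
    "hart": [
        # Assumption: the download keeps the name "hart-0.7b-1024px".
        "hart/weights/hart-0.7b-1024px",
        "hart/weights/mask2former/mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco.pth",
    ],
}


def missing_weights(repo_root: str, model: str) -> list[str]:
    """Return the expected weight paths for `model` that do not exist yet."""
    root = Path(repo_root)
    return [p for p in EXPECTED_WEIGHTS[model] if not (root / p).exists()]


if __name__ == "__main__":
    # Run from the SparseVAR/ repository root.
    for model in EXPECTED_WEIGHTS:
        missing = missing_weights(".", model)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{model}: {status}")
```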
Prepare Datasets for Evaluation
GenEval
- Repo: https://github.com/djghosh13/geneval
- Copy prompts/ and object_names.txt into: evaluation/gen_eval/
DPG-Bench
- Repo: https://github.com/TencentQQGYLab/ELLA
- Copy dpg_bench/ into: cus_datasets/dpg_bench/
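The copy steps above can be scripted. The sketch below is one possible way to do it, assuming you have already cloned the GenEval and ELLA repositories somewhere locally. The helper `copy_into` is hypothetical, and the source paths in the usage comments are placeholders: where prompts/ and object_names.txt sit inside the GenEval checkout should be checked against that repository.

```python
import shutil
from pathlib import Path


def copy_into(src: str, dst_dir: str) -> Path:
    """Copy a file or directory `src` into `dst_dir`, keeping its name.

    Existing files at the destination are overwritten.
    """
    src_path, dst_root = Path(src), Path(dst_dir)
    dst_root.mkdir(parents=True, exist_ok=True)
    target = dst_root / src_path.name
    if src_path.is_dir():
        # dirs_exist_ok lets the copy be re-run without failing.
        shutil.copytree(src_path, target, dirs_exist_ok=True)
    else:
        shutil.copy2(src_path, target)
    return target


# Example usage, run from inside infinity/ or hart/ (placeholder source paths):
#   copy_into("path/to/geneval/prompts", "evaluation/gen_eval")
#   copy_into("path/to/geneval/object_names.txt", "evaluation/gen_eval")
#   copy_into("path/to/ELLA/dpg_bench", "cus_datasets")  # -> cus_datasets/dpg_bench/
```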
Running Evaluation
Infinity
cd infinity
bash scripts/eval_sparsevar.sh # SparseVAR acceleration
bash scripts/eval_baseline.sh # Baseline
HART
cd hart
bash scripts/eval_sparsevar.sh
bash scripts/eval_baseline.sh
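To get a rough end-to-end comparison of the two scripts, you could time each run and take the ratio. The sketch below is an assumption-laden convenience wrapper, not something shipped with the repository: `time_command` and `speedup` are hypothetical helpers, and wall-clock time for a whole script includes model loading and metric computation, so the ratio will not necessarily match the per-image speedup reported in the paper.

```python
import subprocess
import time
from typing import List, Optional


def time_command(cmd: List[str], cwd: Optional[str] = None) -> float:
    """Run `cmd` to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, cwd=cwd, check=True)  # raises if the script fails
    return time.perf_counter() - start


def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Ratio of baseline time to accelerated time (2.0 means twice as fast)."""
    return baseline_seconds / accelerated_seconds


# Example usage from the repository root (takes as long as both evaluations):
#   base = time_command(["bash", "scripts/eval_baseline.sh"], cwd="infinity")
#   fast = time_command(["bash", "scripts/eval_sparsevar.sh"], cwd="infinity")
#   print(f"end-to-end speedup: {speedup(base, fast):.2f}x")
```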
Citation
@inproceedings{chen2025sparsevar,
title = {Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis},
author = {Chen, Zhuokun and Fan, Jugang and Yu, Zhuowei and Zhuang, Bohan and Tan, Mingkui},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year = {2025}
}
Acknowledgements
This repository is built upon and inspired by excellent prior work, including Infinity and HART.
We thank the authors and maintainers of these repositories for open-sourcing their code and models, which made this work possible.
License
This repository is for academic research only. For Infinity and HART code/models, please follow their respective licenses.