Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis

October 2, 2025

📌 ICCV 2025 | Official Code Release
This repository hosts the official implementation of our ICCV 2025 paper:
"Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis"
🔥 Up to 2× speedup for high-res image synthesis with minimal quality drop.


🔍 Abstract

Visual autoregressive modeling, based on the next-scale prediction paradigm, generates images by progressively refining resolution across multiple stages. However, the computational overhead in high-resolution stages remains a challenge due to the large number of tokens.
We introduce SparseVAR, a plug-and-play acceleration framework that dynamically excludes low-frequency tokens during inference with no extra training. On Infinity-2B, SparseVAR achieves up to 2× speedup with minimal quality degradation.


🖼️ Method

📄 Method Diagram:
[Figure: Method Overview]


💡 Highlights

  • ✅ No retraining required
  • ⚡ Dynamic skipping of low-frequency tokens
  • 🧩 Compatible with Infinity and HART
  • 🚀 Up to 2× faster high-resolution inference

📦 Installation

git clone https://github.com/Caesarhhh/SparseVAR.git
cd SparseVAR
pip install -r requirements.txt

📂 Main Repository Structure

SparseVAR/
├── infinity/                     # Infinity integration
│   ├── scripts/
│   │   ├── eval_sparsevar.sh
│   │   └── eval_baseline.sh
│   ├── weights/                  # place Infinity weights here
│   ├── evaluation/               # evaluation configs & data
│   └── cus_datasets/
│
├── hart/                         # HART integration
│   ├── scripts/
│   │   ├── eval_sparsevar.sh
│   │   └── eval_baseline.sh
│   ├── weights/                  # place HART weights here
│   ├── evaluation/               # evaluation configs & data
│   └── cus_datasets/
│
├── requirements.txt
└── assets/
    └── method_exit.png

⚠️ Usage: enter the infinity/ or hart/ folder and run the evaluation scripts from there.


🔑 Model Weights Setup

Infinity

  • Download from Infinity repo:
    • infinity_2b_reg.pth
    • infinity_vae_d32_reg.pth
  • Download Mask2Former:
    • mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco.pth
  • Place files into:
    • infinity/weights/infinity_2b_reg.pth
    • infinity/weights/infinity_vae_d32_reg.pth
    • infinity/weights/mask2former/mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco.pth
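Before running the evaluation scripts, it can help to confirm that the three Infinity files are where the list above says they should be. The following is a minimal sketch, not part of the repository: the function name check_infinity_weights is illustrative, and it assumes the argument (or the current directory) is the SparseVAR repository root.

```shell
# Sketch: report which of the expected Infinity weight files are present.
# The paths come from the list above; the function name is hypothetical.
# Usage (from the repository root): check_infinity_weights .
check_infinity_weights() {
  ciw_root="${1:-.}"   # repository root, defaults to the current directory
  ciw_missing=0
  for ciw_file in \
    "infinity/weights/infinity_2b_reg.pth" \
    "infinity/weights/infinity_vae_d32_reg.pth" \
    "infinity/weights/mask2former/mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco.pth"
  do
    if [ -f "$ciw_root/$ciw_file" ]; then
      echo "ok       $ciw_file"
    else
      echo "MISSING  $ciw_file"
      ciw_missing=$((ciw_missing + 1))
    fi
  done
  # Succeed only when every expected file was found.
  [ "$ciw_missing" -eq 0 ]
}
```

The function prints one line per file and returns a non-zero status if anything is missing, so it can be chained in front of an evaluation run with `&&`.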

HART

  • Download hart-0.7b-1024px
    → Place into hart/weights/
  • Download Mask2Former:
    • mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco.pth → Place into hart/weights/mask2former/

📊 Prepare Datasets for Evaluation

GenEval

DPG-Bench


▶️ Running Evaluation

Infinity

cd infinity
bash scripts/eval_sparsevar.sh   # SparseVAR acceleration
bash scripts/eval_baseline.sh    # Baseline

HART

cd hart
bash scripts/eval_sparsevar.sh
bash scripts/eval_baseline.sh
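
To get a rough sense of the difference between the two runs, the scripts can be wrapped in a small timer. This is a sketch only: time_run is a hypothetical helper, not something the repository provides, and it measures the wall-clock time of the whole script. That total may include steps such as model loading and metric computation, so the ratio is not necessarily the inference speedup reported in the paper.

```shell
# Sketch: run a command and print its wall-clock duration in whole seconds.
# time_run is a hypothetical helper; it is not part of the repository.
time_run() {
  tr_label="$1"
  shift
  tr_start="$(date +%s)"
  tr_status=0
  "$@" || tr_status=$?          # keep going even if the command fails
  tr_end="$(date +%s)"
  echo "$tr_label: $((tr_end - tr_start))s (exit status $tr_status)"
  return "$tr_status"
}

# Usage, from inside infinity/ or hart/:
#   time_run baseline  bash scripts/eval_baseline.sh
#   time_run sparsevar bash scripts/eval_sparsevar.sh
```

The helper returns the wrapped command's exit status, so a failed evaluation is still visible to the calling shell.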

📖 Citation

@inproceedings{chen2025sparsevar,
  title     = {Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis},
  author    = {Chen, Zhuokun and Fan, Jugang and Yu, Zhuowei and Zhuang, Bohan and Tan, Mingkui},
  booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year      = {2025}
}

🙏 Acknowledgements

This repository is built upon and inspired by excellent prior work, including Infinity and HART.

We thank the authors and maintainers of these repositories for open-sourcing their code and models, which made this work possible.


📝 License

This repository is for academic research only. For Infinity and HART code/models, please follow their respective licenses.