Spectrum
April 30, 2026 · View on GitHub
中文说明 | English
1~4.79× speedup for diffusion sampling — Unofficial ComfyUI implementation of Spectrum (CVPR 2026). Training-free, plug-and-play.
Overview
Spectrum is a training-free diffusion sampling acceleration technique. It treats the internal features of the denoiser as functions over time and approximates them with Chebyshev polynomials in the spectral domain, enabling prediction and skipping of redundant network forward passes. Unlike prior methods (e.g., local Taylor expansion), Spectrum's approximation error does not compound with skip distance, maintaining sample quality even at high speedup ratios.
Currently supported models:
| Model | Detection Type | Quality |
|---|---|---|
| Klein 9b | Flux-like | Excellent |
| Longcat Image | Flux-like | Excellent |
| FLUX.1 | Flux-like | Excellent |
| Qwen Image (T2I) | MMDiT | Good |
| Z Image Turbo | Lumina2 | Excellent |
| ErnieImage | Ernie | Normal |
| Wan2.2 | Wan | Modest speedup (dual sampling, fewer steps per round) |
| HunyuanVideo 1.5 | Hunyuan | Normal |
| Qwen Image Edit | MMDiT | Poor (60 layers with split modulation; not recommended) |
| LTX2.3 | LTX | Untested (hardware-limited) |
The node requires warmup_steps (default 3) to build an initial cache, then gradually accelerates. More total steps = more noticeable speedup. For lightweight models like Z Image Turbo or Klein, warmup_steps can be set to 1.
Examples
Text-to-Image (qwen image):

Image Editing (klein base 9b):

Speed Comparison
All tests on RTX 4090 with default parameters (w=0.5, M=4, window_size=2, flex_window=0.75).
| Klein 9b | Z Image Turbo | Qwen Image | ErnieImage | |
|---|---|---|---|---|
![]() | ![]() | ![]() | ![]() |
How It Works
1. Features as Functions of Time
View each feature channel at the output of the denoiser's last attention block as a scalar function evolving along the diffusion timeline.
2. Global Approximation with Chebyshev Polynomials
Approximate each channel using Chebyshev basis functions:
where is the -th Chebyshev polynomial ().
Why Chebyshev? Its approximation error bound depends only on the degree , not on the forecast horizon (Theorem 3.3). In contrast, local Taylor expansion error grows polynomially with step size, causing quality collapse at large skips.
3. Online Ridge Regression
At each actual forward pass, collect block output features and corresponding times . Fit coefficients via ridge regression:
Solved as (negligible cost since is small).
4. Blended Prediction
Final features are a convex combination:
- Taylor term: discrete forward-difference extrapolation from nearest cached points — captures high-frequency details
- Chebyshev term: global spectral fit over all cached points — captures long-range trends
- controls the blend: larger skips favor Chebyshev, smaller skips favor Taylor
Analogy: Taylor prediction is like judging a car's next position by its taillight distance — accurate up close, wildly wrong at range. Chebyshev prediction is like reading the car's driving rhythm — you can predict 5 steps ahead almost as well as 1.
Parameters
w — Chebyshev / Taylor Blend Weight
- Formula:
- Range: 0.0 ~ 1.0, Default 0.5, Recommended 0.3 ~ 0.8
- w=0: pure Taylor (good for short skips); w=1: pure Chebyshev (stable for long skips)
- Dynamically adjusted: larger windows → higher w, capped at
max_w
M — Chebyshev Polynomial Degree
- Formula:
- Range: 1 ~ 10, Default 4, Recommended 3 ~ 6
- M=2 too coarse; M=4 sweet spot; M=6+ diminishing returns
lam (λ) — Ridge Regularization Strength
- Formula:
- Range: 0.001 ~ 10.0, Default 0.1, Recommended 0.01 ~ 1.0
- Too small → numerical instability; too large → underfitting. 0.1 is the paper's optimal value.
warmup_steps — Warmup Steps
- Range: 0 ~ 20, Default 3, Recommended 2 ~ 5
- First N steps always run full precision to build initial cache
- Set to 1 for lightweight models (Klein, Z Image Turbo)
- Set to total steps to disable acceleration entirely
window_size — Initial Skip Interval
- Formula: (paper's initial window size)
- Range: 1.0 ~ 16.0, Default 2.0, Recommended 1.5 ~ 4.0
- 1 = no skip; 2 = every other step; higher = more aggressive initially
flex_window (α) — Window Growth Rate
- Formula: (paper's adaptive scheduling slope)
- Range: 0.0 ~ 4.0, Default 0.75, Recommended 0.3 ~ 2.0
- Interval sequence:
window, window+α, window+2α, window+3α, ... - α=0: fixed schedule; α=0.75: gradual; α=3.0: aggressive
- Why grow? Early steps determine layout (error-sensitive), later steps refine details (error-tolerant)
- Step size 0.01 for precise tuning
Analogy: flex_window is your throttle. α=0 is cruise control, α=0.75 is gradual acceleration, α=3.0 is pedal-to-the-metal. "Slow first, fast later" is optimal.
max_w — Maximum Chebyshev Weight
- Range: 0.0 ~ 1.0, Default 0.8, Recommended 0.6 ~ 0.9
- Upper bound for dynamic w. Raise to 0.9 for extreme speedups; otherwise leave at 0.8.
verbose — Debug Logging
- Prints per-step FWD/SKIP decisions and window sizes for parameter tuning.
Recommended Presets
| Scenario | Parameters | Expected Speedup |
|---|---|---|
| Conservative (quality-first) | w=0.3, M=4, warmup=4, window=2, flex=0.3, max_w=0.6 | ≈2× |
| Balanced (default) | w=0.5, M=4, warmup=3, window=2, flex=0.75, max_w=0.8 | ≈3× |
| Aggressive (speed-first) | w=0.7, M=6, warmup=2, window=2, flex=2.0, max_w=0.9 | ≈4–5× |
| Image Editing | w=0.5, M=4, warmup=4, window=2, flex=0.5, max_w=0.8 | ≈2× |
Image Editing Notes
- Klein / Longcat Edit: Single blocks apply uniform modulation, smoothing main/ref token differences. Acceleration quality matches T2I.
- Qwen Image Edit: All 60 layers use split timestep_zero modulation. Main token step-to-step variation is 3× that of T2I, causing severe quality degradation. Use conservative parameters or disable acceleration.
Credits
This node was developed with assistance from Claude Code and DeepSeek. Licensed under the MIT License, same as the original project. Feel free to use and contribute.
Citation
@article{han2026adaptive,
title={Adaptive Spectral Feature Forecasting for Diffusion Sampling Acceleration},
author={Han, Jiaqi and Shi, Juntong and Li, Puheng and Ye, Haotian and Guo, Qiushan and Ermon, Stefano},
journal={arXiv preprint arXiv:2603.01623},
year={2026}
}



