ComfyUI-CacheDiT ⚡

February 11, 2026 · View on GitHub

One-Click DiT Model Acceleration for ComfyUI


Quality Comparison (Z-Image-Base, 50 steps)

[Side-by-side comparison: w/o Cache-DiT Acceleration vs. w/ Cache-DiT Acceleration]

Guidance Video

[Video: ComfyUI-CacheDiT Tutorial]

Thanks to Benji for the excellent tutorial!


Overview

ComfyUI-CacheDiT brings a 1.3-2.0x speedup (typically 1.4-1.6x for image models) to DiT (Diffusion Transformer) models through intelligent caching, with zero configuration required.

Inspired by llm-scaler, a high-performance GenAI solution for text, image, and video generation on Intel XPU.

Tested & Verified Models

| Model | Steps | Speedup | Warmup | Skip_interval |
|---|---|---|---|---|
| Z-Image | 50 | 1.3x | 10 | 5 |
| Z-Image-Turbo | 9 | 1.5x | 3 | 2 |
| Qwen-Image-2512 | 50 | 1.4-1.6x | 5 | 3 |
| Flux.2 Klein 4B | 20 | 1.67x | 4 | 2 |
| Flux.2 Klein 9B | 20 | 1.67x | 4 | 2 |
| LTX-2 T2V | 20 | 2.0x | 6 | 4 |
| LTX-2 I2V | 20 | 2.0x | 6 | 4 |
| WAN2.2 14B T2V | 20 | 1.67x | 4 | 2 |
| WAN2.2 14B I2V | 20 | 1.67x | 4 | 2 |

Installation

Install Node

Clone the repository into your ComfyUI custom nodes directory:

cd ComfyUI/custom_nodes/
git clone https://github.com/Jasonzzt/ComfyUI-CacheDiT.git

Install Dependencies

Install the node's Python dependencies from the cloned repository:

cd ComfyUI-CacheDiT
pip install -r requirements.txt

Quick Start

Ultra-Simple Usage (3 Steps)

For Image Models (Z-Image, Qwen-Image, Flux.2 Klein):

  1. Load your model
  2. Connect to ⚡ CacheDiT Accelerator node
  3. Connect to KSampler - Done!
[Load Checkpoint] → [⚡ CacheDiT Accelerator] → [KSampler]

For Video Models (LTX-2, WAN2.2 14B):

LTX-2 Models:

[Load Checkpoint] → [⚡ LTX2 Cache Optimizer] → [Stage 1 KSampler]

WAN2.2 14B Models (High-Noise + Low-Noise MoE):

[High-Noise Model] → [⚡ Wan Cache Optimizer] → [KSampler]
[Low-Noise Model]  → [⚡ Wan Cache Optimizer] → [KSampler]

Each expert model gets its own optimizer node with independent cache.

Node Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| model | MODEL | - | Input model (required) |
| enable | Boolean | True | Enable/disable acceleration |
| model_type | Combo | Auto | Auto-detect or select preset |
| print_summary | Boolean | True | Show performance dashboard |

That's it! All technical parameters (threshold, fn_blocks, warmup, etc.) are automatically configured based on your model type.
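For intuition, the per-model presets behave like a lookup table keyed by model type. The sketch below is purely illustrative — PRESETS and resolve_preset are assumed names, not the node's actual internals — with warmup and skip_interval values taken from the verified-models table above:

# Hypothetical illustration of per-model presets; the real node's
# internals may differ. Values come from the verified-models table.
PRESETS = {
    "Z-Image":         {"warmup": 10, "skip_interval": 5},
    "Z-Image-Turbo":   {"warmup": 3,  "skip_interval": 2},
    "Qwen-Image-2512": {"warmup": 5,  "skip_interval": 3},
    "Flux.2-Klein":    {"warmup": 4,  "skip_interval": 2},
    "LTX-2":           {"warmup": 6,  "skip_interval": 4},
    "WAN2.2-14B":      {"warmup": 4,  "skip_interval": 2},
}

def resolve_preset(model_type: str) -> dict:
    # Fall back to a conservative default when auto-detection fails,
    # mirroring the "manual preset selection" advice in the FAQ.
    return PRESETS.get(model_type, {"warmup": 5, "skip_interval": 2})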

How It Works

Caching Logic:

# After the warmup phase (the first `warmup` steps; see the table above),
# periodically reuse the cached transformer output instead of recomputing it
if (current_step - warmup) % skip_interval == 0:
    # Cache hit: reuse the previously computed result
    result = cache
else:
    # Cache miss: run the full transformer forward pass
    result = transformer.forward(...)
    cache = result.detach()  # save to cache for later reuse
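Plugging the table's Warmup and Skip_interval values into this schedule gives a rough upper bound on the step-level speedup. The helper below is a back-of-the-envelope sketch (estimated_speedup is not part of the node); it assumes the transformer forward pass dominates per-step cost, so measured wall-clock speedups will differ:

def estimated_speedup(steps: int, warmup: int, skip_interval: int) -> float:
    # Count the steps whose transformer forward pass is skipped
    # under the caching schedule above.
    cached = sum(
        1 for step in range(warmup, steps)
        if (step - warmup) % skip_interval == 0
    )
    return steps / (steps - cached)

print(estimated_speedup(9, 3, 2))  # Z-Image-Turbo: 9 steps -> 1.5x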

Credits

Based on cache-dit by Vipshop's Machine Learning Platform Team.

Built for ComfyUI - the powerful and modular Stable Diffusion GUI.

FAQ

Note for LTX-2: This audio-visual transformer uses dual latent paths (video + audio). Use the dedicated ⚡ LTX2 Cache Optimizer node (not the standard CacheDiT node) for optimal temporal consistency and quality.

Note for WAN2.2 14B: This model uses a MoE (Mixture of Experts) architecture with High-Noise and Low-Noise models. Use the dedicated ⚡ Wan Cache Optimizer node (not the standard CacheDiT node) for best results.

Other DiT models should work with auto-detection, but may need manual preset selection.

Q: Does it support distilled low-step models?

A: Currently, only Z-Image-Turbo (9 steps) has been tested and verified. Other low-step distilled models require further validation.

For extremely low step counts (< 6 steps), warmup consumes most of the schedule and leaves little to cache, so sacrificing quality for a minimal speed gain is generally not worthwhile; see the worked example below.
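As a worked example, using the hypothetical estimated_speedup helper sketched in How It Works: with a 5-step schedule and Z-Image-Turbo-style settings (warmup 3, skip_interval 2), only steps 3 and 4 are even eligible for caching, and at most one of them hits the cache.

print(estimated_speedup(5, 3, 2))  # 5 steps, 4 computed -> only 1.25x at best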

Q: How can I disable the node without restarting ComfyUI?

A: Simply set enable=False in the node and run it once. This will cleanly remove the CacheDiT optimization from your model without requiring a restart.

Q: Why does the Performance Dashboard show a 0% cache hit rate?

A: This usually means one of the following:

  1. The model was not properly detected - try manual preset selection
  2. Too few inference steps (< 10) - warmup consumes most of them
  3. The cache never activated - check the logs for the "Lightweight cache enabled" message

Q: Does this affect image quality?

A: With the default settings, the quality impact is minimal; see the quality comparison at the top of this page.


Star ⭐ this repo if you find it useful!