Configuration Reference

June 24, 2025 · View on GitHub

The trainer uses structured Pydantic models for configuration, making it easy to customize training parameters. This guide covers all available configuration options and their usage.

📋 Overview

The main configuration class is LtxvTrainerConfig, which includes the following sub-configurations:

ModelConfig: Base model and training mode settings
LoraConfig: LoRA training parameters
ConditioningConfig: Video conditioning settings (reference videos, first frame conditioning)
OptimizationConfig: Learning rate, batch sizes, and scheduler settings
AccelerationConfig: Mixed precision and other optimization settings
DataConfig: Data loading parameters
ValidationConfig: Validation and inference settings
CheckpointsConfig: Checkpoint saving frequency and retention settings
HubConfig: Hugging Face Hub integration settings
FlowMatchingConfig: Timestep sampling parameters

📄 Example Configuration Files

Check out our example configurations in the configs directory. You can use these as templates for your training runs:

📄 LTXV 2B Full Model Fine-tuning Example
📄 LTXV 2B LoRA training Example
📄 LTXV 13B LoRA training Example
📄 LTXV 2B LoRA Fine-tuning Example (Low VRAM) - Optimized for GPUs with 24GB VRAM
📄 LTXV 13B IC-LoRA Training Example - Video-to-video transformation training

⚙️ Configuration Sections

ModelConfig

Controls the base model and training mode settings.

model:
  model_source: "LTXV_13B_097_DEV"  # Model version, HuggingFace repo, or local path
  training_mode: "lora"             # "lora" or "full"
  load_checkpoint: null             # Path to checkpoint file/directory to resume from

Key parameters:

model_source: Model to use - can be a model version (see model_loader.py), HuggingFace repo ID, or local path
training_mode: Training approach - either "lora" for LoRA training or "full" for full-rank model fine-tuning
load_checkpoint: Optional path to a checkpoint to resume the training from

LoraConfig

LoRA-specific fine-tuning parameters (only used when training_mode: "lora").

lora:
  rank: 64                       # LoRA rank (higher = more parameters, more flexibility)
  alpha: 64                      # LoRA alpha scaling factor
  dropout: 0.0                   # Dropout probability (0.0-1.0)
  target_modules:                # Modules to apply LoRA to
    - "to_k"
    - "to_q"
    - "to_v"
    - "to_out.0"

Key parameters:

rank: LoRA rank - higher values mean more trainable parameters and potentially more flexibility (typical range: 16-128)
alpha: Alpha scaling factor - usually set equal to rank
dropout: Dropout probability for regularization
target_modules: List of transformer modules (can include wildchar characters) to apply LoRA adapters to.

ConditioningConfig

Video conditioning settings for specialized training modes.

conditioning:
  mode: "none"                            # "none" or "reference_video"
  first_frame_conditioning_p: 0.1         # Probability of first-frame conditioning
  reference_latents_dir: "reference_latents"  # Directory for reference video latents

Key parameters:

mode: Conditioning type - "none" for standard training, "reference_video" for IC-LoRA
first_frame_conditioning_p: Probability of using first frame as conditioning (0.0-1.0)
reference_latents_dir: Directory name for reference video latents (IC-LoRA only)

OptimizationConfig

Training optimization parameters including learning rates, batch sizes, and schedulers.

optimization:
  learning_rate: 1e-4              # Learning rate
  steps: 3000                      # Total training steps
  batch_size: 2                    # Batch size per GPU
  gradient_accumulation_steps: 1   # Steps to accumulate gradients
  max_grad_norm: 1.0              # Gradient clipping threshold
  optimizer_type: "adamw"         # "adamw" or "adamw8bit"
  scheduler_type: "linear"        # Scheduler type
  scheduler_params: {}            # Additional scheduler parameters
  enable_gradient_checkpointing: false  # Memory optimization at cost of speed

Key parameters:

learning_rate: Learning rate for optimization (typical range: 1e-5 to 1e-3)
steps: Total number of training steps
batch_size: Batch size per GPU (reduce if running out of memory)
gradient_accumulation_steps: Accumulate gradients over multiple steps (increases effective batch size)
scheduler_type: Learning rate scheduler - "constant", "linear", "cosine", "cosine_with_restarts", "polynomial"
enable_gradient_checkpointing: Trade training speed for GPU memory savings (required for LTXV 13B)

AccelerationConfig

Hardware acceleration and compute optimization settings.

acceleration:
  mixed_precision_mode: "bf16"     # "no", "fp16", or "bf16"
  quantization: null               # Quantization options
  load_text_encoder_in_8bit: false  # Load text encoder in 8-bit
  compile_with_inductor: true      # Enable PyTorch compilation
  compilation_mode: "reduce-overhead"  # Compilation optimization mode

Key parameters:

mixed_precision_mode: Precision mode - "bf16" recommended for modern GPUs, "fp16" for older ones
quantization: Quantization precision for model weights. Options include null (no quantization), "int8-quanto", "int4-quanto", "int2-quanto", "fp8-quanto", and "fp8uz-quanto". Use quantization to reduce memory usage, especially for large models or limited hardware.
load_text_encoder_in_8bit: Load the text encoder in 8-bit to save GPU memory
compile_with_inductor: Enable torch.compile() compilation for speed improvements
compilation_mode: Compilation strategy - "default", "reduce-overhead", "max-autotune"

DataConfig

Data loading and processing configuration.

data:
  preprocessed_data_root: "path/to/preprocessed/data"  # Path to precomputed dataset directory
  num_dataloader_workers: 2                           # Background data loading workers

Key parameters:

preprocessed_data_root: Path to your preprocessed dataset (contains the .precomputed directory)
num_dataloader_workers: Number of parallel data loading processes (0 = synchronous loading)

ValidationConfig

Validation and inference settings for monitoring training progress.

validation:
  prompts:                        # Validation prompts
    - "A cat playing with a ball"
    - "A dog running in a field"
  negative_prompt: "worst quality, inconsistent motion, blurry, jittery, distorted"
  images: null                    # Optional list of image paths for image-to-video
  reference_videos: null          # Reference video paths (IC-LoRA only)
  video_dims: [704, 480, 161]     # Video dimensions [width, height, frames]
  seed: 42                        # Random seed for reproducibility
  inference_steps: 50             # Number of inference steps
  interval: 100                   # Steps between validation runs
  videos_per_prompt: 1            # Videos generated per prompt
  guidance_scale: 3.0             # CFG guidance strength

Key parameters:

prompts: List of text prompts for validation video generation
images: List of image paths for image-to-video validation (must match number of prompts)
interval: Steps between validation runs (set to null to disable)
inference_steps: Number of denoising steps for validation videos
video_dims: Output video dimensions [width, height, frames]
reference_videos: List of paths to reference videos. Required for IC-LoRA validation (must match number of prompts)

CheckpointsConfig

Model checkpointing configuration.

checkpoints:
  interval: null      # Steps between checkpoint saves (null = disabled)
  keep_last_n: 5      # Number of recent checkpoints to retain

Key parameters:

interval: Steps between intermediate checkpoint saves (set to null to disable checkpoint saving)
keep_last_n: Number of most recent checkpoints to keep (older ones are deleted)

HubConfig

Hugging Face Hub integration for automatic model uploads.

hub:
  push_to_hub: false                    # Enable Hub uploading
  hub_model_id: "username/model-name"   # Hub repository ID

Key parameters:

push_to_hub: Whether to automatically push trained models to Hugging Face Hub
hub_model_id: Repository ID in format "username/repository-name"

FlowMatchingConfig

Flow matching training configuration for timestep sampling.

flow_matching:
  timestep_sampling_mode: "shifted_logit_normal"  # Timestep sampling strategy
  timestep_sampling_params: {}                    # Additional sampling parameters

Key parameters:

timestep_sampling_mode: Sampling strategy - "uniform" or "shifted_logit_normal"
timestep_sampling_params: Additional parameters for the sampling strategy

🚀 Next Steps

Once you've configured your training parameters:

Set up your dataset using Dataset Preparation
Choose your training approach in Training Modes
Start training with the Training Guide