Configuration Reference

June 24, 2025 ยท View on GitHub

The trainer uses structured Pydantic models for configuration, making it easy to customize training parameters. This guide covers all available configuration options and their usage.

๐Ÿ“‹ Overview

The main configuration class is LtxvTrainerConfig, which includes the following sub-configurations:

  • ModelConfig: Base model and training mode settings
  • LoraConfig: LoRA training parameters
  • ConditioningConfig: Video conditioning settings (reference videos, first frame conditioning)
  • OptimizationConfig: Learning rate, batch sizes, and scheduler settings
  • AccelerationConfig: Mixed precision and other optimization settings
  • DataConfig: Data loading parameters
  • ValidationConfig: Validation and inference settings
  • CheckpointsConfig: Checkpoint saving frequency and retention settings
  • HubConfig: Hugging Face Hub integration settings
  • FlowMatchingConfig: Timestep sampling parameters

๐Ÿ“„ Example Configuration Files

Check out our example configurations in the configs directory. You can use these as templates for your training runs:

โš™๏ธ Configuration Sections

ModelConfig

Controls the base model and training mode settings.

model:
  model_source: "LTXV_13B_097_DEV"  # Model version, HuggingFace repo, or local path
  training_mode: "lora"             # "lora" or "full"
  load_checkpoint: null             # Path to checkpoint file/directory to resume from

Key parameters:

  • model_source: Model to use - can be a model version (see model_loader.py), HuggingFace repo ID, or local path
  • training_mode: Training approach - either "lora" for LoRA training or "full" for full-rank model fine-tuning
  • load_checkpoint: Optional path to a checkpoint to resume the training from

LoraConfig

LoRA-specific fine-tuning parameters (only used when training_mode: "lora").

lora:
  rank: 64                       # LoRA rank (higher = more parameters, more flexibility)
  alpha: 64                      # LoRA alpha scaling factor
  dropout: 0.0                   # Dropout probability (0.0-1.0)
  target_modules:                # Modules to apply LoRA to
    - "to_k"
    - "to_q"
    - "to_v"
    - "to_out.0"

Key parameters:

  • rank: LoRA rank - higher values mean more trainable parameters and potentially more flexibility (typical range: 16-128)
  • alpha: Alpha scaling factor - usually set equal to rank
  • dropout: Dropout probability for regularization
  • target_modules: List of transformer modules (can include wildchar characters) to apply LoRA adapters to.

ConditioningConfig

Video conditioning settings for specialized training modes.

conditioning:
  mode: "none"                            # "none" or "reference_video"
  first_frame_conditioning_p: 0.1         # Probability of first-frame conditioning
  reference_latents_dir: "reference_latents"  # Directory for reference video latents

Key parameters:

  • mode: Conditioning type - "none" for standard training, "reference_video" for IC-LoRA
  • first_frame_conditioning_p: Probability of using first frame as conditioning (0.0-1.0)
  • reference_latents_dir: Directory name for reference video latents (IC-LoRA only)

OptimizationConfig

Training optimization parameters including learning rates, batch sizes, and schedulers.

optimization:
  learning_rate: 1e-4              # Learning rate
  steps: 3000                      # Total training steps
  batch_size: 2                    # Batch size per GPU
  gradient_accumulation_steps: 1   # Steps to accumulate gradients
  max_grad_norm: 1.0              # Gradient clipping threshold
  optimizer_type: "adamw"         # "adamw" or "adamw8bit"
  scheduler_type: "linear"        # Scheduler type
  scheduler_params: {}            # Additional scheduler parameters
  enable_gradient_checkpointing: false  # Memory optimization at cost of speed

Key parameters:

  • learning_rate: Learning rate for optimization (typical range: 1e-5 to 1e-3)
  • steps: Total number of training steps
  • batch_size: Batch size per GPU (reduce if running out of memory)
  • gradient_accumulation_steps: Accumulate gradients over multiple steps (increases effective batch size)
  • scheduler_type: Learning rate scheduler - "constant", "linear", "cosine", "cosine_with_restarts", "polynomial"
  • enable_gradient_checkpointing: Trade training speed for GPU memory savings (required for LTXV 13B)

AccelerationConfig

Hardware acceleration and compute optimization settings.

acceleration:
  mixed_precision_mode: "bf16"     # "no", "fp16", or "bf16"
  quantization: null               # Quantization options
  load_text_encoder_in_8bit: false  # Load text encoder in 8-bit
  compile_with_inductor: true      # Enable PyTorch compilation
  compilation_mode: "reduce-overhead"  # Compilation optimization mode

Key parameters:

  • mixed_precision_mode: Precision mode - "bf16" recommended for modern GPUs, "fp16" for older ones
  • quantization: Quantization precision for model weights. Options include null (no quantization), "int8-quanto", "int4-quanto", "int2-quanto", "fp8-quanto", and "fp8uz-quanto". Use quantization to reduce memory usage, especially for large models or limited hardware.
  • load_text_encoder_in_8bit: Load the text encoder in 8-bit to save GPU memory
  • compile_with_inductor: Enable torch.compile() compilation for speed improvements
  • compilation_mode: Compilation strategy - "default", "reduce-overhead", "max-autotune"

DataConfig

Data loading and processing configuration.

data:
  preprocessed_data_root: "path/to/preprocessed/data"  # Path to precomputed dataset directory
  num_dataloader_workers: 2                           # Background data loading workers

Key parameters:

  • preprocessed_data_root: Path to your preprocessed dataset (contains the .precomputed directory)
  • num_dataloader_workers: Number of parallel data loading processes (0 = synchronous loading)

ValidationConfig

Validation and inference settings for monitoring training progress.

validation:
  prompts:                        # Validation prompts
    - "A cat playing with a ball"
    - "A dog running in a field"
  negative_prompt: "worst quality, inconsistent motion, blurry, jittery, distorted"
  images: null                    # Optional list of image paths for image-to-video
  reference_videos: null          # Reference video paths (IC-LoRA only)
  video_dims: [704, 480, 161]     # Video dimensions [width, height, frames]
  seed: 42                        # Random seed for reproducibility
  inference_steps: 50             # Number of inference steps
  interval: 100                   # Steps between validation runs
  videos_per_prompt: 1            # Videos generated per prompt
  guidance_scale: 3.0             # CFG guidance strength

Key parameters:

  • prompts: List of text prompts for validation video generation
  • images: List of image paths for image-to-video validation (must match number of prompts)
  • interval: Steps between validation runs (set to null to disable)
  • inference_steps: Number of denoising steps for validation videos
  • video_dims: Output video dimensions [width, height, frames]
  • reference_videos: List of paths to reference videos. Required for IC-LoRA validation (must match number of prompts)

CheckpointsConfig

Model checkpointing configuration.

checkpoints:
  interval: null      # Steps between checkpoint saves (null = disabled)
  keep_last_n: 5      # Number of recent checkpoints to retain

Key parameters:

  • interval: Steps between intermediate checkpoint saves (set to null to disable checkpoint saving)
  • keep_last_n: Number of most recent checkpoints to keep (older ones are deleted)

HubConfig

Hugging Face Hub integration for automatic model uploads.

hub:
  push_to_hub: false                    # Enable Hub uploading
  hub_model_id: "username/model-name"   # Hub repository ID

Key parameters:

  • push_to_hub: Whether to automatically push trained models to Hugging Face Hub
  • hub_model_id: Repository ID in format "username/repository-name"

FlowMatchingConfig

Flow matching training configuration for timestep sampling.

flow_matching:
  timestep_sampling_mode: "shifted_logit_normal"  # Timestep sampling strategy
  timestep_sampling_params: {}                    # Additional sampling parameters

Key parameters:

  • timestep_sampling_mode: Sampling strategy - "uniform" or "shifted_logit_normal"
  • timestep_sampling_params: Additional parameters for the sampling strategy

๐Ÿš€ Next Steps

Once you've configured your training parameters: