Fine-tuning Models using LlamaFactory

March 18, 2026 ยท View on GitHub

Welcome, fellow researcher! We will guide you how to fine-tune your web agent on TimeWarp using LlamaFactory. We recommend using DeepSpeed ZeRO-3 for multi-GPU training.

Getting Started

A. Setting up LlamaFactory

  1. (Optional) Create a conda environment for training. This is optional but strongly recommended. If you don't have conda installed then please follow these instructions from here.
conda create -n timewarpTraining python=3.10
conda activate timewarpTraining
  1. Clone the official LlamaFactory repo from here and install the required dependencies. The commands are given here:
git clone --depth 1 https://github.com/hiyouga/LlamaFactory.git
cd LlamaFactory
pip install -e .
pip install -r requirements/metrics.txt
  1. (Optional) If you are training on multiple GPUs, install DeepSpeed:
pip install -r requirements/deepspeed.txt

B. Generating Training Data

TimeWarp models are trained on the trajectories of a teacher web agent. The generated trajectory is converted to the ShareGPT format for training.

  1. Collect teacher trajectories following the instructions in collectTeacherTraj/README.MD. Optionally, use the teacher trajectories that we generated from the GPT-5 agent by downloading them from huggingface/sparklabutah/TimeWarp-GPT5-Traces.
git clone https://huggingface.co/datasets/sparklabutah/TimeWarp-GPT5-Traces
  1. Run the convert2sgptArgs.py script to generate your desired training data in ShareGPT format. Instructions on using this script have been provided in convert2sgptUsage.md. We also provide the data for training on the action, thinking, memory, and planning tokens, with the AXT on the all TimeWarp tasks and versions using the GPT-5 model: timewarpTracesSingle.json.

  2. Once the training data is generated, place it in LlamaFactory/data.

  3. Update the dataset_info.json file with the dataset metadata. We also provide an example dataset_info.json file.

C. Train your Agent!

Training a web agent is simple.

  1. Update the .yaml files in LlamaFactory/examples/train_full and LlamaFactory/examples/train_lora for full fine-tuning and LoRA fine-tuning, respectively. Example files are provided in train_full and train_lora.

  2. Train your agent by running the command (make sure you are in the LlamaFactory directory):

llamafactory-cli train examples/train_full/your_training_config.yaml

Great job, your agent is now being trained! You will most likely encounter tons of technical hurdles while following these steps. Setting the environment and training for the first time can be the hardest step. Feel free to reach out to the authors for any kinda difficulty!

Citation

Don't forget to cite LlamaFactory for providing their amazing repo.

@inproceedings{zheng2024llamafactory,
  title={LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models},
  author={Yaowei Zheng and Richong Zhang and Junhao Zhang and Yanhan Ye and Zheyan Luo and Zhangchi Feng and Yongqiang Ma},
  booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)},
  address={Bangkok, Thailand},
  publisher={Association for Computational Linguistics},
  year={2024},
  url={http://arxiv.org/abs/2403.13372}
}