Fine-tuning Models using LlamaFactory
March 18, 2026 ยท View on GitHub
Welcome, fellow researcher! We will guide you how to fine-tune your web agent on TimeWarp using LlamaFactory. We recommend using DeepSpeed ZeRO-3 for multi-GPU training.
Getting Started
A. Setting up LlamaFactory
- (Optional) Create a conda environment for training. This is optional but strongly recommended. If you don't have conda installed then please follow these instructions from here.
conda create -n timewarpTraining python=3.10
conda activate timewarpTraining
- Clone the official LlamaFactory repo from here and install the required dependencies. The commands are given here:
git clone --depth 1 https://github.com/hiyouga/LlamaFactory.git
cd LlamaFactory
pip install -e .
pip install -r requirements/metrics.txt
- (Optional) If you are training on multiple GPUs, install DeepSpeed:
pip install -r requirements/deepspeed.txt
B. Generating Training Data
TimeWarp models are trained on the trajectories of a teacher web agent. The generated trajectory is converted to the ShareGPT format for training.
- Collect teacher trajectories following the instructions in
collectTeacherTraj/README.MD. Optionally, use the teacher trajectories that we generated from the GPT-5 agent by downloading them fromhuggingface/sparklabutah/TimeWarp-GPT5-Traces.
git clone https://huggingface.co/datasets/sparklabutah/TimeWarp-GPT5-Traces
-
Run the
convert2sgptArgs.pyscript to generate your desired training data inShareGPTformat. Instructions on using this script have been provided inconvert2sgptUsage.md. We also provide the data for training on the action, thinking, memory, and planning tokens, with the AXT on the all TimeWarp tasks and versions using the GPT-5 model:timewarpTracesSingle.json. -
Once the training data is generated, place it in
LlamaFactory/data. -
Update the
dataset_info.jsonfile with the dataset metadata. We also provide an exampledataset_info.jsonfile.
C. Train your Agent!
Training a web agent is simple.
-
Update the
.yamlfiles inLlamaFactory/examples/train_fullandLlamaFactory/examples/train_lorafor full fine-tuning and LoRA fine-tuning, respectively. Example files are provided intrain_fullandtrain_lora. -
Train your agent by running the command (make sure you are in the LlamaFactory directory):
llamafactory-cli train examples/train_full/your_training_config.yaml
Great job, your agent is now being trained! You will most likely encounter tons of technical hurdles while following these steps. Setting the environment and training for the first time can be the hardest step. Feel free to reach out to the authors for any kinda difficulty!
Citation
Don't forget to cite LlamaFactory for providing their amazing repo.
@inproceedings{zheng2024llamafactory,
title={LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models},
author={Yaowei Zheng and Richong Zhang and Junhao Zhang and Yanhan Ye and Zheyan Luo and Zhangchi Feng and Yongqiang Ma},
booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations)},
address={Bangkok, Thailand},
publisher={Association for Computational Linguistics},
year={2024},
url={http://arxiv.org/abs/2403.13372}
}