Quick Start
July 31, 2025
Data preprocessing
Please follow the Dataset Access section of the README.md to prepare the data, and ensure that the structure of the ./data directory is as shown below:
GUI-Odyssey
├── data
│   ├── annotations
│   │   └── *.json
│   ├── screenshots
│   │   └── *.png
│   ├── splits
│   │   ├── app_split.json
│   │   ├── device_split.json
│   │   ├── random_split.json
│   │   └── task_split.json
│   └── format_converter.py
└── ...
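Before converting, it can help to confirm the layout matches the tree above. The following is a minimal sketch (not part of the repository) that reports any required path missing under a given root:

```python
# Sketch: verify the expected ./data layout before running the converter.
# The paths below mirror the directory tree shown above.
from pathlib import Path

REQUIRED = [
    "data/annotations",
    "data/screenshots",
    "data/splits/app_split.json",
    "data/splits/device_split.json",
    "data/splits/random_split.json",
    "data/splits/task_split.json",
    "data/format_converter.py",
]

def check_layout(root: str) -> list[str]:
    """Return the required paths that are missing under `root`."""
    return [p for p in REQUIRED if not (Path(root) / p).exists()]

if __name__ == "__main__":
    missing = check_layout(".")
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("Layout OK")
```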
Next, run the following command to generate chat-format data for training and testing. You can adjust the following parameters as needed:
--his_len specifies the length of historical information to include (default: 4).
--level sets the instruction granularity, with choices of 'high' or 'low' (default: 'high').
--type sets the annotation type, with choices of 'semantic' or 'standard' (default: 'standard').
cd data
python format_converter.py --his_len 4 --level high --type standard
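If you need data for several settings, you can generate every `--level` / `--type` combination in one pass. This sketch only builds the command lines from the flags documented above; whether you run them via `subprocess` or a shell loop is up to you:

```python
# Sketch: build one converter invocation per (level, type) combination,
# using the flag names documented above.
import itertools

def converter_commands(his_len: int = 4) -> list[str]:
    """Return the format_converter.py command for every level/type pair."""
    cmds = []
    for level, anno in itertools.product(["high", "low"], ["standard", "semantic"]):
        cmds.append(
            f"python format_converter.py --his_len {his_len} "
            f"--level {level} --type {anno}"
        )
    return cmds

for cmd in converter_commands():
    print(cmd)
```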
Build OdysseyAgent upon Qwen-VL-Chat
The OdysseyAgent is built upon Qwen-VL.
Before running, set up the environment and install the required packages:
cd src
pip install -r requirements.txt
Next, initialize OdysseyAgent using the weights from Qwen-VL-Chat:
python merge_weight.py
We also provide two variants of OdysseyAgent trained on Train-Random with semantic annotations: OdysseyAgent-random-high and OdysseyAgent-random-low, trained with high-level and low-level instructions, respectively.
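A minimal sketch of selecting and loading one of these variants with Hugging Face `transformers` (Qwen-VL-based checkpoints require `trust_remote_code=True`). The checkpoint names below assume you load by the variant name or a local path to the downloaded weights; substitute your actual paths:

```python
# Sketch: map instruction level -> variant name, then load it.
# Variant names follow the two checkpoints described above; replace them
# with local paths if you downloaded the weights manually.
VARIANTS = {
    "high": "OdysseyAgent-random-high",  # trained with high-level instructions
    "low": "OdysseyAgent-random-low",    # trained with low-level instructions
}

def variant_for(level: str) -> str:
    """Return the checkpoint name for a given instruction level."""
    if level not in VARIANTS:
        raise ValueError(f"level must be one of {sorted(VARIANTS)}")
    return VARIANTS[level]

def load_variant(level: str):
    """Load tokenizer and model; requires the checkpoint to be available."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    ckpt = variant_for(level)
    tokenizer = AutoTokenizer.from_pretrained(ckpt, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(ckpt, trust_remote_code=True)
    return tokenizer, model
```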
Fine-tuning
Specify the path to the OdysseyAgent and the chat-format training data generated in the Data preprocessing stage in the script/train.sh file. Then, run the following command:
cd src
bash script/train.sh
Evaluation
Specify the path to the checkpoint and the dataset split (one of low_app_split, low_device_split, low_random_split, low_task_split, high_app_split, high_device_split, high_random_split, high_task_split) in the script/eval.sh file. Then, run the following command:
cd src
bash script/eval.sh
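The eight split identifiers above are just the cross product of the two instruction levels and the four split dimensions. A small sketch that enumerates them, useful for scripting evaluation over all splits:

```python
# Sketch: enumerate the valid split identifiers accepted by script/eval.sh,
# i.e. {low, high} x {app, device, random, task}.
from itertools import product

SPLITS = [
    f"{level}_{dim}_split"
    for level, dim in product(["low", "high"], ["app", "device", "random", "task"])
]

for split in SPLITS:
    print(split)
```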