# Forest-Chat: Adapting Vision-Language Models for Interactive Forest Change Analysis
Give us a :star: if you find the repo useful!
This is the official Python implementation of the paper ["Forest-Chat: Adapting vision-language agents for interactive forest change analysis"](https://www.sciencedirect.com/science/article/pii/S1574954126001470), published in Ecological Informatics.
## Table of Contents
- Preparation
- LEVIR-MCI-Trees dataset
- Forest-Change dataset
- JL1-CD-Trees dataset
- Training of the adapted multi-level change interpretation model
- GPT-4o Zero-Shot and Refinement Captioning
- Few-Shot Fine-Tuning
- AnyChange2 (SAM2-based)
- Hyperparameter Search (AnyChange and AnyChange2)
- Construction of Forest-Chat
- Citation
- Acknowledgement
- License
## Preparation

### Environment Installation
Step 1: Create a virtual environment named `Multi_change_env` and activate it:

```bash
conda create -n Multi_change_env python=3.11
conda activate Multi_change_env
```

Step 2: Download or clone the repository:

```bash
git clone https://github.com/JamesBrockUoB/ForestChat.git
cd ./ForestChat/Multi_change
```

Step 3: Install dependencies:

```bash
pip install -r requirements.txt
```

Step 4: Set up the `.env` file. Create a file in the project root folder called `.env` with the following variables:

- `OPENAI_API_KEY`: your OpenAI API key (https://platform.openai.com/api-keys)
- `SERPER_API_KEY`: your Google Search / Scholar API key (https://serpapi.com/)
- `WANDB_USERNAME`: your Weights & Biases username for run logging (https://wandb.ai/site/)
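For reference, a minimal `.env` might look like the following (the values shown are placeholders, not real keys):

```
OPENAI_API_KEY=sk-...
SERPER_API_KEY=...
WANDB_USERNAME=your-wandb-username
```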
Step 5: Download AnyChange.

Link: AnyChange

Place the downloaded model into `Multi_change/models_ckpt/` with the name unchanged (`sam_vit_h_4b8939.pth`).

Simplified overview of the AnyChange model:
## LEVIR-MCI-Trees dataset

- Download the LEVIR-MCI dataset: LEVIR-MCI.
- This dataset is an extension of the previously established LEVIR-CC dataset. It contains bi-temporal images as well as diverse change detection masks and descriptive sentences, providing a crucial data foundation for exploring multi-task learning for change detection and change captioning.
- IMPORTANT: Rename the folder to `LEVIR-MCI-Trees-dataset`.
- The data structure of LEVIR-MCI-Trees is organized as follows:

```
├─/DATA_PATH_ROOT/LEVIR-MCI-Trees-dataset/
├─LevirCCcaptions.json
├─images
├─train
│ ├─A
│ ├─B
│ ├─label
├─val
│ ├─A
│ ├─B
│ ├─label
├─test
│ ├─A
│ ├─B
│ ├─label
```

where folder `A` contains pre-phase images, folder `B` contains post-phase images, and folder `label` contains the change detection masks.
- Filter out examples that don't contain tree/forest related captions and extract text files for the descriptions of each image pair in LEVIR-MCI-Trees:

```python
python preprocess_data.py --dataset LEVIR-MCI-Trees-dataset --captions_json LevirCCcaptions.json
```

After running, you will find the generated files in `./data/LEVIR-MCI-Trees/`.
## Forest-Change dataset

- Data is available in the `Multi_change/data/Forest-Change` folder and can be prepared by running `python preprocess_data.py` in `Multi_change`.
- If you wish to download the original data and create your own captions, download the images from here.
- Name the downloaded folder `archive` and place it in the `Multi_change/data` folder.
- In the `dataset_utils_notebook.ipynb` file in the project root, run the first three code blocks to format the downloaded data as required. This should create the `Forest-Change-dataset` folder in the `/data` directory.
- From here, you can run the captioning app via `streamlit run captioning_app.py` in the `/Multi_change` directory. This allows you to provide a single human-annotated caption and, optionally, four rule-based captions per sample. Future work will allow any number of human captions to be provided.
- Once captioning is complete, the data can be pre-processed as needed by running `python preprocess_data.py` in `/Multi_change`.
Captioning app screenshot
Forest-Change dataset examples
## JL1-CD-Trees dataset

- Data is available in the `Multi_change/data/JL1-CD-Trees` folder.
- If you wish to download the original data, it is available at JL1-CD.
- The data structure of JL1-CD-Trees is organized as follows:

```
├─/DATA_PATH_ROOT/JL1-CD-Trees-dataset/
├─images
├─train
│ ├─A
│ ├─B
│ ├─label
├─val
│ ├─A
│ ├─B
│ ├─label
├─test
│ ├─A
│ ├─B
│ ├─label
```

- Note: JL1-CD-Trees supports change detection only; no captions are available. When running any script that requires captioning parameters, use the Forest-Change parameters as defaults. For example:

```bash
--list_path ./data/Forest-Change/ --token_folder ./data/Forest-Change/tokens/
```
## Training of the adapted multi-level change interpretation model
The overview of the MCI model as adapted to Forest-Chat:
<br>
<div align="center">
<img src="resource/mci_model_forestchat.png" width="800"/>
</div>
<br>
### Train
Make sure you performed the data preparation above. Then, start training as follows:
```python
python train.py --train_goal 2 --savepath ./models_ckpt/
```

This is configured to use the Forest-Change dataset by default; check the command-line arguments and hard-coded constants for parameters that need updating to use LEVIR-MCI-Trees, e.g.:

```bash
--data_folder ./data/LEVIR-MCI-Trees-dataset/images --list_path ./data/LEVIR-MCI-Trees/ --token_folder ./data/LEVIR-MCI-Trees/tokens/ --data_name LEVIR-MCI-Trees --num_classes 3
```

Note that when evaluating on LEVIR-MCI-Trees with `num_classes = 3`, segmentation scores are reported as 3-class IoU rather than binary IoU. If you want binary scores, you will need to convert the predictions manually by post-processing the output masks; the cell containing the `evaluate_folder` function in `dataset_utils_notebook.ipynb` performs this, and a minimal sketch is shown below.
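The conversion amounts to collapsing all change classes into a single change class. A minimal sketch, assuming predictions are saved as single-channel PNGs where 0 is no change and positive values are change classes:

```python
# Collapse a 3-class prediction mask to a binary change mask.
# Assumes a single-channel PNG where 0 = no change and 1/2 = change classes.
import numpy as np
from PIL import Image

mask = np.array(Image.open("CDmask.png"))
binary = (mask > 0).astype(np.uint8) * 255  # any change class -> 255 (change)
Image.fromarray(binary).save("CDmask_binary.png")
```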
### Evaluate

```python
python test.py --checkpoint {checkpoint_PATH}
```

We recommend training the model 5 times and averaging the scores. As with training, Forest-Change is the default dataset; pass the LEVIR-MCI-Trees arguments shown above (including `--num_classes 3`) to evaluate on that dataset.
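Aggregating the repeated runs is straightforward; for example (the scores below are placeholders, not reported results):

```python
# Mean and standard deviation of a metric over repeated runs (placeholder values).
from statistics import mean, stdev

bleu4_runs = [40.2, 39.8, 40.5, 40.0, 40.3]
print(f"BLEU-4: {mean(bleu4_runs):.2f} +/- {stdev(bleu4_runs):.2f}")
```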
### Inference

Run inference to get started as follows:

```python
python predict.py --imgA_path {imgA_path} --imgB_path {imgB_path} --mask_save_path ./CDmask.png
```

You can modify `--checkpoint` of `Change_Perception.define_args()` in `predict.py` to use your own model, or use our pretrained models `LEVIR-MCI-Trees_model.pth` and `Forest-Change_model.pth`, which are available on HuggingFace: Forest-Change and LEVIR-MCI-Trees.
Use `--dataset` to specify the dataset configuration (defaults to Forest-Change). Available options: Forest-Change, LEVIR-MCI-Trees, JL1-CD-Trees. For example:

```python
python predict.py \
    --imgA_path {imgA_path} \
    --imgB_path {imgB_path} \
    --mask_save_path ./CDmask.png \
    --dataset JL1-CD-Trees
```
Note: JL1-CD-Trees does not support GPT-4o caption refinement as no ground-truth captions are available. Zero-shot captioning via GPT-4o is supported for all datasets.
## GPT-4o Zero-Shot and Refinement Captioning

Requires an OpenAI API key set in your `.env` file as `OPENAI_API_KEY`.
Zero-shot captioning queries GPT-4o directly with bi-temporal image pairs to generate change captions without any fine-tuning:
```python
# Forest-Change (default)
python test_gpt4o_change_captioning.py \
    --result_path ./predict_results/gpt4o

# LEVIR-MCI-Trees
python test_gpt4o_change_captioning.py \
    --data_name LEVIR-MCI-Trees \
    --data_folder ./data/LEVIR-MCI-Trees-dataset/images \
    --list_path ./data/LEVIR-MCI-Trees/ \
    --token_folder ./data/LEVIR-MCI-Trees/tokens/ \
    --result_path ./predict_results/gpt4o
```
By default, dataset-specific prompts are used. To use a general prompt instead, pass `--use_general_prompt True`.
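Conceptually, zero-shot captioning sends both images to GPT-4o in a single multimodal request. A minimal standalone sketch using the OpenAI Python client (the prompt wording and file paths are illustrative, not the script's actual prompt):

```python
# Zero-shot change captioning with GPT-4o: encode both images as base64
# data URLs and request a change description in one API call.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def encode_image(path: str) -> str:
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()

def caption_change(img_a_path: str, img_b_path: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the forest change between these bi-temporal satellite images in one sentence."},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{encode_image(img_a_path)}"}},
                {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{encode_image(img_b_path)}"}},
            ],
        }],
    )
    return response.choices[0].message.content

print(caption_change("A/example.png", "B/example.png"))  # illustrative paths
```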
Refinement mode takes predictions from a trained model and uses GPT-4o to enrich them with spatial and contextual detail:
```python
python test_gpt4o_change_captioning.py \
    --predicted_captions ./predict_results/my_model/ \
    --result_path ./predict_results/gpt4o_refined
```
Evaluate only (re-score already saved results without re-querying the API):

```python
python test_gpt4o_change_captioning.py --eval_only True
```
Results are saved as `.jsonl` files and scores as `.json` files in `--result_path`. Metrics reported include BLEU-1 to BLEU-4, METEOR, ROUGE-L, CIDEr, and BERTScore F1.
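The `.jsonl` outputs can be inspected line by line outside the evaluation script; a minimal sketch (the file name is illustrative, and the per-record fields depend on what the script writes):

```python
import json

# Each line of a .jsonl file is one JSON record.
with open("./predict_results/gpt4o/results.jsonl") as f:
    for line in f:
        print(json.loads(line))
```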
## Few-Shot Fine-Tuning

`fewshot_train_test.py` trains a model on a percentage of the target dataset and evaluates it in a single script, outputting metrics to CSV. This is used for the cross-domain transfer experiments.
```python
python fewshot_train_test.py \
    --checkpoint ./models_ckpt/Forest-Change_model.pth \
    --data_pct 25 \
    --output_dir ./models_ckpt/few-shot-experiments/25pct \
    --dataname JL1-CD-Trees \
    --data_folder ./data/JL1-CD-Trees-dataset/images
```
Key arguments:

- `--data_pct`: percentage of training data to use. Supported values: `5`, `10`, `25`, `50`, `100`
- `--checkpoint`: path to the pretrained source checkpoint to fine-tune from
- `--train_script`: use `train.py` for the MCI model (default) or `train_benchmark.py` for BiFA, Change3D, U-Net SiamDiff
- `--benchmark`: the benchmark model to use with `train_benchmark.py` (e.g. `bifa`, `change3d`, `unet_siamdiff`)
- `--output_dir`: directory where the fine-tuned checkpoint, training log, and `metrics.csv` are saved
Output files in `--output_dir` (see the aggregation sketch after this list):

- `train.log`: full training log
- `checkpoint.pth`: best fine-tuned checkpoint
- `metrics.csv`: mIoU and per-class IoU on the test set
- `test_results/test.log`: full test log
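To compare transfer performance across data percentages, the per-run `metrics.csv` files can be collected into one table. A minimal pandas sketch, assuming the directory layout from the example above (the CSV columns are whatever the script wrote):

```python
# Gather metrics.csv from each few-shot run into a single DataFrame.
from pathlib import Path
import pandas as pd

rows = []
for pct in (5, 10, 25, 50, 100):
    csv_path = Path(f"./models_ckpt/few-shot-experiments/{pct}pct/metrics.csv")
    if csv_path.exists():
        df = pd.read_csv(csv_path)
        df["data_pct"] = pct  # tag each row with its training-data percentage
        rows.append(df)

summary = pd.concat(rows, ignore_index=True)
print(summary)
```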
## AnyChange2 (SAM2-based)

AnyChange2 is a SAM2-based zero-shot change detection model. It requires a different checkpoint and config file from AnyChange v1.

Step 1: Download the SAM2 checkpoint and config files from https://github.com/facebookresearch/sam2:

- `sam2.1_hiera_large.pt`

Place it in `Multi_change/models_ckpt/` and ensure the config file is at `Multi_change/configs/sam2.1/sam2.1_hiera_l.yaml`.
Run AnyChange2 inference:

```python
python test_anychange2.py \
    --data_folder ./data/Forest-Change-dataset/images \
    --anychange_network_path ./models_ckpt/sam2.1_hiera_large.pt \
    --sam2_config_file ./configs/sam2.1/sam2.1_hiera_l.yaml \
    --result_path ./predict_results
```
Key arguments:

- `--stability_score_thresh`: filters unstable mask proposals (default: `0.91`)
- `--change_conf_thresh`: filters low-confidence change masks (default: `155`)
- `--area_thresh`: minimum mask area fraction to retain (default: `0.9`)
- `--object_sim_thresh`: bi-temporal object similarity threshold (default: `50`)
## Hyperparameter Search (AnyChange and AnyChange2)

Bayesian hyperparameter search is available for both AnyChange versions using Weights & Biases sweeps. Requires `WANDB_USERNAME` set in your `.env` file.
AnyChange (SAM v1):

```python
python anychange_hyperparameter_search.py \
    --data_folder ./data/Forest-Change-dataset/images \
    --anychange_network_path ./models_ckpt/sam_vit_h_4b8939.pth \
    --run_count 20
```
AnyChange2 (SAM2):

```python
python anychange2_hyperparameter_search.py \
    --data_folder ./data/Forest-Change-dataset/images \
    --anychange_network_path ./models_ckpt/sam2.1_hiera_large.pt \
    --sam2_config_file ./configs/sam2.1/sam2.1_hiera_l.yaml \
    --run_count 20
```
Both scripts search over `points_per_side`, `change_confidence_threshold`, `stability_score_thresh`, `area_thresh`, and `object_sim_thresh`. To resume an existing sweep rather than creating a new one, pass `--sweep_id <your_sweep_id>`.
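The sweeps follow the standard W&B pattern sketched below; the parameter ranges, metric name, and project name are illustrative assumptions rather than the scripts' actual configuration:

```python
import wandb

# Bayesian sweep over the AnyChange thresholds (illustrative ranges).
sweep_config = {
    "method": "bayes",
    "metric": {"name": "mIoU", "goal": "maximize"},
    "parameters": {
        "points_per_side": {"values": [16, 32, 64]},
        "change_confidence_threshold": {"min": 100, "max": 200},
        "stability_score_thresh": {"min": 0.85, "max": 0.98},
        "area_thresh": {"min": 0.0, "max": 1.0},
        "object_sim_thresh": {"min": 30, "max": 80},
    },
}

sweep_id = wandb.sweep(sweep_config, project="forest-chat-anychange")  # project name is illustrative
# wandb.agent(sweep_id, function=run_and_score, count=20)  # run_and_score is a hypothetical callable
```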
## Construction of Forest-Chat

Agent installation:

```bash
cd ./ForestChat/lagent-main
pip install -e '.[all]'
# or
pip install -e .
```

Run the agent. First, cd into the `Multi_change` folder:

```bash
cd ./ForestChat/Multi_change
```

(1) Run the agent CLI demo:

```bash
# You need to install streamlit first
# pip install streamlit
python try_chat.py
```

(2) Run the agent web demo:

```bash
# You need to install streamlit first
# pip install streamlit
streamlit run web_demo.py
```
## Citation
If you find our work useful to your research, please consider citing:
```bibtex
@article{BROCK2026103741,
title = {Forest-Chat: Adapting vision-language agents for interactive forest change analysis},
journal = {Ecological Informatics},
volume = {95},
pages = {103741},
year = {2026},
issn = {1574-9541},
doi = {https://doi.org/10.1016/j.ecoinf.2026.103741},
url = {https://www.sciencedirect.com/science/article/pii/S1574954126001470},
author = {James Brock and Ce Zhang and Nantheera Anantrasirichai},
keywords = {Vision-Language models, Multi-task learning, Change interpretation, Zero-shot change detection and captioning, LLM agents},
abstract = {The increasing availability of high-resolution satellite imagery, together with advances in deep learning, creates new opportunities for forest monitoring workflows. Two central challenges in this domain are pixel-level change detection and semantic change interpretation, particularly for complex forest dynamics. While large language models (LLMs) are increasingly adopted for data exploration, their integration with vision-language models (VLMs) for remote sensing image change interpretation (RSICI) remains underexplored, especially beyond urban environments. This paper introduces Forest-Chat, an LLM-driven agent for forest change analysis, enabling natural language querying across multiple RSICI tasks, including change detection and captioning, object counting, deforestation characterisation, and change reasoning. Forest-Chat builds upon a multi-level change interpretation (MCI) vision-language backbone with LLM-based orchestration, incorporating zero-shot change detection via AnyChange and multimodal LLM-based zero-shot change captioning and refinement. To support adaptation and evaluation in forest environments, we introduce the Forest-Change dataset, comprising bi-temporal satellite imagery, pixel-level change masks, and semantic change captions generated through human annotation and rule-based methods. Forest-Chat achieves mIoU and BLEU-4 scores of 67.10% and 40.17% on Forest-Change, and 88.13% and 34.41% on LEVIR-MCI-Trees, a tree-focused subset of LEVIR-MCI. In a zero-shot capacity, it achieves 60.15% and 34.00% on Forest-Change, and 47.32% and 18.23% on LEVIR-MCI-Trees respectively. Further experiments demonstrate the value of caption refinement for injecting geographic domain knowledge into supervised captions, and the system’s limited label domain transfer onto JL1-CD-Trees. These findings demonstrate that interactive, LLM-driven systems can support accessible and interpretable forest change analysis. Datasets and code are publicly available https://github.com/JamesBrockUoB/ForestChat.}
}
```
## Acknowledgement

Thanks to the following repositories:

Change-Agent; AnyChange; lagent; JL1-CD; Hewarathna et al.; SAM2
## License

This repo is distributed under the MIT License. The code can be used for academic purposes only.