Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning

November 26, 2025

FakeVV dataset

Please note that the third-stage training sets, which include FakeTT and FakeSV, are not provided because of access restrictions.

FakeVV dataset test set

The complete test dataset is defined in:

  • data_config/fakevv_test_data.json

A version that also includes the source webpage URLs:

  • data_config/fakevv_test_data_with_urls.json
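As a minimal sketch of how these annotation files might be inspected, the snippet below loads a JSON file and reports its top-level shape. The record schema is not documented in this README, so the code assumes nothing about field names. `summarize_annotations` is a hypothetical helper and not part of the repository.

```python
import json
from pathlib import Path


def summarize_annotations(path):
    """Load a JSON annotation file and describe its top-level structure."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    count = len(data) if isinstance(data, (list, dict)) else 1
    summary = {"type": type(data).__name__, "count": count}
    # Report field names without assuming what they are.
    if isinstance(data, list) and data and isinstance(data[0], dict):
        summary["first_record_keys"] = sorted(data[0].keys())
    elif isinstance(data, dict):
        summary["top_level_keys"] = sorted(data.keys())[:10]
    return summary
```

For example, `summarize_annotations("data_config/fakevv_test_data_with_urls.json")` would show how many records the file holds and which keys the first record carries.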

The associated visual resources are hosted on Hugging Face:

FakeVV Three-Stage Training Dataset

The annotation files for the three-stage FakeVV training data are hosted at:

Requirements

Software Requirements

  • Python 3.9+
  • transformers>=4.49.0
  • flash-attn>=2.4.3
  • vllm>=0.7.3
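The minimum versions above can be checked before training starts. The following is a rough sketch using only the standard library. The helper names are hypothetical, and the version comparison is deliberately simplified: it looks only at the leading numeric components, so a pre-release such as `4.49.0.dev0` is treated like `4.49.0`.

```python
import sys
from importlib import metadata

# Minimum versions copied from the requirements list.
MINIMUMS = {"transformers": "4.49.0", "flash-attn": "2.4.3", "vllm": "0.7.3"}


def version_tuple(version):
    """Turn '4.49.0' or '2.4.3.post1' into a tuple of leading integers."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare two version strings after padding them to equal length."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, ok)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, meets_minimum(installed, minimum))
    return report


def python_ok():
    """Python 3.9 or newer is required."""
    return sys.version_info >= (3, 9)
```

Calling `check_requirements()` in the training environment returns one entry per package, so a missing or outdated dependency shows up before a long job is launched.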

Hardware Requirements

* estimated

| Method                | Bits | 1.5B   | 3B     | 7B     |
| --------------------- | ---- | ------ | ------ | ------ |
| GRPO Full Fine-Tuning | AMP  | 2*24GB | 4*40GB | 8*40GB |
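To make the estimates above easier to compare, here is a small sketch that turns each entry into a GPU count and a total memory figure. It assumes that a value such as `8*40GB` means 8 GPUs with 40 GB each, which is a reading of the notation and not something this README states. The helper names are hypothetical.

```python
# Estimated needs for GRPO full fine-tuning with AMP, copied from the table.
GPU_REQUIREMENTS = {"1.5B": "2*24GB", "3B": "4*40GB", "7B": "8*40GB"}


def parse_requirement(spec):
    """Split a spec such as '8*40GB' into (gpu_count, gb_per_gpu)."""
    count, memory = spec.split("*")
    return int(count), int(memory.upper().replace("GB", ""))


def total_memory_gb(model_size):
    """Total GPU memory in GB implied by the estimate for a model size."""
    count, per_gpu = parse_requirement(GPU_REQUIREMENTS[model_size])
    return count * per_gpu
```

Under that assumption, the 7B setting corresponds to 320 GB of aggregate GPU memory and the 1.5B setting to 48 GB.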

Installation

cd fact-r1
pip install -e .

GRPO Training

bash examples/qwen2_5_vl_7b_fact_r1_grpo.sh

Merge Checkpoint in Hugging Face Format

python3 scripts/model_merger.py --local_dir path_to_your_last_actor_checkpoint
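Once the checkpoint has been merged, it can in principle be loaded with transformers. The sketch below rests on several assumptions: that the model is a Qwen2.5-VL variant (suggested by the training script name), that the merged directory contains a `config.json`, and that processor files are saved alongside the weights. The location of the merged output is not stated here, so `model_dir` is a placeholder you must supply, and both function names are hypothetical.

```python
from pathlib import Path


def looks_like_hf_checkpoint(model_dir):
    """Return True if the directory contains a config.json file."""
    return (Path(model_dir) / "config.json").is_file()


def load_merged_model(model_dir):
    """Load a merged checkpoint and its processor (assumes Qwen2.5-VL)."""
    if not looks_like_hf_checkpoint(model_dir):
        raise FileNotFoundError(f"No config.json found in {model_dir}")
    # Imported lazily so the directory check works without transformers.
    from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

    model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
        model_dir, torch_dtype="auto"
    )
    processor = AutoProcessor.from_pretrained(model_dir)
    return model, processor
```

If the merged directory turns out not to include processor files, loading the processor from the base model is a reasonable fallback.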

Note

This project does not provide scripts for Long-CoT Instruction Tuning or Preference Alignment via DPO. If you need these stages, we recommend using LLaMA-Factory.

Thanks

We would like to thank the following repos for their great work: