ProG-V2: A Reproducible Toolkit for Graph Prompt Learning

May 31, 2026

ProG-V2 is an engineering-focused extension of the original ProG benchmark for graph prompt learning. It keeps the standard pre-train → prompt-tune → evaluate workflow, while adding a modular prompt-strategy architecture, broader prompt coverage, centralized path/device/logging utilities, benchmark scripts, tests, and public merged result reports.

What's New in ProG-V2

  • 17 prompt strategies registered through a PromptStrategy registry.
  • 6 GNN backbones registered through prompt_graph.model.build_gnn.
  • Reproducible few-shot benchmark utilities for node- and graph-level tasks.
  • Centralized filesystem paths, device resolution, logging, and CLI/YAML config.
  • Tests for data loading, GNN factory construction, strategy registration, and prompt-task smoke runs.
  • Fixes for several benchmark-blocking edge cases in WebKB, MultiGprompt, RELIEF, and GraphMAE.

Architecture

[Figure: ProG pipeline]

Benchmark Results

We publish two complementary public GCN benchmark reports: a node- and graph-classification report, and an edge-task (link-prediction) report. Both follow the same {pretrain}+{prompt} matrix format so they can be read and merged with the same tooling.
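As a small illustration of the `{pretrain}+{prompt}` column convention, the sketch below builds and parses such labels. The helper names `make_column` and `split_column` are hypothetical and not part of the toolkit; the split uses the first `+`, which assumes pretrain names never contain that character (true for the names listed in this README).

```python
from typing import Tuple


def make_column(pretrain: str, prompt: str) -> str:
    """Build a matrix column label in the {pretrain}+{prompt} format."""
    return f"{pretrain}+{prompt}"


def split_column(label: str) -> Tuple[str, str]:
    """Split a column label at the first '+' into (pretrain, prompt)."""
    pretrain, sep, prompt = label.partition("+")
    if not sep or not pretrain or not prompt:
        raise ValueError(f"not a pretrain+prompt label: {label!r}")
    return pretrain, prompt


label = make_column("GraphCL", "GPF-plus")
print(label)                # GraphCL+GPF-plus
print(split_column(label))  # ('GraphCL', 'GPF-plus')
```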

Node & Graph Classification

The classification report lives under results/benchmark-gcn/ and contains 714 independent (dataset, shot, pretrain+prompt) combinations and 2142 metric values over Accuracy, Macro-F1, and AUROC.

Experiment parameters:

| Setting | Value |
| --- | --- |
| Backbone | GCN |
| GNN layers | 2 |
| Hidden dimension | 128 |
| Seed | 42 |
| Shots | 1-shot, 3-shot, 5-shot |
| Few-shot splits | 5 splits per shot setting (mean±std) |
| Downstream budget | 50 epochs with early stopping |
| Pretrain budget | 200 epochs for generated checkpoints |
| Metrics | Accuracy, Macro-F1, AUROC |
| Result format | {pretrain}+{prompt} columns |
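The mean±std aggregation over the 5 few-shot splits can be sketched as follows. This is an illustration only: the function name `summarize` is hypothetical, the scores are made-up placeholders rather than benchmark results, and the default of population standard deviation is an assumption, since the parameters above do not state which estimator the reports use.

```python
from statistics import mean, pstdev, stdev
from typing import Sequence


def summarize(scores: Sequence[float], sample: bool = False) -> str:
    """Format per-split scores as 'mean±std' with two decimals.

    sample=False uses the population standard deviation (an assumption);
    sample=True uses the sample (n-1) standard deviation instead.
    """
    if len(scores) < 2:
        raise ValueError("need at least two splits to compute a spread")
    spread = stdev(scores) if sample else pstdev(scores)
    return f"{mean(scores):.2f}±{spread:.2f}"


# Placeholder scores for five splits; NOT real benchmark numbers.
split_scores = [70.0, 72.0, 71.0, 69.0, 73.0]
print(summarize(split_scores))               # 71.00±1.41
print(summarize(split_scores, sample=True))  # 71.00±1.58
```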

Coverage:

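| Dataset | Task | 1-shot | 3-shot | 5-shot |
| --- | --- | --- | --- | --- |
| Cora | Node | 72 | 72 | 72 |
| Wisconsin | Node | 59 | 59 | 59 |
| MUTAG | Graph | 56 | 56 | 56 |
| PROTEINS | Graph | 51 | 51 | 51 |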
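The coverage counts can be cross-checked against the totals quoted for the classification report (714 combinations, 2142 metric values). The short sketch below only redoes that arithmetic from the numbers in this section; the variable names are illustrative.

```python
# Combinations per dataset at each shot setting, from the coverage table.
coverage = {"Cora": 72, "Wisconsin": 59, "MUTAG": 56, "PROTEINS": 51}
shots = (1, 3, 5)
metrics = ("Accuracy", "Macro-F1", "AUROC")

per_shot = sum(coverage.values())            # combinations at one shot setting
combinations = per_shot * len(shots)         # (dataset, shot, pretrain+prompt) cells
metric_values = combinations * len(metrics)  # one value per metric per cell

print(per_shot, combinations, metric_values)  # 238 714 2142
```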

Result files:

Link Prediction

The link-prediction report lives under results/link-prediction-gcn/ and contains 2912 (dataset, shot, pretrain+prompt) combinations over Accuracy, F1, AUROC, and AUPRC.

| Setting | Value |
| --- | --- |
| Backbone | GCN |
| Datasets | CiteSeer, Cora, IMDB-BINARY, MUTAG, PROTEINS, PTC_MR, PubMed, Wisconsin (8) |
| Shots | 0-shot, 1-shot, 3-shot, 5-shot |
| Pretrains | None, DGI, GraphMAE, Edgepred_GPPT, Edgepred_Gprompt, GraphCL, SimGRACE (7) |
| Prompts | 13 LinkTask-supported strategies |
| Combos per dataset | 91 (13 prompts × 7 pretrains) × 4 shots = 364 cells |
| Primary metrics | AUROC, AUPRC (Accuracy/F1 kept for matrix compatibility) |
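To make the size of the link-prediction grid concrete, the sketch below enumerates it with `itertools.product`. The dataset, shot, and pretrain lists come from the settings above; the 13 prompt names are not listed in this section, so `prompt_01` through `prompt_13` are placeholders, not real strategy names.

```python
from itertools import product

datasets = ["CiteSeer", "Cora", "IMDB-BINARY", "MUTAG",
            "PROTEINS", "PTC_MR", "PubMed", "Wisconsin"]
shots = [0, 1, 3, 5]
pretrains = ["None", "DGI", "GraphMAE", "Edgepred_GPPT",
             "Edgepred_Gprompt", "GraphCL", "SimGRACE"]
# Placeholder names standing in for the 13 LinkTask-supported prompts.
prompts = [f"prompt_{i:02d}" for i in range(1, 14)]

# One matrix column per pretrain+prompt pair.
columns = [f"{pre}+{pr}" for pre, pr in product(pretrains, prompts)]

cells_per_dataset = len(columns) * len(shots)
total_cells = cells_per_dataset * len(datasets)
views = len(datasets) * len(shots)  # one (dataset, shot) view each

print(len(columns), cells_per_dataset, total_cells, views)  # 91 364 2912 32
```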

Result files:

  • summary.csv: flat table, one row per experiment combination.
  • final_matrices.xlsx: 32 sheets, one per (dataset, shot) view (4 shots × 8 datasets).
  • README.md: detailed result documentation and metric definitions.

Both reports currently use GCN. Other backbones are available in the model registry but are not part of these public benchmark tables.

Installation

Use Python 3.9 or 3.11. Python 3.11 is recommended for local development.

conda create -n prog-v2 python=3.11 -y
conda activate prog-v2
pip install -e ".[dev]"
pre-commit install

If PyTorch Geometric extension wheels are not resolved automatically, install the matching wheels for your PyTorch/CUDA version from the official PyG wheel index:

python -m pip install torch_scatter torch_sparse -f https://data.pyg.org/whl/
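PyG publishes one wheel index page per PyTorch/CUDA build. As a rough sketch, the helper below composes the page URL using the `torch-<version>+<cuda>.html` pattern from the PyG installation instructions. The function name `pyg_wheel_index` and the example version strings are assumptions for illustration; check the PyG documentation for the builds that actually exist.

```python
def pyg_wheel_index(torch_version: str, cuda_tag: str = "cpu") -> str:
    """Compose the PyG wheel index URL for a PyTorch version and CUDA tag.

    torch_version may carry a local suffix such as '2.1.0+cu121';
    only the part before '+' is used.
    """
    base_version = torch_version.split("+", 1)[0]
    return f"https://data.pyg.org/whl/torch-{base_version}+{cuda_tag}.html"


# Example values only; substitute the versions installed in your environment.
url = pyg_wheel_index("2.1.0+cu121", "cu121")
print(f"python -m pip install torch_scatter torch_sparse -f {url}")
```

In an environment where PyTorch is installed, `torch.__version__` and `torch.version.cuda` provide the two inputs.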

Quick Start

Run a minimal downstream task:

python downstream_task.py \
  --downstream_task NodeTask \
  --dataset_name Cora \
  --gnn_type GCN \
  --prompt_type GPF \
  --shot_num 1 \
  --epochs 1 \
  --device cpu
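To repeat the same run over several shot settings, the command line can be assembled programmatically. The sketch below only builds and prints the commands, using the flags shown above; the helper name `downstream_cmd` is hypothetical, and it assumes `downstream_task.py` is in the current working directory.

```python
import shlex
from typing import List


def downstream_cmd(shot_num: int, dataset_name: str = "Cora",
                   prompt_type: str = "GPF") -> List[str]:
    """Build the argv list for one NodeTask run with the flags shown above."""
    return [
        "python", "downstream_task.py",
        "--downstream_task", "NodeTask",
        "--dataset_name", dataset_name,
        "--gnn_type", "GCN",
        "--prompt_type", prompt_type,
        "--shot_num", str(shot_num),
        "--epochs", "1",
        "--device", "cpu",
    ]


# One command per shot setting used in the public reports.
commands = [downstream_cmd(shot) for shot in (1, 3, 5)]
for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` to execute the run.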

Run a small benchmark cell and write an Excel matrix:

python scripts/bootstrap_excel_full.py --gnn_type GCN
python bench.py \
  --pretrain_task NodeTask \
  --dataset_name Cora \
  --prompt_type None \
  --gnn_type GCN \
  --shot_num 1 \
  --epochs 1 \
  --device cpu \
  --pre_train_model_path None \
  --num_iter 1

Run a LinkTask cell (link prediction, dot-product decoder):

python bench.py \
  --pretrain_task LinkTask \
  --dataset_name Cora \
  --prompt_type GPF \
  --gnn_type GCN \
  --shot_num 0 \
  --epochs 10 \
  --device cpu \
  --pre_train_model_path None \
  --num_iter 1

For a single-run LinkTask entry point, downstream_task.py also accepts --downstream_task LinkTask:

python downstream_task.py \
  --downstream_task LinkTask \
  --dataset_name Cora \
  --prompt_type None \
  --gnn_type GCN \
  --shot_num 0 \
  --epochs 2 \
  --device cpu \
  --pre_train_model_path None

Supported Components

Backbones

  • GCN
  • GAT
  • GIN
  • GraphSAGE
  • GCov
  • GraphTransformer

Pretraining Methods

  • DGI
  • GraphMAE
  • GraphCL
  • SimGRACE
  • Edgepred_GPPT
  • Edgepred_Gprompt
  • MultiGprompt

Prompt Strategies

None, GPF, GPF-plus, Gprompt, All-in-one, GPPT, Prodigy, GraphPrompter, EdgePrompt, EdgePromptplus, RELIEF, MultiGprompt, UniPrompt, SelfPro, ProNoG, PSP, and DAGPrompT.

Downstream Tasks

| Task | Class | Datasets | Notes |
| --- | --- | --- | --- |
| NodeTask | prompt_graph.tasker.NodeTask | NODE_TASKS (12) | Node classification; supports all 17 prompts. |
| GraphTask | prompt_graph.tasker.GraphTask | GRAPH_TASKS (11) | Graph classification; supports all 17 prompts. |
| LinkTask | prompt_graph.tasker.LinkTask | LINK_TASKS (16 curated) | Link prediction with a binary cross-entropy (BCE) loss and a dot-product decoder for most prompts. |
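For readers unfamiliar with the LinkTask objective, here is a minimal pure-Python sketch of a dot-product decoder scored with binary cross-entropy. It is not the toolkit's implementation, which presumably operates on PyTorch tensors; the function names and the toy embeddings are illustrative assumptions.

```python
import math
from typing import Sequence


def dot_product_logit(z_u: Sequence[float], z_v: Sequence[float]) -> float:
    """Score a candidate edge (u, v) as the dot product of node embeddings."""
    if len(z_u) != len(z_v):
        raise ValueError("embeddings must have the same dimension")
    return sum(a * b for a, b in zip(z_u, z_v))


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw logit, in its numerically stable form."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# Toy embeddings for illustration only.
z_u, z_v = [1.0, 0.5], [0.8, -0.2]
logit = dot_product_logit(z_u, z_v)
print(f"edge probability: {sigmoid(logit):.3f}")
print(f"loss if edge exists: {bce_with_logits(logit, 1.0):.3f}")
print(f"loss if edge is a negative sample: {bce_with_logits(logit, 0.0):.3f}")
```

The decoder has no trainable parameters of its own, so the loss gradients flow entirely into the encoder and prompt parameters that produce the embeddings.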

Scripts

The public sweep scripts are parameterized by --gnn_type:

bash scripts/pretrain_full_grid.sh --gnn_type GCN --fast
bash scripts/bench_full_grid.sh --gnn_type GCN --fast --datasets "Cora MUTAG"

Useful scripts:

| Script | Purpose |
| --- | --- |
| scripts/bootstrap_excel_full.py | Create empty Excel matrices for a selected backbone. |
| scripts/pretrain_full_grid.sh | Pretrain selected methods/datasets/backbone. |
| scripts/bench_full_grid.sh | Run the full project benchmark grid with filters. |
| scripts/merge_result_excels.py | Merge per-run Excel outputs into one report. |
| scripts/export_final_matrices.py | Export populated per-dataset matrices into summary.csv and final_matrices.xlsx. |

Development Checks

ruff check .
ruff format --check .
pytest tests/ -v

For contribution guidelines, see CONTRIBUTING.md.

Citation

If you find this project useful, please cite the original ProG/graph prompt work:

@article{zi2024prog,
  title={ProG: A Graph Prompt Learning Benchmark},
  author={Chenyi Zi and Haihong Zhao and Xiangguo Sun and Yiqing Lin and Hong Cheng and Jia Li},
  year={2024},
  journal={Advances in Neural Information Processing Systems}
}

@inproceedings{sun2023all,
  title={All in One: Multi-Task Prompting for Graph Neural Networks},
  author={Sun, Xiangguo and Cheng, Hong and Li, Jia and Liu, Bo and Guan, Jihong},
  booktitle={Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  year={2023}
}