LLMNodeBed

May 20, 2025 ยท View on GitHub

This repository is the official implementation for our ICML 2025 paper: When Do LLMs Help With Node Classification? A Comprehensive Analysis. It provides a standardized framework for evaluating LLM-based node classification methods, including 14 datasets, 8 LLM-based algorithms, and 3 learning paradigms.

Please consider citing or giving a ๐ŸŒŸ if our repository is helpful to your work!

@inproceedings{wu2025llmnodebed,
      title={When Do LLMs Help With Node Classification? A Comprehensive Analysis}, 
      author={Xixi Wu and Yifei Shen and Fangzhou Ge and Caihua Shan and Yizhu Jiao and Xiangguo Sun and Hong Cheng},
      year={2025},
      booktitle={International Conference on Machine Learning},
      organization={PMLR},
      url={https://arxiv.org/abs/2502.00829}, 
}

๐ŸŽ™๏ธ News

๐ŸŽ‰ [2025-05-01] Our paper is accepted to ICML 2025. The camera ready paper, integration of more baseline methods, and corresponding blogs will be released soon!

๐Ÿ“… [2025-02-04] The code for LLMNodebed, along with the project pages and paper, has now been released! ๐Ÿงจ


๐Ÿ“ Table of Contents

๐Ÿš€ Quick Start

0. Environment Setup

To get started, follow these steps to set up your Python environment:

conda create -n NodeBed python=3.10
conda activate NodeBed
pip install torch torch_geometric transformers peft pytz scikit-learn torch_scatter torch_sparse

Some packages might be missed for specific algorithms. Check the algorithm READMD or error logs to identify any missing dependencies and install them accordingly.

1. LLM Preparation

  • Close-source LLMs like GPT-4o, DeepSeek-Chat:

    Add API keys to LLMZeroShot/Direct/api_keys.py

  • Open-source LLMs like Mistral-7B, Qwen:

    Download models from HuggingFace (e.g., Mistral-7B). Then, update model paths in common/model_path.py as you actual saving paths.

    Example paths:

    MODEL_PATHs = {
      "MiniLM": "sentence-transformers/all-MiniLM-L6-v2",
      "Mistral-7B": "mistralai/Mistral-7B-Instruct-v0.2",
      "Llama-8B": "meta-llama/Llama-3.1-8B-Instruct",
      # See full list in common/model_path.py
    }
    

2. Datasets

Download datasets either from Google Drive or HuggingFace and unzip into the datasets folder.

Before running LLM-based algorithms, please generate LM / LLM-encoded embeddings as follows:

cd LLMEncoder/GNN

python3 embedding.py --dataset=cora --encoder_name=roberta      # LM embeddings
python3 embedding.py --dataset=cora --encoder_name=Mistral-7B  # LLM embeddings

3. (Optional) Deploy Local LLMs

For LLM Direct Inference using open-source LLMs, we depoly them as local services based on the FastChat framework.

# Install dependencies
pip install vllm "fschat[model_worker,webui]"

# Start services
python3 -m fastchat.serve.controller --host 127.0.0.1
CUDA_VISIBLE_DEVICES=0 python3 -m fastchat.serve.vllm_worker --model-path mistralai/Mistral-7B-Instruct-v0.2 --host 127.0.0.1
python3 -m fastchat.serve.openai_api_server --host 127.0.0.1 --port 8008

Then, the Mistral-7B model can be invoked via the url http://127.0.0.1:8008/v1/chat/completions.

4. Run Algorithms

Refer to method-specific READMEs for execution details:

๐Ÿ“– Code Structure

LLMNodeBed/
โ”œโ”€โ”€ LLMEncoder/           # LLM-as-Encoder (GNN, ENGINE)
โ”œโ”€โ”€ LLMPredictor/         # LLM-as-Predictor (GraphGPT, LLaGA, Instruction Tuning)
โ”œโ”€โ”€ LLMReasoner/          # LLM-as-Reasoner (TAPE)
โ”œโ”€โ”€ LLMZeroShot/          # Zero-shot Methods (Direct Inference, ZeroG)
โ”œโ”€โ”€ common/               # Shared utilities
โ”œโ”€โ”€ datasets/             # Dataset storage
โ”œโ”€โ”€ results/              # Experiment outputs
โ””โ”€โ”€ requirements.txt

๐Ÿ”ง Supported Methods

MethodVeneueOfficial ImplementationOur Implementation
TAPEICLR'24linkLLMReasoner/TAPE
ENGINEIJCAI'24linkLLMEncoder/ENGINE
GraphGPTSIGIR'24linkLLMPredictor/GraphGPT
LLaGAICML'24linkLLMPredictor/LLaGA
ZeroGKDD'24linkLLMZeroShot/ZeroG
GNNLLMEmb\text{GNN}_{\text{LLMEmb}}-Ours ProposedLLMEncoder/GNN
LLM Instruction Tuning-Ours ImplementedLLMPredictor/Instruction Tuning
Direct Inference-Ours ImplementedLLMZeroShot/Direct

๐Ÿ“ฎ Contact

If you have any further questions about usage, reproducibility, or would like to discuss, please feel free to open an issue or contact the authors via email at xxwu@se.cuhk.edu.hk.

๐Ÿ™ Acknowledgements

We thank the authors of TAPE, ENGINE, GraphGPT, LLaGA, and ZeroG for their open-source implementations. Part of our framework is inspired by GLBench.