README.md

December 30, 2025

STH-SepNet: Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs

Welcome to STH-SepNet's GitHub repository! This repository hosts the code, data, and model weights of STH-SepNet (KDD'25 Research Track).

Abstract: Spatio-temporal prediction is a pivotal task with broad applications in traffic management, climate monitoring, and energy scheduling. However, existing methodologies often struggle to balance model expressiveness and computational efficiency, especially when scaling to large real-world datasets. To tackle these challenges, we propose STH-SepNet (Spatio-Temporal Hypergraph Separation Network), a novel framework that decouples temporal and spatial modeling to enhance both efficiency and precision. Therein, the temporal dimension is modeled using lightweight large language models, which effectively capture low-rank temporal dynamics. Concurrently, the spatial dimension is addressed through an adaptive hypergraph neural network, which dynamically constructs hyperedges to model intricate, higher-order interactions. A carefully designed gating mechanism is integrated to seamlessly fuse temporal and spatial representations. By leveraging the fundamental principles of low-rank temporal dynamics and spatial interactions, STH-SepNet offers a pragmatic and scalable solution for spatio-temporal prediction in real-world applications. Extensive experiments on large-scale real-world datasets across multiple benchmarks demonstrate the effectiveness of STH-SepNet in improving predictive performance while maintaining computational efficiency. This work may provide a promising lightweight framework for spatio-temporal prediction, aiming to reduce computational demands while enhancing predictive performance.

[Paper Page] [Chinese Commentary]

Citation

If you find this repository helpful for your research, please cite our paper.

@inproceedings{chen2025decoupling,
  title={Decoupling Spatio-Temporal Prediction: When Lightweight Large Models Meet Adaptive Hypergraphs},
  author={Chen, Jiawen and Shao, Qi and Chen, Duxin and Yu, Wenwu},
  booktitle={Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD)},
  year={2025},
  month={August 3rd-7th},
  address={Toronto, Canada},
  publisher={ACM}
}

1. Preparation

1.1 Environment

Lightweight training requires torch 2.0+. To install all dependencies and update the corresponding libraries, run:

pip install -r requirements.txt

1.2 Data

Download the data from (Google Drive), create the directory dataset/, and place the datasets in dataset/.

1.3 Large Language Models

The pretrained models can be downloaded from the links in the table below. Create the directory huggingface/ and place the pretrained models in it, for example huggingface/BERT.

| Model 🤗 | Parameters | LLM Dimension | Model Description |
| --- | --- | --- | --- |
| BERT | 110M | 768 | A Transformer-based pre-trained model for NLP tasks, excelling in sentence classification and question answering. |
| GPT-2 | 124M | 768 | A Transformer-based generative model, specialized in text generation and language modeling. |
| GPT-3 | 7580M | 4096 | A large-scale Transformer-based generative model supporting various language tasks. |
| LLAMA-1B | 1230M | 2048 | A multilingual model developed by Meta, designed for dialogue and knowledge retrieval tasks. |
| LLAMA-7B | 6740M | 4096 | A multilingual model developed by Meta, suitable for various natural language generation tasks. |
| LLAMA-8B | 8000M | 4096 | A multilingual model developed by Meta, focused on dialogue and instruction-tuning tasks. |
| DeepSeek-Qwen1.5B | 1500M | 1536 | A reasoning-focused model enhanced through reinforcement learning for improved reasoning capabilities. |
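
The directory layout and the dimensions in the table can be captured in a small lookup, sketched below under stated assumptions. The keys reuse the llm_model option names from Section 2.1.3. Only huggingface/BERT is named in this README, so the other folder names, and the helper `local_model_dir` itself, are hypothetical.

```python
from pathlib import Path

# LLM hidden dimensions, copied from the table above.
# Keys follow the llm_model option names listed in Section 2.1.3.
LLM_DIMS = {
    "BERT": 768,
    "GPT2": 768,
    "GPT3": 4096,
    "LLAMA1b": 2048,
    "LLAMA7b": 4096,
    "LLAMA8b": 4096,
    "deepseek2b": 1536,
}


def local_model_dir(llm_model: str, root: str = "huggingface") -> Path:
    """Return the expected local directory for a pretrained model.

    Hypothetical helper: only huggingface/BERT is named in this README,
    so the other folder names are assumed to match the option names.
    """
    if llm_model not in LLM_DIMS:
        raise ValueError(f"unknown llm_model: {llm_model!r}")
    return Path(root) / llm_model


if __name__ == "__main__":
    # Report which model folders are present before starting a run.
    for name, dim in LLM_DIMS.items():
        path = local_model_dir(name)
        status = "found" if path.is_dir() else "missing"
        print(f"{name:<11} dim={dim:<5} {path} ({status})")
```

Running the block from the repository root lists each model with its dimension and whether its folder exists, which is a quick way to catch a misplaced download.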

2. Main Results

2.1 Training Preparation

2.1.1 Download datasets and place them under ./dataset.

2.1.2 Download pretrained models and place them under ./huggingface.

2.1.3 Complete list of parameters

| Parameter | Type | Description | Default Value |
| --- | --- | --- | --- |
| model | string | Name of the model: pool (STH-SepNet model with adaptive hypergraph module); Autoformer (Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting, NeurIPS 2021); TIMELLM (Time Series Forecasting by Reprogramming Large Language Models, ICLR 2024) | pool |
| dataset | string | Name of the dataset: inflow (bike traffic inflow); outflow (bike traffic outflow); PEMS03 (California highway network PeMS traffic flow dataset); BJ (traffic dataset of the road network in some areas of Beijing); METR (traffic sensor data in the Los Angeles area). You can also specify any additional graph dataset, in edgelist format, by editing data_loader.py | inflow |
| node_num | int | Number of nodes in the network: inflow, outflow: 295; PEMS03: 358; BJ: 500; METR: 207 | 295 |
| features | string | Forecasting task, options [M, S, MS]: M (multivariate predict multivariate); S (univariate predict univariate); MS (multivariate predict univariate) | M |
| llm_model | string | LLM model: BERT, GPT2, GPT3, LLAMA1b, LLAMA7b, LLAMA8b, deepseek2b | BERT |
| static | bool | Whether to use the static adjacency matrix module | False |
| gcn_true | bool | Whether to use the GCN module | True |
| adaptive_hyperhgnn | string | Hypergraph neural network: hgcn, hgat, hsage | 'hgcn' |
| hgcn_true | bool | Whether to use the HGCN module | True |
| temporal_true | bool | Whether to use the temporal convolutional network module | True |
| fusion_gate | string | Style of module fusion: adaptive (dynamically adjusts the weight of temporal and spatial features); attentiongate (considers the internal relationship between the two features); lstmgate (captures the dependence of space on temporal features); hyperstgnn (fully integrated adaptive hypergraph spatio-temporal prediction, without LLMs) | adaptive |
| llm_dim | int | LLM model dimension: BERT, GPT2: 768; LLAMA7b, LLAMA8b, GPT3: 4096; LLAMA1b: 2048; deepseek2b: 1536 | 768 |
| seq_len | int | Input sequence length | 48 |
| label_len | int | Start token length | 48 |
| pred_len | int | Prediction sequence length | 48 |
| enc_in | int | Encoder input size (e.g., node number) | 295 |
| dec_in | int | Decoder input size (e.g., node number) | 295 |
| c_out | int | Output size (e.g., node number) | 295 |
| d_model | int | Dimension of the model | 32 |
| n_heads | int | Number of heads | 16 |
| e_layers | int | Number of encoder layers | 2 |
| d_layers | int | Number of decoder layers | 1 |
| d_ff | int | Dimension of the FCN | 32 |
| llm_layers | int | Number of LLM layers | 6 |
| train_epochs | int | Number of training epochs | 50 |
| align_epochs | int | Number of alignment epochs | 10 |
| alpha | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.1 |
| beta | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.2 |
| gamma | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.5 |
| theta | float | Adjustable parameter to control hyperSTLLM or STLLM | 0.2 |
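
Several of these parameters must agree with each other: llm_dim depends on llm_model, and node_num depends on dataset. The sketch below illustrates that relationship with a small argparse parser and a consistency check. It covers only a subset of the options, and `build_parser` and `check_consistency` are hypothetical names for illustration, not the repository's actual entry point.

```python
import argparse

# Expected values, taken from the parameter list above.
LLM_DIMS = {
    "BERT": 768, "GPT2": 768, "GPT3": 4096,
    "LLAMA1b": 2048, "LLAMA7b": 4096, "LLAMA8b": 4096, "deepseek2b": 1536,
}
NODE_NUMS = {"inflow": 295, "outflow": 295, "PEMS03": 358, "BJ": 500, "METR": 207}


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering a subset of the documented parameters."""
    p = argparse.ArgumentParser(description="STH-SepNet options (illustrative subset)")
    p.add_argument("--model", type=str, default="pool")
    p.add_argument("--dataset", type=str, default="inflow")
    p.add_argument("--node_num", type=int, default=295)
    p.add_argument("--llm_model", type=str, default="BERT")
    p.add_argument("--llm_dim", type=int, default=768)
    p.add_argument("--fusion_gate", type=str, default="adaptive",
                   choices=["adaptive", "attentiongate", "lstmgate", "hyperstgnn"])
    p.add_argument("--seq_len", type=int, default=48)
    p.add_argument("--pred_len", type=int, default=48)
    return p


def check_consistency(args: argparse.Namespace) -> list:
    """Return a list of messages for option pairs that do not match the table."""
    problems = []
    expected_dim = LLM_DIMS.get(args.llm_model)
    if expected_dim is not None and args.llm_dim != expected_dim:
        problems.append(
            f"llm_dim={args.llm_dim} but {args.llm_model} expects {expected_dim}")
    # Custom datasets are allowed, so unknown names are simply not checked.
    expected_nodes = NODE_NUMS.get(args.dataset)
    if expected_nodes is not None and args.node_num != expected_nodes:
        problems.append(
            f"node_num={args.node_num} but {args.dataset} expects {expected_nodes}")
    return problems


if __name__ == "__main__":
    # Switching the LLM without updating llm_dim leaves the default of 768.
    args = build_parser().parse_args(["--llm_model", "LLAMA1b"])
    for message in check_consistency(args):
        print("warning:", message)
```

With the arguments in the demo, the check reports one mismatch, because llm_dim keeps its default of 768 while LLAMA1b expects 2048.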

2.2 Training STH-SepNet

Demonstration scripts are provided under the folder ./scripts. For example, to evaluate on the BIKE datasets, run:

sh ./scripts/BIKE/BERT_Bike_order.sh
sh ./scripts/BIKE/GPT2_Bike_order.sh
sh ./scripts/BIKE/GPT3_Bike_order.sh
sh ./scripts/BIKE/LLAMA1B_Bike_order.sh
sh ./scripts/BIKE/LLAMA7B_Bike_order.sh
sh ./scripts/BIKE/LLAMA8B_Bike_order.sh
sh ./scripts/BIKE/Deepseek_Bike_order.sh

2.3 Training STH-SepNet-GNN

Demonstration scripts are provided under the folder ./scripts. For example, to evaluate on the BIKE datasets, run:

sh ./scripts/BIKE/BERT_Bike.sh
sh ./scripts/BIKE/GPT2_Bike.sh
sh ./scripts/BIKE/GPT3_Bike.sh
sh ./scripts/BIKE/LLAMA1B_Bike.sh
sh ./scripts/BIKE/LLAMA7B_Bike.sh
sh ./scripts/BIKE/LLAMA8B_Bike.sh
sh ./scripts/BIKE/Deepseek_Bike.sh

3. Ablation Study

3.1 STH-SepNet-without LLMs (w/o) (Undecoupled Version)

For example, to evaluate on the BIKE datasets, set --fusion_gate to hyperstgnn. This option runs the fully integrated adaptive hypergraph spatio-temporal prediction (without LLMs).

3.2 STH-SepNet-Mixorder

Demonstration scripts are provided under the folder ./scripts. For example, to evaluate on the BIKE datasets, run:

sh ./scripts/BIKE/BERT_Bike_mixorder3.sh
sh ./scripts/BIKE/GPT2_Bike_mixorder3.sh
sh ./scripts/BIKE/GPT3_Bike_mixorder3.sh
sh ./scripts/BIKE/LLAMA1B_Bike_mixorder3.sh
sh ./scripts/BIKE/LLAMA7B_Bike_mixorder3.sh
sh ./scripts/BIKE/LLAMA8B_Bike_mixorder3.sh
sh ./scripts/BIKE/Deepseek_Bike_mixorder3.sh
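
The BIKE script names in Sections 2.2, 2.3, and 3.2 follow one pattern: a model prefix, `_Bike`, and a variant suffix. The sketch below generates those paths and reports any that are missing from a checkout. The helper names `bike_scripts` and `missing_scripts` are hypothetical, and the pattern is inferred only from the scripts listed in this README.

```python
from pathlib import Path

# Model prefixes as they appear in the script names listed in this README.
MODEL_PREFIXES = ["BERT", "GPT2", "GPT3", "LLAMA1B", "LLAMA7B", "LLAMA8B", "Deepseek"]

# Script-name suffix per variant, as listed in Sections 2.2, 2.3, and 3.2.
VARIANT_SUFFIXES = {
    "STH-SepNet": "_order",
    "STH-SepNet-GNN": "",
    "STH-SepNet-Mixorder": "_mixorder3",
}


def bike_scripts(variant: str, scripts_root: str = "./scripts") -> list:
    """Build the BIKE script paths for one variant (hypothetical helper)."""
    suffix = VARIANT_SUFFIXES[variant]
    return [f"{scripts_root}/BIKE/{prefix}_Bike{suffix}.sh" for prefix in MODEL_PREFIXES]


def missing_scripts(paths: list) -> list:
    """Return the paths that do not exist on disk."""
    return [p for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    for variant in VARIANT_SUFFIXES:
        print(f"# {variant}")
        for script in bike_scripts(variant):
            print(f"sh {script}")
```

The printed lines can be pasted into a shell, or `missing_scripts` can be called first to confirm that every script is present.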

3.3 STH-SepNet-Effective Order on Adaptive Hypergraph

Demonstration scripts are provided under the folder ./scripts. For example, to evaluate on the BIKE and PEMS03 datasets, run:

sh ./scripts/BIKE/BERT_Bike_Outflow_flexible_order3.sh
sh ./scripts/PEMS/BERT_PEMS03_flexible_order.sh

3.4 STH-SepNet-Fusion Mechanism between Spatial and Temporal Features

The fusion mechanism can be specified using the --fusion_gate argument. The available options are:

  • adaptive: Dynamically adjusts the weight of time and spatial features.
  • attentiongate: Considers the internal relationship between the two features.
  • lstmgate: Captures the dependence of space on temporal features.
  • hyperstgnn: Fully integrated adaptive hypergraph spatio-temporal prediction (without LLMs).
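
To make the first option more concrete, the sketch below shows one common form of gated fusion, in which a sigmoid gate computed from both feature vectors mixes them elementwise. It is a generic illustration in plain Python, not the repository's implementation. The function `gated_fusion` and the parameters `w_t`, `w_s`, and `bias` are hypothetical names.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(temporal, spatial, w_t, w_s, bias):
    """Fuse two equal-length feature vectors with an elementwise sigmoid gate.

    gate_i = sigmoid(w_t[i] * temporal[i] + w_s[i] * spatial[i] + bias[i])
    out_i  = gate_i * temporal[i] + (1 - gate_i) * spatial[i]
    """
    if not (len(temporal) == len(spatial) == len(w_t) == len(w_s) == len(bias)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for t, s, wt, ws, b in zip(temporal, spatial, w_t, w_s, bias):
        gate = sigmoid(wt * t + ws * s + b)
        # A gate near 1 favors the temporal feature, near 0 the spatial one.
        fused.append(gate * t + (1.0 - gate) * s)
    return fused


if __name__ == "__main__":
    temporal = [1.0, 2.0]
    spatial = [3.0, 4.0]
    zeros = [0.0, 0.0]
    # With zero weights and bias the gate is 0.5, so the result is the average.
    print(gated_fusion(temporal, spatial, zeros, zeros, zeros))
```

In a trained model the gate parameters would be learned, typically by a linear layer over the concatenated features, so the mixing weight can change per node and per time step.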

4. Performance and Visualization

Further Reading

Our baseline models are based on the following works and their code repositories.

STG4Traffic: A Survey and Benchmark of Spatial-Temporal Graph Neural Networks for Traffic Prediction. [Paper][Code].

@article{DBLP:journals/corr/abs-2307-00495,
  author       = {Xunlian Luo and Chunjiang Zhu and Detian Zhang and Qing Li},
  title        = {STG4Traffic: {A} Survey and Benchmark of Spatial-Temporal Graph Neural
                  Networks for Traffic Prediction},
  journal      = {CoRR},
  volume       = {abs/2307.00495},
  year         = {2023}
}

Deep Time Series Models: A Comprehensive Survey and Benchmark. [Paper][Code].

@article{wang2024tssurvey,
  title={Deep Time Series Models: A Comprehensive Survey and Benchmark},
  author={Yuxuan Wang and Haixu Wu and Jiaxiang Dong and Yong Liu and Mingsheng Long and Jianmin Wang},
  journal={arXiv preprint arXiv:2407.13278},
  year={2024},
}