PyTorch Geometric Loader

April 25, 2026



PyTorch Geometric (>=2.2.0) provides an easy-to-use dataset loader for DGraphFin. The example below loads DGraphFin and retrieves the train/valid/test masks in a few lines of code.

import torch_geometric

# Check your torch_geometric version; it must be 2.2.0 or later.
print(torch_geometric.__version__)
# 2.2.0

# Download the dataset file 'DGraphFin.zip' from 'https://dgraph.xinye.com' and place it under './dataset/raw'.
# Otherwise, loading fails with the error: "Dataset not found. Please download 'DGraphFin.zip' from 'https://dgraph.xinye.com' and move it to './raw'"
from torch_geometric.datasets import DGraphFin

dataset = DGraphFin(root='./dataset')
data = dataset[0]

print(data)
# Data(x=[3700550, 17], edge_index=[2, 4300999], y=[3700550], edge_type=[4300999], edge_time=[4300999], train_mask=[3700550], val_mask=[3700550], test_mask=[3700550])

Note: Before running the example, download the dataset file 'DGraphFin.zip' from 'https://dgraph.xinye.com' and place it under the directory './dataset/raw'. Otherwise, loading fails with the error "Dataset not found. Please download 'DGraphFin.zip' from 'https://dgraph.xinye.com' and move it to './raw'".
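
The placement rule in the note can be checked before the loader is called. The sketch below assumes the layout `<root>/raw/DGraphFin.zip` described in the note; the helper names `raw_archive_path` and `ensure_raw_archive` are hypothetical and are not part of PyTorch Geometric.

```python
from pathlib import Path

# File name and layout taken from the note: <root>/raw/DGraphFin.zip
ARCHIVE_NAME = "DGraphFin.zip"


def raw_archive_path(root):
    """Return the location where the archive is expected for a given root."""
    return Path(root) / "raw" / ARCHIVE_NAME


def ensure_raw_archive(root):
    """Hypothetical pre-flight check that fails early with a readable message."""
    path = raw_archive_path(root)
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected {ARCHIVE_NAME!r} at {path}. Download it from "
            "https://dgraph.xinye.com and place it there."
        )
    return path
```

Calling `ensure_raw_archive('./dataset')` just before `DGraphFin(root='./dataset')` would surface a missing download before any processing starts.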

Baselines

This repo provides a collection of baselines for DGraphFin. Please download the dataset file from our website and place it under the folder './dataset/DGraphFin/raw'.

Environments

Implementation environment:

  • numpy = 1.21.2

  • pytorch = 1.6.0

  • torch_geometric = 1.7.2

  • torch_scatter = 2.0.8

  • torch_sparse = 0.6.9

  • GPU: Tesla V100 32G
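
To compare a local setup against the versions listed above, a small check along these lines may help. It is a sketch rather than part of the repo: the import names (for example `torch` for the "pytorch" entry) and the reliance on each module exposing `__version__` are assumptions, and the helper names are hypothetical.

```python
import importlib

# Versions from the list above, keyed by assumed import name.
EXPECTED = {
    "numpy": "1.21.2",
    "torch": "1.6.0",
    "torch_geometric": "1.7.2",
    "torch_scatter": "2.0.8",
    "torch_sparse": "0.6.9",
}


def installed_version(module_name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except (ImportError, OSError):
        return None
    return getattr(module, "__version__", None)


def compare_versions(expected, lookup=installed_version):
    """Map each module name to 'ok', 'missing', or 'mismatch (found X)'."""
    report = {}
    for name, wanted in expected.items():
        found = lookup(name)
        if found is None:
            report[name] = "missing"
        # Builds such as '1.6.0+cu101' carry a local suffix after '+'.
        elif found.split("+")[0] == wanted:
            report[name] = "ok"
        else:
            report[name] = f"mismatch (found {found})"
    return report
```

`compare_versions(EXPECTED)` returns one status per package; a mismatch does not necessarily mean the baselines will fail, only that the setup differs from the one used for the reported numbers.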

Training

To reproduce the performance of each model, run the corresponding command:

  • MLP
python gnn.py --model mlp --dataset DGraphFin --epochs 200 --runs 10 --device 0
  • GCN
python gnn.py --model gcn --dataset DGraphFin --epochs 200 --runs 10 --device 0
  • GraphSAGE
python gnn.py --model sage --dataset DGraphFin --epochs 200 --runs 10 --device 0
  • GraphSAGE (NeighborSampler)
python gnn_mini_batch.py --model sage_neighsampler --dataset DGraphFin --epochs 200 --runs 10 --device 0
  • GAT (NeighborSampler)
python gnn_mini_batch.py --model gat_neighsampler --dataset DGraphFin --epochs 200 --runs 10 --device 0
  • GATv2 (NeighborSampler)
python gnn_mini_batch.py --model gatv2_neighsampler --dataset DGraphFin --epochs 200 --runs 10 --device 0
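
The six commands above differ only in the script and the `--model` value, so they can be driven from one loop. The following is a minimal sketch under that observation; the helper names `build_command` and `run_all` are hypothetical, and the flags are copied from the commands listed above.

```python
import subprocess
import sys

# (script, model) pairs copied from the commands listed above.
BASELINES = [
    ("gnn.py", "mlp"),
    ("gnn.py", "gcn"),
    ("gnn.py", "sage"),
    ("gnn_mini_batch.py", "sage_neighsampler"),
    ("gnn_mini_batch.py", "gat_neighsampler"),
    ("gnn_mini_batch.py", "gatv2_neighsampler"),
]


def build_command(script, model, epochs=200, runs=10, device=0):
    """Assemble one training command as an argument list."""
    return [
        sys.executable, script,
        "--model", model,
        "--dataset", "DGraphFin",
        "--epochs", str(epochs),
        "--runs", str(runs),
        "--device", str(device),
    ]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for script, model in BASELINES:
        cmd = build_command(script, model)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing run
```

`run_all()` only prints the commands; `run_all(dry_run=False)` runs them one after another from the directory that contains the two scripts.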

Performance

The table below reports the performance of each baseline on DGraphFin (10 runs):

| Methods | Train AUC | Valid AUC | Test AUC |
| --- | --- | --- | --- |
| MLP | 0.7221 ± 0.0014 | 0.7135 ± 0.0010 | 0.7192 ± 0.0009 |
| GCN | 0.7108 ± 0.0027 | 0.7078 ± 0.0027 | 0.7078 ± 0.0023 |
| GraphSAGE | 0.7682 ± 0.0014 | 0.7548 ± 0.0013 | 0.7621 ± 0.0017 |
| GraphSAGE (NeighborSampler) | 0.7845 ± 0.0013 | 0.7674 ± 0.0005 | 0.7761 ± 0.0018 |
| GAT (NeighborSampler) | 0.7396 ± 0.0018 | 0.7233 ± 0.0012 | 0.7333 ± 0.0024 |
| GATv2 (NeighborSampler) | 0.7698 ± 0.0083 | 0.7526 ± 0.0089 | 0.7624 ± 0.0081 |
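
Each entry above summarizes 10 runs as a center value and a spread. The sketch below shows one way such a summary can be produced, assuming the "±" term is a standard deviation; the source does not say which estimator was used, so the sample standard deviation here is an assumption. The scores in `example_runs` are placeholders for illustration, not results from this repo.

```python
from statistics import mean, stdev


def summarize(aucs):
    """Format per-run AUC scores as 'mean ± std' with four decimals.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return f"{mean(aucs):.4f} ± {stdev(aucs):.4f}"


# Placeholder scores for illustration only; not results from this repo.
example_runs = [0.70, 0.72, 0.74]
print(summarize(example_runs))
```

With real per-run AUCs in place of the placeholders, the same function would yield strings in the format used in the table.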

Datasets Overview

To advance research on graph foundation models and the evolution of dynamic graphs, we continue to release real-world graph datasets. Three large-scale datasets are currently publicly available, with details provided below.

Summary

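| Name | Evolving Pattern Modeling | Graph Anomaly Detection | Pretraining of Dynamic Graph |
| --- | --- | --- | --- |
| DGraph-Fin | | | |
| DGraph-Fin2 | | | |
| DGraph-Nearby | | | |

| Name | Type | Time Unit | Time span | # Node | # Edge | Is Directed | Node label | Edge label | Temporal node label |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DGraph-Fin | Social Network | Day | ~ 2 Years | 3,700,550 | 4,300,999 | | | | |
| DGraph-Fin2 | Social Network | Day | ~ 2 Years | 3,700,550 | 4,300,999 | | | | |
| DGraph-Nearby | Social Network | Day | ~ 3 Years | 21,796 | 190,332 | | | | |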