BERT4ETH (PyTorch Version)
September 18, 2025
This is the PyTorch implementation of the paper "BERT4ETH: A Pre-trained Transformer for Ethereum Fraud Detection", accepted at the ACM Web Conference (WWW) 2023.
The fine-tuning implementation for phishing data was contributed by Qiustander (https://github.com/Qiustander). Thanks a lot!
If you find this repository useful, please give us a star and cite our paper : ) Thank you!
Getting Started
Requirements
PyTorch > 1.12.0
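To confirm that an installed PyTorch satisfies the requirement above, a small check like the following may help. It is only a sketch: `parse_version` and `meets_requirement` are hypothetical helpers written for this README, not part of the repository, and they assume the version string starts with dotted numbers (local suffixes such as `+cu117` are ignored).

```python
import re

MIN_TORCH = (1, 12, 0)  # taken from the requirement line above


def parse_version(version: str) -> tuple:
    """Return the leading numeric components of a version string as ints.

    A missing patch component is treated as 0, and anything after the
    numeric prefix (for example "+cu117") is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_requirement(version: str, minimum: tuple = MIN_TORCH) -> bool:
    """Return True if `version` is strictly newer than `minimum`."""
    return parse_version(version) > minimum


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "OK" if meets_requirement(torch.__version__) else "too old"
        print(torch.__version__, status)
```

The comparison is strict because the requirement is written as `>` rather than `>=`.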
Preprocess dataset
Step 1: Download the dataset from Google Drive.
- Transaction Dataset:
Step 2: Unzip the dataset under the "BERT4ETH_PyTorch/data/" directory.
cd BERT4ETH_PyTorch/data; # Labels are already included
unzip ...;
Pre-training
Step 1: Transaction Sequence Generation
cd src;
python gen_seq.py --bizdate=bert4eth_exp
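For readers unfamiliar with the idea, the following sketch illustrates what building per-account transaction sequences can look like: transactions are grouped by account and ordered by time. It is not the code in `gen_seq.py`. The `Tx` record layout, the `"IN"`/`"OUT"` tags, and the `build_sequences` function are assumptions made for illustration, and the real dataset schema may differ.

```python
from collections import defaultdict
from typing import NamedTuple


class Tx(NamedTuple):
    # Hypothetical record layout; the real dataset schema may differ.
    sender: str
    receiver: str
    timestamp: int
    value: float


def build_sequences(transactions):
    """Group transactions by account and sort each group chronologically.

    Every transaction is recorded twice: as "OUT" in the sender's
    sequence and as "IN" in the receiver's sequence.
    """
    sequences = defaultdict(list)
    for tx in transactions:
        sequences[tx.sender].append((tx.timestamp, "OUT", tx.receiver, tx.value))
        sequences[tx.receiver].append((tx.timestamp, "IN", tx.sender, tx.value))
    for events in sequences.values():
        events.sort(key=lambda event: event[0])  # oldest first
    return dict(sequences)


if __name__ == "__main__":
    txs = [Tx("0xA", "0xB", 20, 1.0), Tx("0xC", "0xA", 10, 0.5)]
    for account, events in build_sequences(txs).items():
        print(account, events)
```

Each account ends up with one time-ordered list of events, which is the kind of input a sequence model consumes.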
Step 2: Pre-train BERT4ETH
python run_pretrain.py --bizdate="bert4eth_exp" \
--ckpt_dir="bert4eth_exp"
Step 3: Output Representation
python run_embed.py --bizdate="bert4eth_exp" \
--init_checkpoint="bert4eth_exp/xxx.pth"
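The `xxx.pth` above stands for whichever checkpoint file pre-training produced. If you are unsure which file to pass, a helper along these lines could locate the most recently written one. This is a sketch under assumptions: `latest_checkpoint` is a hypothetical function, not part of the repository, and it assumes checkpoints match a glob pattern such as `*.pth` inside the checkpoint directory.

```python
from pathlib import Path
from typing import Optional


def latest_checkpoint(ckpt_dir: str, pattern: str = "*.pth") -> Optional[Path]:
    """Return the most recently modified file matching `pattern`, or None.

    Files are compared by modification time, so no particular naming
    scheme is assumed for the checkpoints.
    """
    candidates = [p for p in Path(ckpt_dir).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # "bert4eth_exp" mirrors the --ckpt_dir value used in Step 2.
    print(latest_checkpoint("bert4eth_exp"))
```

The printed path can then be passed to `--init_checkpoint`.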
Evaluation
Phishing Account Detection
cd eval
python phish_detection_mlp.py --input_dir="../outputs/xxx"
De-anonymization (ENS dataset)
python run_dean_ENS.py --metric=euclidean \
--init_checkpoint=bert4eth_exp/model_104000
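As a rough illustration of what `--metric=euclidean` implies, the sketch below ranks candidate accounts by the Euclidean distance between their embeddings and a query embedding. It is not the implementation in `run_dean_ENS.py`. The `rank_by_euclidean` function and the dictionary layout are assumptions chosen to keep the example self-contained.

```python
import math


def rank_by_euclidean(query, candidates):
    """Return candidate ids sorted from nearest to farthest from `query`.

    `candidates` maps an account id to its embedding vector, which must
    have the same length as `query`.
    """
    distances = {
        account: math.dist(query, embedding)
        for account, embedding in candidates.items()
    }
    return sorted(distances, key=distances.get)


if __name__ == "__main__":
    embeddings = {"a": [0.0, 0.0], "b": [3.0, 4.0], "c": [1.0, 0.0]}
    print(rank_by_euclidean([0.0, 0.0], embeddings))
```

Accounts whose embeddings lie closest to the query appear first in the ranking, which is how nearest-neighbor matching is usually evaluated.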
Fine-tuning for phishing account detection
Will be updated later.
Citation
@inproceedings{hu2023bert4eth,
title={BERT4ETH: A Pre-trained Transformer for Ethereum Fraud Detection},
author={Hu, Sihao and Zhang, Zhen and Luo, Bingqiao and Lu, Shengliang and He, Bingsheng and Liu, Ling},
booktitle={Proceedings of the ACM Web Conference 2023},
pages={2189--2197},
year={2023}
}
Q&A
If you have any questions, please open an issue or email me (sihaohu@gatech.edu). I will reply as soon as I see it.