A2J-Transformer
January 19, 2026
Introduction
This is the official implementation for the paper, "A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image", CVPR 2023.
Paper link here: A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image
🔥🔥🔥 The extension of our paper is now available: "3D Hand Pose Estimation via Articulated Anchor-to-Joint 3D Local Regressors", published in TPAMI 2026.
🔥🔥🔥 TPAMI 2026 Paper: Paper | Code | Project
🔥🔥🔥 CVPR 2023 Paper: Paper | Code
About our code
Updates
- (2025-1-13) Some of the training files for the HANDS 2017 dataset have been updated here.
- (2023-9-19) Training code released for the InterHand 2.6M, NYU and HANDS 2017 datasets.
- (2023-9-19) Updated test.py and base.py, adding the argparse argument test_epoch. Renamed our pre-trained model from snapshot.pth.tar to snapshot_0.pth.tar.
- (2023-4-23) Deleted some compiled files in the dab_deformable_detr\ops folder.
Installation and Setup
Requirements
- Our code has been tested on Ubuntu 20.04 with NVIDIA 2080Ti and NVIDIA 3090 GPUs. Both PyTorch 1.7 and PyTorch 1.11 work.
- Python>=3.7
  We recommend using Anaconda to create a conda environment:
  conda create --name a2j_trans python=3.7
  Then, activate the environment:
  conda activate a2j_trans
- PyTorch>=1.7.1, torchvision>=0.8.2 (following instructions here)
  We recommend the following PyTorch and torchvision versions:
  conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch
- Other requirements
  conda install tqdm numpy matplotlib scipy
  pip install opencv-python pycocotools
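After installing, it can help to confirm that the active environment meets the minimum versions listed above. The following is a minimal sketch, not part of the repository: `parse_version` and `check_requirements` are hypothetical helper names, and the check assumes each package exposes a `__version__` attribute (as `torch` and `torchvision` do).

```python
import importlib


def parse_version(text):
    """Turn '1.7.1+cu110' into (1, 7, 1); local and non-numeric suffixes are ignored."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_requirements(minimums):
    """Return {module_name: problem} for modules that are missing or too old."""
    problems = {}
    for module_name, minimum in minimums.items():
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            problems[module_name] = "not installed"
            continue
        # Assumption: the module reports its version via __version__.
        installed = getattr(module, "__version__", "0")
        if parse_version(installed) < parse_version(minimum):
            problems[module_name] = f"found {installed}, need >= {minimum}"
    return problems


if __name__ == "__main__":
    # Minimum versions taken from the requirements listed above.
    result = check_requirements({"torch": "1.7.1", "torchvision": "0.8.2"})
    print(result or "requirements satisfied")
```

An empty result means both packages were importable and at least as new as the stated minimums.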
Compiling CUDA operators (following Deformable-DETR)
cd ./dab_deformable_detr/ops
sh make.sh
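Once the build finishes, a quick way to see whether the compiled operator landed in the environment is to ask Python whether it can locate the module. This is a sketch under one assumption: the module name `MultiScaleDeformableAttention` is the one used by the upstream Deformable-DETR build script and is assumed to be unchanged in this repository. `extension_is_built` is a hypothetical helper name.

```python
import importlib.util


def extension_is_built(module_name="MultiScaleDeformableAttention"):
    """Return True if a module with this name can be located on the import path.

    The default name is an assumption carried over from upstream Deformable-DETR.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if extension_is_built():
        print("CUDA operators found")
    else:
        print("CUDA operators not found; re-run make.sh inside the activated environment")
```

Note that this only confirms the module can be found. It does not load it, so a mismatch between the build and the installed PyTorch or CUDA version would still surface only when the module is actually imported.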
Usage
Dataset preparation
- Please download the InterHand 2.6M dataset and organize it as follows:
  your_dataset_path/
  └── Interhand2.6M_5fps/
      ├── annotations/
      └── images/
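The layout above can be checked before running anything expensive. The sketch below is illustrative only: `missing_dataset_dirs` is a hypothetical helper, and it verifies just the two sub-folders named above, not their contents.

```python
from pathlib import Path


def missing_dataset_dirs(dataset_root):
    """Return the expected InterHand 2.6M folders that are absent under dataset_root."""
    root = Path(dataset_root) / "Interhand2.6M_5fps"
    expected = [root / "annotations", root / "images"]
    return [str(path) for path in expected if not path.is_dir()]


if __name__ == "__main__":
    # "your_dataset_path" is the placeholder used in the layout above.
    missing = missing_dataset_dirs("your_dataset_path")
    print("dataset layout OK" if not missing else f"missing: {missing}")
```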
Testing on InterHand 2.6M Dataset
- Please download our pre-trained model and organize the code as follows:
  a2j-transformer/
  ├── dab_deformable_detr/
  ├── nets/
  ├── utils/
  ├── ...py
  ├── datalist/
  │   └── ...pkl
  └── output/
      └── model_dump/
          └── snapshot_0.pth.tar
  The datalist folder and its pkl files hold the dataset lists generated while running the code.
- In config.py, set interhand_anno_dir and interhand_images_path to the absolute dataset directories.
- In config.py, set cur_dir to the a2j-transformer code directory.
- Run the following script:
  python test.py --gpu <your_gpu_ids>
  Alternatively, you can change gpu_ids in test.py.
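The steps above depend on three path settings and one checkpoint file, so a small pre-flight check can catch a wrong path before test.py starts. The sketch below is not part of the repository. `checkpoint_path` and `preflight` are hypothetical helpers, and the `snapshot_<test_epoch>.pth.tar` naming is an assumption inferred from the `snapshot_0.pth.tar` file name and the `test_epoch` argument. Pass the same values you set in config.py.

```python
import os


def checkpoint_path(code_dir, test_epoch=0):
    """Expected checkpoint location: output/model_dump/snapshot_<epoch>.pth.tar (assumed naming)."""
    return os.path.join(code_dir, "output", "model_dump", f"snapshot_{test_epoch}.pth.tar")


def preflight(cur_dir, interhand_anno_dir, interhand_images_path, test_epoch=0):
    """Return a list of human-readable problems; an empty list means the paths look usable."""
    problems = []
    settings = {
        "cur_dir": cur_dir,
        "interhand_anno_dir": interhand_anno_dir,
        "interhand_images_path": interhand_images_path,
    }
    for name, value in settings.items():
        if not os.path.isabs(value):
            problems.append(f"{name} is not an absolute path: {value}")
        elif not os.path.isdir(value):
            problems.append(f"{name} does not exist: {value}")
    ckpt = checkpoint_path(cur_dir, test_epoch)
    if not os.path.isfile(ckpt):
        problems.append(f"checkpoint not found: {ckpt}")
    return problems
```

Calling `preflight(...)` with the three configured paths and printing the returned list gives a short report of anything that still needs fixing.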
NYU and HANDS 2017 dataset
- Please prepare the datasets with our code, following A2J.
- Once the datasets are prepared, run nyu.py and hands2017.py.
Cite
Our code is protected by patents and cannot be used for commercial purposes. If you have commercial needs, please contact Prof. Yang Xiao (Huazhong University of Science and Technology): Yang_Xiao@hust.edu.cn.
If you find our work useful in your research or publication, please cite our work:
@inproceedings{jiang2023a2j,
title={A2J-Transformer: Anchor-to-Joint Transformer Network for 3D Interacting Hand Pose Estimation from a Single RGB Image},
author={Jiang, Changlong and Xiao, Yang and Wu, Cunlin and Zhang, Mingyang and Zheng, Jinghong and Cao, Zhiguo and Zhou, Joey Tianyi},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={8846--8855},
year={2023}
}