Hyperbolic Graph Convolutional Auto-Encoders

June 29, 2023 ยท View on GitHub

Accepted to CVPR2021 :tada:

Official PyTorch code of Unsupervised Hyperbolic Representation Learning via Message Passing Auto-Encoders

Jiwoong Park*, Junho Cho*, Hyung Jin Chang, Jin Young Choi (* indicates equal contribution)

vis_cora Embeddings of cora dataset. GAE is Graph Auto-Encoders in Euclidean space, HGCAE is our method. P is Poincare ball, H is Hyperboloid.

Overview

This repository provides HGCAE code in PyTorch for reproducibility with

  • PoincareBall manifold
  • Link prediction task and node clustering task on graph data
    • 6 datasets: Cora, Citeseer, Wiki, Pubmed, Blog Catalog, Amazon Photo
    • Amazon Photo was downloaded via torch-geometric package.
  • Image clustering task on images
    • 2 datasets: ImageNet10, ImageNetDog
    • Image features extracted from ImageNet10, ImageNetDog with PICA image clustering algorithm
    • Mutual K-NN graph from the image features provided.
  • ImageNet-BNCR
    • We have constructed a new dataset, ImageNet-BNCR(Balanced Number of Classes across Roots), via randomly choosing 3 leaf classes per root. We chose three roots, Artifacts, Natural objects, and Animal. Thus, there exist 9 leaf classes, and each leaf class contains 1,300 images in ImageNet-BNCR dataset.
    • bncr

Installation Guide

We use docker to reproduce performance. Please refer guide.md

Usage

1. Run docker

Before training, run our docker image:

docker run --gpus all -it --rm --shm-size 100G -v $PWD:/workspace junhocho/hyperbolicgraphnn:8 bash

If you want to cache edge splits for train/val dataset and load faster afterwards, mkdir ~/tmp and run:

docker run --gpus all -it --rm --shm-size 100G -v $PWD:/workspace -v ~/tmp:/root/tmp junhocho/hyperbolicgraphnn:8 bash

2. train_<dataset>.sh

In the docker session, run each train shell script for each dataset to reproduce performance:

Run following commands to reproduce results:

  • sh script/train_cora_lp.sh
  • sh script/train_citeseer_lp.sh
  • sh script/train_wiki_lp.sh
  • sh script/train_pubmed_lp.sh
  • sh script/train_blogcatalog_lp.sh
  • sh script/train_amazonphoto_lp.sh
ROCAP
Cora0.948907030.94726805
Citeseer0.960594070.96305937
Wiki0.955108050.96200790
Pubmed0.962072120.96083080
Blog Catalog0.896839390.88651569
Amazon Photo0.982406730.97655753

Graph data node clustering

  • sh script/train_cora_nc.sh
  • sh script/train_citeseer_nc.sh
  • sh script/train_wiki_nc.sh
  • sh script/train_pubmed_nc.sh
  • sh script/train_blogcatalog_nc.sh
  • sh script/train_amazonphoto_nc.sh
ACCNMIARI
Cora0.746676510.572529400.55212928
Citeseer0.693116920.422492940.44101404
Wiki0.459459460.467778810.21517031
Pubmed0.748491150.377592620.40770875
Blog Catalog0.550615860.325573880.25227964
Amazon Photo0.781307190.696236510.60342107

Image clustering

  • sh script/train_ImageNet10.sh
  • sh script/train_ImageNetDog.sh
ACCNMIARI
ImageNet100.855923080.790191310.74181220
ImageNetDog0.387384620.360596500.22696503
  • At least 11GB VRAM is required to run on Pubmed, BlogCatalog, Amazon Photo.
  • We have used GTX 1080ti only in our experiments.
  • Other gpu architectures may not reproduce above performance.

Parameter description

  • dataset : Choose dataset. Refer to each training scripts.
  • c : Curvature of hypebolic space. Should be >0. Preferably choose from 0.1, 0.5 ,1 ,2.
  • c_trainable : 0 or 1. Train c if 1.
  • dropout : Dropout ratio.
  • weight_decay : Weight decay.
  • hidden_dim : Hidden layer dimension. Same dimension used in encoder and decoder.
  • dim : Embedding dimension.
  • lambda_rec : Input reconstruction loss weight.
  • act : relu, elu, tanh.
  • --manifold PoincareBall : Use Euclidean if training euclidean models.
  • --node-cluster 1 : If specified perform node clustering task. If not, link prediction task.

Acknowledgments

This repo is inspired by hgcn.

And some of the code was forked from the following repositories:

License

This work is licensed under the MIT License

Citation

@inproceedings{park2021unsupervised,
  title={Unsupervised Hyperbolic Representation Learning via Message Passing Auto-Encoders},
  author={Jiwoong Park and Junho Cho and Hyung Jin Chang and Jin Young Choi},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  year={2021}
}