FairGKD

April 10, 2024

A PyTorch implementation of "The Devil is in the Data: Learning Fair Graph Neural Networks via Partial Knowledge Distillation".

Overview

Graph neural networks (GNNs) have been shown to be unfair: they tend to make discriminatory decisions toward certain demographic groups, defined by sensitive attributes such as gender and race. Although recent work has sought to improve their fairness, most methods require access to demographic information. This requirement greatly limits their applicability in real-world scenarios, where legal restrictions can make such information inaccessible. To address this problem, we present FairGKD, a demographic-agnostic method for learning fair GNNs via knowledge distillation. FairGKD is motivated by our empirical observation on partial data training (training on only node attributes or only topology).

FairGKD consists of a synthetic teacher $f_{t}$ and a GNN student model $f_{s}$. The student $f_{s}$ is a GNN classifier for the node classification task that mimics the output of $f_{t}$. The synthetic teacher $f_{t}$ aims to distill fair and informative knowledge $H$ for the student model. Specifically, $f_{t}$ comprises two fairness experts, $f_{tm}$ and $f_{tg}$, and a projector $f_{tp}$. The experts $f_{tm}$ and $f_{tg}$ are trained on only node attributes and only topology, respectively, which alleviates higher-level biases without requiring access to sensitive attributes. Because of this partial data training, $f_{tm}$ and $f_{tg}$ may generate fair yet uninformative node representations, denoted by $H_{tm}$ and $H_{tg}$. To bridge this gap, the projector $f_{tp}$ combines these representations and maps them to an informative representation $H$, which serves as additional supervision for training $f_{s}$. By mimicking the fair and informative representation $H$, $f_{s}$ tends to generate fair node representations while preserving utility.
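The data flow described above can be sketched in plain PyTorch. This is a minimal illustration, not the code in this repository: the class names, the dense-adjacency GCN, the use of free node embeddings as input to the topology-only expert, the MSE distillation term, and the weight `gamma` are all assumptions made for the example. Training of the experts and the projector is omitted; only the forward pass and a possible student loss are shown.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def normalize_adj(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    adj = adj + torch.eye(adj.size(0))
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
    return deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)


class DenseGCN(nn.Module):
    """Two-layer GCN on a dense normalized adjacency (stand-in for a PyG model)."""

    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, hid_dim)

    def forward(self, x, adj_norm):
        h = F.relu(adj_norm @ self.lin1(x))
        return adj_norm @ self.lin2(h)


class SyntheticTeacher(nn.Module):
    """Hypothetical f_t: experts f_tm and f_tg followed by projector f_tp."""

    def __init__(self, num_nodes, in_dim, hid_dim):
        super().__init__()
        # f_tm sees node attributes only (an MLP, so no edges are used).
        self.f_tm = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, hid_dim))
        # f_tg sees topology only. ASSUMPTION: free node embeddings replace
        # the real attributes as GCN input.
        self.node_emb = nn.Parameter(torch.randn(num_nodes, hid_dim))
        self.f_tg = DenseGCN(hid_dim, hid_dim)
        # f_tp maps the concatenated expert outputs to H.
        self.f_tp = nn.Sequential(
            nn.Linear(2 * hid_dim, hid_dim), nn.ReLU(), nn.Linear(hid_dim, hid_dim))

    def forward(self, x, adj_norm):
        h_tm = self.f_tm(x)                        # H_tm
        h_tg = self.f_tg(self.node_emb, adj_norm)  # H_tg
        return self.f_tp(torch.cat([h_tm, h_tg], dim=1))  # H


class Student(nn.Module):
    """Hypothetical f_s: GCN encoder plus a linear node classifier."""

    def __init__(self, in_dim, hid_dim, num_classes):
        super().__init__()
        self.encoder = DenseGCN(in_dim, hid_dim)
        self.classifier = nn.Linear(hid_dim, num_classes)

    def forward(self, x, adj_norm):
        h_s = self.encoder(x, adj_norm)
        return h_s, self.classifier(h_s)


def student_loss(logits, labels, h_s, h_teacher, gamma=0.5):
    """Cross-entropy plus gamma * MSE to the detached teacher representation.

    ASSUMPTION: MSE and gamma are illustrative choices for the distillation term.
    """
    return F.cross_entropy(logits, labels) + gamma * F.mse_loss(h_s, h_teacher.detach())


# Toy graph: 6 nodes, 4 attributes, 2 classes (random data, illustration only).
torch.manual_seed(0)
x = torch.randn(6, 4)
adj = torch.bernoulli(torch.full((6, 6), 0.4))
adj = ((adj + adj.t()) > 0).float().fill_diagonal_(0)  # symmetric, no self-loops
labels = torch.randint(0, 2, (6,))
adj_norm = normalize_adj(adj)

teacher = SyntheticTeacher(num_nodes=6, in_dim=4, hid_dim=8)
student = Student(in_dim=4, hid_dim=8, num_classes=2)
h = teacher(x, adj_norm)            # H from the synthetic teacher
h_s, logits = student(x, adj_norm)  # student representation and predictions
loss = student_loss(logits, labels, h_s, h)
loss.backward()                     # gradients reach the student only
```

Detaching $H$ in the loss keeps the teacher fixed while the student is updated, which matches the description of $H$ as additional supervision for $f_{s}$. Note that no sensitive attribute appears anywhere in the sketch.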

Requirements

  • python==3.7.9
  • numpy==1.21.6
  • torch==1.13.1
  • torch-cluster==1.5.9
  • torch_geometric==2.0.4
  • torch-scatter==2.0.6
  • torch-sparse==0.6.9
  • CUDA 11.7

Reproduction

To reproduce our results, please run:

bash run.sh