[KDD-24] ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation
January 18, 2025 · View on GitHub
[News 202409] Our work has received a lot of feedbacks from the KDD community in Barcelona. Many thanks for your suggestions!
[News 202408] Our work has reported by “时空探索之旅”. Feel free to discuss it with us!
‼️[News 202407][PyPOTS implementation of ImputeFormer is now available inside PyPOTS, an open-source Python library for reality-centric machine/deep learning and data mining on partially-observed time series. The ecosystem of PyPOTS supports a comprehensive benchmark of imputation tasks.
‼️Our official implementation based on Torch Spatiotemporal is available now!!
Imputeformer in a nutshell
Our motivation: (a) The distribution of singular values in spatiotemporal data is long-tailed. The existence of missing data can increase its rank (or singular values). (b) Low-rank models can filter out informative signals and generate a smooth reconstruction, resulting in truncating too much energy in the left part of its spectrum. (c) Deep models can preserve high-frequency noise and generate sharp imputations, maintaining too much energy for the right part of the singular spectrum. With the generality of low-rank models and the expressivity of deep models, ImputeFormer achieves a signal-noise balance for accurate imputation.
Directory structure
The directory is structured as follows:
.
├── config/
│ ├── imputation/
│ │── Imputeformer.yaml
│ │── brits.yaml
│ │── grin.yaml
│ │── saits.yaml
│ │── spin.yaml
│ └── transformer.yaml
├── experiments/
│ └── run_spatiotemporal_imputation.py
├── Imputeformer/
│ ├── baselines/
│ ├── imputers/
│ ├── layers/
│ ├── models/
│ └── ...
├── conda_env.yaml
└── tsl_config.yaml
Get Started
‼️ Note that we now include the modified tsl package in this repository so there is no need to install tsl Following the instructions in SPIN and tsl, the project dependencies can be installed:
conda env create -f conda_env.yml
conda activate imputeformer
Run experiments
The experiment scripts are in the experiments folder.
-
run_imputation.pyis used to run models including both ImputeFormer and baselines. An example of usage isconda activate imputeformer python ./experiments/run_spatiotemporal_imputation.py --config imputation/imputeformer_la.yaml --model-name imputeformer --dataset-name la_block -
run_inference.pyis used for inference only using pre-trained models. An example of usage isconda activate imputeformer python ./experiments/run_inference.py --config inference.yaml --model-name imputeformer --dataset-name la_point --exp-name {exp_name}
Bibtex reference
If you find this code useful please consider to cite our paper:
@inproceedings{10.1145/3637528.3671751,
author = {Nie, Tong and Qin, Guoyang and Ma, Wei and Mei, Yuewen and Sun, Jian},
title = {ImputeFormer: Low Rankness-Induced Transformers for Generalizable Spatiotemporal Imputation},
year = {2024},
isbn = {9798400704901},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3637528.3671751},
doi = {10.1145/3637528.3671751},
booktitle = {Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
pages = {2260–2271},
numpages = {12},
keywords = {data imputation, low-rank modeling, missing data, spatiotemporal data, time series, transformers},
location = {Barcelona, Spain},
series = {KDD '24}
}
Acknowledgement
We acknowledge SPIN for providing a useful benchmark tool and training pipeline and TorchSpatiotemporal for helpful model implementations.