IMTS is Worth Time × Channel Patches: Visual Masked Autoencoders for Irregular Multivariate Time Series Prediction

May 31, 2025 · View on GitHub

The official implementation of ICML 25 paper IMTS is Worth Time × Channel Patches: Visual Masked Autoencoders for Irregular Multivariate Time Series Prediction. [pdf] [poster] [arXiv]

TL;DR

The core idea is to adapt the powerful capabilities of pre-trained visual MAEs to IMTS data via self-supervised latent space mask reconstruction for enhanced performance and few-shot capability.

The overall architecture of VIMTS. The irregularly sampled data in each channel is divided into sections with equal-intervals along the timeline. Each section undergoes intra-section feature extraction using Time-aware Convolutional Network (TTCN) and cross-channel information compensation via Graph Convolutional Networks (GCNs). These compensated patches are then fed into a pre-trained MAE for patch reconstruction, thereby modeling temporal dependencies among patches within each channel. Finally, a coarse-to-fine technique gradually generates precise predictions from patch-level to point-level. The training encompasses two stages. First, self-supervised learning aims to improve IMTS modeling by adapting the capabilities of the visual pre-trained MAE to IMTS data. Second, the supervised fine-tuning is employed to enhance forecasting performance.

Requirements

conda create -n vimts python=3.9.20

conda activate vimts

bash install.sh

Datasets Preparation

For Physionet, Human Activity, and USHCN, we utilize totally the same preprocessing pipeline as the same as t-PatchGNN. And here we provide the processed data files in /IMTS/data/.

Until the acceptance of this paper, for MIMIC, because of the PhysioNet Credentialed Health Data License and the issues in preprocessing scripts of t-PatchGNN , we have to apply for a permission for raw data through this link. And then sequentially execute the corrected notebooks admissions.ipynb, output.ipynb, labevents.ipynb and prescriptions.ipynb and mimic_prep.ipynb to get full_dataset.csv, then the python scripts can automatically process it.

Run the Model

First, cd to the IMTS directory and run the following command:

cd path/to/this/repo/IMTS

Second, run the scripts to conduct experiments on ushcn/physionet/activity/mimic dataset.

Self-Supervised Learning Stage:
```
bash scripts/{$dataset}/pretrain.sh
```
Fine-Tune Stage based on logs/physionet/pretrain/pretrain_weights/xxxx.pth (manually fill in finetune.sh):
```
bash scripts/{$dataset}/finetune.sh
```

Parameter Explanation

python run_models.py 
    --model VisionTS-B \      # The model size we choose as backbone
    --dataset ushcn \         # The dataset name
    --state 'def' \
    --history 24 \            # The length of history data
    --patience $patience \    # The patience of earlystopping
    --batch_size 64 \         # The batch size of training and inference
    --lr 1e-4 \               # The learning rate
    --patch_size 1 \          # The temporal window size for each patch
    --stride 1 \              # The temporal stride size for each sliding operation
    --nhead 1 \
    --tf_layer 1 \ 
    --nlayer 3 \              # The number of GCN layers
    --te_dim 10 \             # The dimension of time embddings
    --node_dim 10 \           # The dimension of GCN vertex embeddings
    --hid_dim 32 \            # The length of hidden features before MAE
    --train_mode fine-tune \  # [pre-train, fine-tune]
    --finetune_type norm \    # Finetune types selected from [norm, wo_ssl_ft, attn, mlp, freeze, all] 
    --mask_ratio 0.4 \        # Mask ratio for mask-reconstruction self-supervised learning
    --few_shot_ratio None \   # fewshot ratio from [None, 0.5, 0.2, 0.1], None means training with full data
    --seed 1 \                # seed for models
    --gpu 1 \                 # gpu index for use
    --log_dir finetune \      # log dir name defined by yourself

Logs of Main Experiments

/data/hzy/lvm4-ts/IMTS/logs/

Citation

@inproceedings{zhangvimts2025,
  title={IMTS is Worth Time $\times$ Channel Patches: Visual Masked Autoencoders for Irregular Multivariate Time Series Prediction},
  author={Zhangyi, Hu and Jiemin, Wu and Hua, Xu and Mingqian, Liao and Ninghui, Feng and Bo, Gao and Songning, Lai and Yutao, Yue },
  booktitle={ICML},
  year={2025}
}

Connecting

zhangyi_hu@whu.edu.cn

Acknowledgement

Some components of this code implementation are adopted from VisionTS and t-PatchGNN. We sincerely appreciate for their contributions.