GTM: A General Time-series Model
February 25, 2026 ยท View on GitHub
GTM: A General Time-series Model for Enhanced Representation Learning of Time-Series Data
Published as a conference paper at ICLR 2026
GTM (General Time-series Model) is a foundation model for time series analysis that advances representation learning via a novel frequency-domain attention mechanism and a unified pre-training strategy. GTM is the first generative-task-agnostic model for time series, enabling seamless adaptation to various generative tasks without any task-specific modifications.
๐ Key Features
- Frequency-Domain Attention: Novel Fourier attention mechanism that captures time-granularity-aware features
- Hybrid Pre-training: Unified reconstruction and autoregressive objectives through hybrid masking
- Generative-Task-Agnostic: Seamless adaptation to forecasting, imputation, and anomaly detection without modifications
- Scalable Architecture: Follows scaling laws with performance improving as model size and pre-training data increase
- Multi-Granularity Support: Explicitly incorporates time granularity for robust representation learning
๐๏ธ Architecture Overview
GTM follows a decoder-only Transformer architecture with specialized components for time series modeling:
Core Components
-
Input Embedding:
- Reversible Instance Normalization (RevIN)
- Channel Independence (CI)
- Patching and masking
- Linear and positional embeddings
-
N-Stack Decoder-only Backbone:
- Temporal self-attention module
- Fourier attention module for frequency-domain information
- Decoder layers with residual connections
-
Fourier Attention Module:
- Captures frequency-specific patterns
- Time granularity-aware representations
- Five low-rank modules for different granularities
- Global frequency learning module
-
Output Projection:
- Unified linear projection layer
- Instance denormalization
๐ฌ Methodology
Frequency-Domain Attention Mechanism
GTM introduces a novel Fourier attention mechanism that:
- Transforms temporal patches to frequency domain using FFT
- Learns granularity-aware representations through specialized modules
- Combines multiple frequency patterns using attention weights
- Transforms back to temporal domain using inverse FFT
Hybrid Pre-training Strategy
Our pre-training framework unifies reconstruction and autoregressive objectives:
- Random Masking: Samples patch spans and randomly permutes them
- Consecutive Tail Masking: Applies controlled proportion of consecutive masks at sequence tail
- 2D Positional Encoding: Ensures model awareness of masked span lengths
- Span Shuffling: Enhances robustness and generalization
๐ Installation
# Clone the repository
git clone https://github.com/MMTS4All/GTM.git
cd GTM
# Create conda environment (recommended)
conda create -n gtm python=3.8
conda activate gtm
# Install dependencies
pip install -r requirements.txt
๐ Usage
Pre-training
python run_pretrain.py \
--task_name pre_train \
--model_id GTM \
--data utsd \
--root_path /path/to/your/data \
--seq_len 1440 \
--patch_len 96 \
--stride 96 \
--d_model 768 \
--d_layers 12 \
--batch_size 1024 \
--learning_rate 1e-5 \
--train_epochs 30
Fine-tuning for Forecasting
python run_forecasting.py \
--task_name long_term_forecast \
--model_id GTM \
--data ETTm1 \
--root_path ./data/ETT/ \
--data_path ETTm1.csv \
--seq_len 96 \
--label_len 48 \
--pred_len 96 \
--d_model 768 \
--d_layers 12 \
--batch_size 32 \
--learning_rate 1e-4
Supported Tasks
GTM supports multiple time series analysis tasks without architectural modifications:
- Long-term Forecasting: Predict future values in a time series
- Imputation: Fill in missing values in a time series
- Anomaly Detection: Identify anomalous patterns in time series data
- Pre-training: Learn general representations from large-scale time series data
Each task leverages the same model architecture but with different configurations and loss functions.
Long-term Forecasting
Predict future values in a time series using historical data. The model takes a sequence of past values and predicts future values for a specified horizon.
Imputation
Fill in missing values in a time series. The model is trained to reconstruct missing values by randomly masking portions of the input sequence during training.
Anomaly Detection
Identify anomalous patterns in time series data. The model is trained to reconstruct normal patterns and uses reconstruction error to detect anomalies.
Pre-training
Learn general representations from large-scale time series data using a hybrid masking strategy that combines random and consecutive tail masking.
๐ Performance
GTM consistently outperforms state-of-the-art models across various benchmarks:
| Task | Dataset | GTM MSE | SOTA MSE | Improvement |
|---|---|---|---|---|
| Forecasting | ETTh1 | 0.404 | 0.411 | 1.7% |
| Forecasting | ETTm1 | 0.339 | 0.350 | 3.1% |
| Imputation | ETTh1 | 0.053 | 0.055 | 3.6% |
| Anomaly Detection | MSL | 82.53 | 81.92 | +0.61 |
๐ Citation
If you find this work useful, please cite our paper:
@inproceedings{he2026gtm,
title={GTM: A General Time-series Model for Enhanced Representation Learning of Time-Series Data},
author={He, Cheng and Huang, Xu and Jiang, Gangwei and Li, Zhaoyi and Lian, Defu and Xie, Hong and Chen, Enhong and Liang, Xijie and Zheng, Zengrong and Lee, Patrick P. C.},
booktitle={International Conference on Learning Representations},
year={2026}
}
๐ License
This project is licensed under the MIT License - see the LICENSE file for details.
Made with โค๏ธ by the GTM Team