COSA: Context-aware Output-Space Adapter for Test-Time Adaptation in Time Series Forecasting

March 4, 2026

ICLR 2026 Paper · License: MIT

Official implementation of COSA (Context-aware Output-Space Adapter), accepted at ICLR 2026.

Abstract

Test-time adaptation (TTA) enables pre-trained time series forecasting models to adapt to evolving data distributions without accessing the original training data. While recent work introduces separate input and output adapters for online refinement, such dual-adapter designs can overfit to transient patterns and slow down inference. We propose COSA, a lightweight Context-aware Output-Space Adapter that refines predictions with a single, streamlined module. COSA applies a linear residual correction controlled by a learnable gating mechanism, incorporating only recently observed ground-truth statistics as context. Extensive experiments on six benchmark datasets across six forecasting architectures show that COSA consistently improves accuracy by 13.91–17.03% over non-adaptive baselines and 10.48–13.05% over current state-of-the-art TTA methods, while achieving 88.59–90.10% faster inference.

Overview

Key Contributions

  • Architecture-agnostic Output Adapter: A single output-space adapter that works with any base forecasting model without modification
  • Context-aware Linear Residual: Leverages recent ground-truth statistics to compute adaptive corrections
  • Learnable Gating Mechanism: Controls the magnitude of corrections via tanh(g) gating for stable adaptation
  • Fast & Efficient: Achieves 88.59–90.10% faster inference compared to dual-adapter TTA methods

Methodology

COSA refines the base model prediction Y^(0) through:

Ŷ_t = Y^(0)_t + tanh(g) · H_t

where:

  • H_t = W · X^(a)_t + b: Linear transformation of augmented input
  • X^(a)_t = [Y^(0)_t; C_t]: Concatenation of base prediction and context vector
  • C_t: Context vector from recent ground-truth statistics
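  • tanh(g): Gate computed from the learnable parameter g; because tanh is bounded in (−1, 1), the applied correction never exceeds H_t in magnitude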
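The formula above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the repository's implementation: the class name `OutputAdapter`, the helper `context_stats`, the choice of mean and standard deviation as the context, the per-variable vector shapes, the initialization, and the plain SGD update are all assumptions made for the example.

```python
import numpy as np


def context_stats(recent_truth):
    """Summarise a buffer of recent ground truth as [mean, std] (assumed context)."""
    recent_truth = np.asarray(recent_truth, dtype=float)
    return np.array([recent_truth.mean(), recent_truth.std()])


class OutputAdapter:
    """Gated linear residual: y_hat = y0 + tanh(g) * (W @ [y0; c] + b)."""

    def __init__(self, pred_len, ctx_dim, seed=0):
        rng = np.random.default_rng(seed)
        # Assumed initialization: small random W so the gate receives a
        # gradient, and g = 0 so the adapter starts as an identity map.
        self.W = 0.01 * rng.standard_normal((pred_len, pred_len + ctx_dim))
        self.b = np.zeros(pred_len)
        self.g = 0.0

    def forward(self, y0, ctx):
        x_aug = np.concatenate([y0, ctx])   # X^(a) = [Y^(0); C]
        h = self.W @ x_aug + self.b         # H = W X^(a) + b
        return y0 + np.tanh(self.g) * h, x_aug, h

    def sgd_step(self, y0, ctx, y_true, lr=0.1):
        """One gradient step on the MSE; returns the loss before the update."""
        y_hat, x_aug, h = self.forward(y0, ctx)
        gate = np.tanh(self.g)
        d_yhat = 2.0 * (y_hat - y_true) / y_true.size   # dL/dy_hat
        grad_W = gate * np.outer(d_yhat, x_aug)
        grad_b = gate * d_yhat
        grad_g = (1.0 - gate ** 2) * float(d_yhat @ h)  # d tanh(g)/dg = 1 - tanh(g)^2
        self.W -= lr * grad_W
        self.b -= lr * grad_b
        self.g -= lr * grad_g
        return float(np.mean((y_hat - y_true) ** 2))


# Toy usage: the base forecast is biased by 0.5 relative to the ground truth.
adapter = OutputAdapter(pred_len=4, ctx_dim=2)
y0 = np.ones(4)
ctx = context_stats([1.4, 1.5, 1.6])
losses = [adapter.sgd_step(y0, ctx, y0 + 0.5) for _ in range(300)]
print(losses[0], losses[-1])
```

With g = 0 the gate is exactly zero, so the first forward pass returns the base prediction unchanged. Corrections are introduced only as g moves away from zero during adaptation.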

Additional Components

  • PAAS (Periodicity-Aware Adaptive Scheduling): Dynamically adjusts batch size based on detected periodicity using FFT
  • CALR (Cosine-Adaptive Learning Rate): Adaptive learning rate schedule for stable online adaptation
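As a rough illustration of the FFT-based periodicity detection that PAAS relies on, the sketch below finds the dominant period of a series with `numpy.fft.rfft`. The function names and the rule mapping a period to a batch size (one period per batch, clipped to a range) are hypothetical; the actual scheduling logic in this repository may differ.

```python
import numpy as np


def dominant_period(series):
    """Return the period (in time steps) of the strongest non-DC frequency."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()                      # remove the DC component
    amplitude = np.abs(np.fft.rfft(x))
    amplitude[0] = 0.0                    # ignore any residual DC term
    k = int(np.argmax(amplitude))         # index of the dominant frequency bin
    if k == 0:                            # constant series: no periodicity found
        return len(x)
    return int(round(len(x) / k))


def batch_size_from_period(period, min_size=1, max_size=96):
    """Hypothetical rule: use one detected period per batch, within bounds."""
    return int(min(max(period, min_size), max_size))


t = np.arange(240)
hourly_like = np.sin(2 * np.pi * t / 24)  # synthetic signal with a 24-step cycle
period = dominant_period(hourly_like)
print(period, batch_size_from_period(period))
```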

Requirements

pip install -r requirements.txt
  • Python >= 3.8
  • PyTorch >= 1.10
  • CUDA (recommended for GPU acceleration)

Datasets

Download and place datasets in ./datasets/:

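| Dataset       | Frequency | Variables | Train/Val/Test    |
|---------------|-----------|-----------|-------------------|
| ETTh1         | Hourly    | 7         | 8545/2881/2881    |
| ETTh2         | Hourly    | 7         | 8545/2881/2881    |
| ETTm1         | 15-min    | 7         | 34465/11521/11521 |
| ETTm2         | 15-min    | 7         | 34465/11521/11521 |
| Exchange Rate | Daily     | 8         | 5120/665/1422     |
| Weather       | 10-min    | 21        | 36792/5271/10540  |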

Supported Models

COSA is compatible with various forecasting architectures:

  • iTransformer
  • PatchTST
  • DLinear
  • OLS
  • FreTS
  • MICN

Usage

Training Base Models

bash scripts/train.sh

Running COSA (Test-Time Adaptation)

bash scripts/cosa.sh

Running Specific Experiments

python main.py \
  DATA.NAME ETTh1 \
  DATA.PRED_LEN 96 \
  MODEL.NAME iTransformer \
  TRAIN.ENABLE False \
  TRAIN.CHECKPOINT_DIR ./checkpoints/iTransformer/ETTh1_96/ \
  TTA.ENABLE True \
  TTA.SIMPLE.BATCH_SIZE 48 \
  TTA.SIMPLE.STEPS 3 \
  TTA.SIMPLE.BUFFER_CONTEXT_SIZE 10

Results

COSA achieves consistent improvements across all datasets and models:

| Method | vs. Baseline    | vs. SOTA TTA    | Inference Speed |
|--------|-----------------|-----------------|-----------------|
| COSA   | 13.91–17.03% ↓  | 10.48–13.05% ↓  | 88.59–90.10% ↑  |

↓ marks a reduction in forecasting error (MSE/MAE, lower is better); ↑ marks faster inference (higher is better).
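For readers who want to reproduce this kind of percentage from their own runs, a relative error reduction is conventionally computed as shown below. This assumes the standard definition (baseline error minus adapted error, divided by baseline error); the helper name is hypothetical and the numbers are made up for illustration, not taken from the paper.

```python
def relative_reduction(baseline_error, adapted_error):
    """Percentage by which adapted_error is lower than baseline_error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - adapted_error) / baseline_error


# Made-up numbers for illustration only; they are not results from the paper.
print(round(relative_reduction(0.50, 0.425), 2))
```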

Citation

@inproceedings{
im2026cosa,
title={{COSA}: Context-aware Output-Space Adapter for Test-Time Adaptation in Time Series Forecasting},
author={Jeonghwan Im and Hyuk-Yoon Kwon},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=L7Z5wBMPrW}
}

License

This project is licensed under the MIT License. For commercial use, permission is required.

Acknowledgements

Please provide proper attribution if you use our codebase.