March 3, 2026

RootPurge

Characteristic Root Analysis and Regularization for Linear Time Series Forecasting

Zheng Wang✉, Kaixuan Zhang, Wanfang Chen, Xiaonan Lu, Longyuan Li, Tobias Schlagenhauf

Bosch (China) Investment Co., Ltd. & Robert Bosch GmbH

(✉) Corresponding author: david.wang3@cn.bosch.com

Accepted to ICLR 2026!

Updates

🚩 Jan. 2026: Our paper is accepted to ICLR 2026!

🚩 Sep. 2025: We released our paper on arXiv. Code is available at GitHub.

Introduction

Time series forecasting is critical across domains like weather, energy, and finance, yet no single model dominates all settings. Recent studies show that simple linear models can match or outperform complex deep learning architectures, raising the question: what makes linear models work so well, and how can we make them even better?

This paper provides a systematic theoretical analysis of linear time series models through the lens of characteristic roots—the fundamental quantities governing temporal dynamics in linear systems (trends, oscillations, decay). We show that:

  1. In noise-free settings, characteristic roots fully determine a model's expressive power. Common practices like instance normalization and channel-independent modeling naturally arise from this framework.
  2. In noisy settings, models learn spurious roots from noise, and overcoming this requires disproportionately large training data (a key data-scaling property: the convergence rate is only $\mathcal{O}(1/\sqrt{T})$, so halving the error takes roughly four times as much data).
  3. This motivates structural regularization to suppress spurious dynamics without relying on massive datasets.
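
As a minimal illustration of what characteristic roots encode, the sketch below computes them for a linear recurrence x_t = a_1 x_{t-1} + ... + a_p x_{t-p}. The function name `characteristic_roots` and the example coefficients are assumptions chosen for illustration; they are not taken from the repository.

```python
import numpy as np

def characteristic_roots(coeffs):
    """Roots of z^p - a_1 z^(p-1) - ... - a_p for the recurrence
    x_t = a_1 x_{t-1} + ... + a_p x_{t-p} (illustrative helper)."""
    coeffs = np.asarray(coeffs, dtype=float)
    # np.roots expects coefficients ordered from highest to lowest degree.
    return np.roots(np.concatenate(([1.0], -coeffs)))

# A damped oscillation: x_t = 2 r cos(w) x_{t-1} - r^2 x_{t-2}
# has the complex-conjugate root pair r * exp(+/- i w).
r, w = 0.9, np.pi / 6
roots = characteristic_roots([2 * r * np.cos(w), -r**2])

print(np.abs(roots))            # moduli: per-step decay factor
print(np.abs(np.angle(roots)))  # arguments: angular frequency per step
```

A root modulus below 1 corresponds to a decaying component, a modulus of 1 to a persistent one, and a non-zero argument to an oscillation at that angular frequency.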

Structure of the paper and its main contributions.

Method

We propose two complementary strategies for robust root restructuring to suppress noise-induced spurious dynamics:

1. Rank Reduction Methods

Since noise inflates the effective rank of learned weight matrices, enforcing low-rank structure filters out noise-dominated components while preserving the true signal subspace.

  • Reduced-Rank Regression (RRR): Computes the OLS solution, then projects the forecast outputs onto a lower-dimensional subspace via truncated SVD. Provides a closed-form solution with easy rank adjustment—no retraining needed.
  • Direct Weight Rank Reduction (DWRR): Applies truncated SVD directly to the trained weight matrix as a post-processing step. Computationally efficient and applicable to any pre-trained linear model.
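
The two rank-reduction steps described above can be sketched in NumPy as follows. This is a simplified illustration under stated assumptions, not the code in Rank_Reduction/: the function names are hypothetical, the model is a bias-free linear map from lookback windows `X` (n × L) to horizons `Y` (n × H), and RRR is written in its classical unweighted form.

```python
import numpy as np

def rrr_weights(X, Y, rank):
    """Reduced-rank regression sketch: OLS fit, then project onto the
    top right-singular directions of the fitted outputs."""
    W_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
    # Right singular vectors of the fitted outputs X @ W_ols.
    _, _, Vt = np.linalg.svd(X @ W_ols, full_matrices=False)
    V_r = Vt[:rank].T
    # Changing `rank` only changes this projection; no refit is needed.
    return W_ols @ V_r @ V_r.T

def dwrr_weights(W, rank):
    """Direct weight rank reduction sketch: truncated SVD of a trained W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

# Synthetic check: rank-2 ground truth observed with additive noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))
W_true = rng.standard_normal((16, 2)) @ rng.standard_normal((2, 8))
Y = X @ W_true + 0.1 * rng.standard_normal((200, 8))

W_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)
W_rrr = rrr_weights(X, Y, rank=2)
W_dwrr = dwrr_weights(W_ols, rank=2)
print(W_rrr.shape, W_dwrr.shape)  # both (16, 8)
```

In this sketch RRR uses the training inputs to choose the subspace, while DWRR only needs the weight matrix, which is why it can be applied to an already trained linear model.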

2. Root Purge (Novel Adaptive Regularization)

Root Purge is a training-integrated method that adaptively learns a noise-suppressing null space. The loss consists of two terms:

  • Root-seeking term: Standard prediction loss that fits the underlying signal dynamics.
  • Root-purging term: Feeds the residual (estimated noise) back through the model and penalizes non-zero output, encouraging the model to map noise into its null space.

Intuition: By the rank-nullity theorem, expanding the null space reduces the rank, so the model denoises adaptively during training. Root Purge works in both the time and frequency domains and has a single hyperparameter (λ); performance is robust across a wide range of λ values.
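
A rough sketch of the two-term loss, written in NumPy for a bias-free linear model in the time domain. Several details here are assumptions rather than facts about the repository: the function name `root_purge_loss`, the use of mean squared error for both terms, and the simplification that the lookback length equals the horizon so the residual can be fed back through the same weight matrix. The actual implementation in RootPurge/ may handle shapes, gradients, and the frequency-domain variant differently.

```python
import numpy as np

def root_purge_loss(W, X, Y, lam):
    """Illustrative two-term loss for a linear map Y_hat = X @ W.

    Assumes W is square (lookback length == horizon) so that the
    residual has the same shape as a model input."""
    residual = Y - X @ W                    # estimated noise
    root_seeking = np.mean(residual ** 2)   # standard prediction loss
    # Feed the residual back through the model and penalize any
    # non-zero output, pushing noise toward the null space of W.
    root_purging = np.mean((residual @ W) ** 2)
    return root_seeking + lam * root_purging

# Tiny worked example with hand-checkable numbers.
X = np.eye(3)
Y = np.eye(3)
W = 0.5 * np.eye(3)
print(root_purge_loss(W, X, Y, lam=2.0))
```

With `lam=0` the expression reduces to the ordinary prediction loss. In training, the same expression would be written in an autograd framework such as PyTorch so that it can be minimized by gradient descent.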

Main Results

Both RRR and Root Purge consistently outperform state-of-the-art baselines across standard benchmarks, with Root Purge achieving 13 first-place finishes out of 24 settings and RRR achieving 9.

Forecasting MSE for horizon H ∈ {96, 192, 336, 720} with lookback window L = 720:

Both methods are especially effective on smaller datasets, where models that rely solely on data scaling tend to underperform.

Key Properties

Singular Value Shrinkage

Root Purge progressively shrinks small singular values (noise-related) while preserving large ones (signal-related), achieving implicit rank reduction during training.
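
One simple way to observe this effect is to inspect the singular value spectrum of the learned weight matrix. The helpers below are an illustrative sketch; their names and the threshold `rel_tol` are assumptions, not part of the repository.

```python
import numpy as np

def normalized_singular_values(W):
    """Singular values of W in descending order, divided by the largest."""
    s = np.linalg.svd(W, compute_uv=False)
    return s if s[0] == 0 else s / s[0]

def effective_rank(W, rel_tol=1e-2):
    """Count singular values above rel_tol times the largest one
    (rel_tol is a hypothetical threshold chosen for illustration)."""
    return int(np.sum(normalized_singular_values(W) > rel_tol))

# One tiny singular value stands in for a noise-dominated direction.
W = np.diag([4.0, 2.0, 0.004])
print(normalized_singular_values(W))  # [1.0, 0.5, 0.001]
print(effective_rank(W))              # 2
```

Calling such a helper on weight checkpoints saved during training would show whether the small singular values shrink while the large ones stay in place.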

Data Scaling & Noise Robustness

On synthetic data, RRR and Root Purge maintain strong performance even with limited training data or high noise levels, while baseline models degrade significantly.

Characteristic Root Recovery

In controlled synthetic experiments, the roots estimated by RRR and Root Purge are significantly closer to the ground-truth characteristic roots than those estimated by standard OLS:

| Model | Root Distance to Ground Truth (mean ± std) |
|---|---|
| RRR | 0.036 ± 0.014 |
| Root Purge | 0.045 ± 0.009 |
| Standard Linear (OLS) | 0.064 ± 0.025 |
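
This README does not define the distance metric. As one plausible choice, the sketch below averages, over the ground-truth roots, the distance in the complex plane to the nearest estimated root. Both the function name `root_distance` and the metric itself are assumptions and may differ from what the paper uses.

```python
import numpy as np

def root_distance(estimated, truth):
    """Mean distance from each ground-truth root to its nearest
    estimated root in the complex plane (illustrative metric)."""
    estimated = np.asarray(estimated, dtype=complex)
    truth = np.asarray(truth, dtype=complex)
    # Pairwise distances: rows are true roots, columns are estimates.
    pairwise = np.abs(truth[:, None] - estimated[None, :])
    return float(pairwise.min(axis=1).mean())

truth = [1.0, 0.5j]
estimated = [1.1, 0.5j, 3.0]  # the extra root at 3.0 is ignored by this metric
print(root_distance(estimated, truth))
```

Because spurious extra roots do not affect this particular metric, a symmetric variant or an optimal one-to-one matching would be a stricter alternative.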

Visualization of roots shows similar results:

Hyperparameter Robustness

Root Purge improves performance across a wide range of λ values, making it easy to tune in practice.

Getting Started

Environment Setup

Create the conda environment and install dependencies (recommended):

conda create -n timeseries python=3.9 -y
conda activate timeseries
conda install pytorch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 pytorch-cuda=12.1 cuda-version=12.4 -c pytorch -c nvidia -y
conda install numpy=1.26.0 -y
conda install timm=1.0.12 -c conda-forge -y
pip install -r requirements.txt

Project Structure (high-level)

  • RootPurge/ — core RootPurge implementation, model backbones, and utilities.
  • Rank_Reduction/ — code for post-training rank reduction methods, including both RRR and DWRR.

Running experiments

  • Reproduce the rank-reduction experiments (examples) by running the scripts in Rank_Reduction/:

    • python run_RRR.py — run post-train Rank Reduction experiments.
    • python run_DWRR.py — run the DWRR variant experiments.
  • Shell-run scripts for common experiment setups are in run_scripts/:

    • e.g. run_scripts/run_rootpurge_speclin_logC.sh and other specialized runners.

Examples (from project root):

# simple run (adjust args inside the script or call python files directly)
cd Rank_Reduction
python run_RRR.py
python run_DWRR.py

# run a prepared shell script (start again from the project root)
# You may need to prepend the Python path, as shown in the example scripts
cd RootPurge
bash run_scripts/run_rootpurge_speclin_logC.sh

Contacts

If you find issues or want to contribute, please open an issue or a pull request on the repository. For direct questions, please contact Zheng Wang at david.wang3@cn.bosch.com.

Acknowledgement

This work builds upon insights from classical linear systems theory and modern time series forecasting. We gratefully acknowledge the open-source datasets and codebases used in our experiments, including Time-Series-Library, ETT, PatchTST, DLinear, FITS, SparseTSF, and FilterNet.

Citation

If you find this repo useful in your research, please consider citing our paper as follows:

@inproceedings{
wang2026characteristic,
title={Characteristic Root Analysis and Regularization for Linear Time Series Forecasting},
author={Zheng Wang and Kaixuan Zhang and Wanfang Chen and Xiaonan Lu and Longyuan Li and Tobias Schlagenhauf},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://arxiv.org/abs/2509.23597}
}