Fourier-mixed window attention for efficient and robust long sequence time-series forecasting

Bibliographic Details
Main Authors: Nhat Thanh Tran, Jack Xin
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-05-01
Series: Frontiers in Applied Mathematics and Statistics
Online Access: https://www.frontiersin.org/articles/10.3389/fams.2025.1600136/full
Description
Summary: We study a fast local-global window-based attention method to accelerate Informer for long sequence time-series forecasting (LSTF) in a robust manner. While window attention, being local, offers a considerable computational saving, it cannot capture global token information; this is compensated by a subsequent Fourier transform block. Our method, named FWin, does not rely on the query sparsity hypothesis or the empirical approximation underlying Informer's ProbSparse attention. Experiments on univariate and multivariate datasets show that FWin transformers improve the overall prediction accuracy of Informer while accelerating its inference speed by 1.6 to 2 times. On strongly non-stationary data (power grid and dengue disease data), FWin outperforms Informer and recent state-of-the-art methods, demonstrating its superior robustness. We give a mathematical definition of FWin attention and prove its equivalence to canonical full attention under a block diagonal invertibility (BDI) condition on the attention matrix. The BDI condition is experimentally verified to hold with high probability on benchmark datasets.
ISSN: 2297-4687
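
As a companion to the summary, here is a minimal PyTorch sketch of the mechanism it describes: non-overlapping window attention for local context, followed by a Fourier transform block that restores global token mixing. The names FWinBlock and window_attention, the FNet-style real-part 2D FFT, and the placement of the norms and MLP are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F


def window_attention(q, k, v, window: int):
    """Softmax attention restricted to non-overlapping windows along the
    sequence axis. Shapes: (batch, seq_len, dim); seq_len must be divisible
    by `window` in this sketch."""
    b, n, d = q.shape
    w = window
    # Reshape so attention is computed independently inside each window.
    q = q.reshape(b, n // w, w, d)
    k = k.reshape(b, n // w, w, d)
    v = v.reshape(b, n // w, w, d)
    scores = q @ k.transpose(-2, -1) / d ** 0.5      # (b, n/w, w, w)
    out = F.softmax(scores, dim=-1) @ v              # (b, n/w, w, d)
    return out.reshape(b, n, d)


class FWinBlock(nn.Module):
    """Window attention plus Fourier mixing, loosely following the summary;
    the exact norm/MLP layout is an assumption for illustration."""

    def __init__(self, dim: int, window: int):
        super().__init__()
        self.window = window
        self.qkv = nn.Linear(dim, 3 * dim)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(),
                                 nn.Linear(4 * dim, dim))

    def forward(self, x):
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Local mixing: attention confined to windows.
        x = self.norm1(x + window_attention(q, k, v, self.window))
        # Global mixing: FNet-style 2D FFT over sequence and feature axes,
        # keeping the real part (an assumption about the Fourier block).
        x = self.norm2(x + torch.fft.fft2(x).real)
        return x + self.mlp(x)


if __name__ == "__main__":
    x = torch.randn(2, 96, 64)          # (batch, seq_len=96, dim=64)
    y = FWinBlock(dim=64, window=24)(x)
    print(y.shape)                      # torch.Size([2, 96, 64])

Note that restricting attention to non-overlapping windows makes the attention matrix block diagonal, which is the structure the BDI condition in the summary concerns.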