A novel transformer-based dual attention architecture for the prediction of financial time series

Abstract Financial prediction has gained significant attention due to the complex and non-linear dynamics of the market. Transformers are a promising approach for generating accurate predictions, as their encoder-decoder structures efficiently capture complex temporal dependencies and patterns within large-scale data. However, relying on a single attention mechanism may limit the model’s ability to capture more intricate relationships. This paper proposes a dual attention architecture to improve the encoder-decoder framework for financial forecasting. First, the Price Attention Network (PAN) extracts complex features from price data and forecasts future prices using historical price inputs. Two key improvements are introduced to enhance self-attention: a Masked Self-Attention module focusing on the most relevant information and Multi-head Attention facilitating deeper insights into the data. Second, the Nonprice Attention Network (NAN) is proposed as a parallel network that processes related financial features. This network utilizes ConvLSTM, BiGRU, and Self-Attention to dynamically weigh and extract meaningful information from nonprice data. Finally, the PAN and NAN networks are integrated, enhancing prediction accuracy. The proposed approach outperforms five state-of-the-art models. Moreover, qualitative assessments on over 26 financial datasets, spanning large and small datasets with short and long histories, further validate the proposed model’s ability. Evaluations using seven metrics show the model’s superiority, achieving a Mean Absolute Error (MAE) of 0.01991, Mean Squared Error (MSE) of 0.00084, Mean Pinball Loss (MPL) of 0.00996, Symmetric Mean Absolute Percentage Error (SMAPE) of 3.03324, and Mean Absolute Scaled Error (MASE) of 1.85436. This framework represents a significant advancement in financial prediction, offering accurate and interpretable forecasts across various time series tasks.

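The abstract above names a Masked Self-Attention module as one of the two improvements in the PAN branch. As a rough illustration only (not the authors' implementation), a single-head causal self-attention without learned projections can be sketched in plain Python; a real layer would add learned Q/K/V weight matrices and multiple heads:

```python
import math

def softmax(row):
    # Numerically stable softmax over one score row; exp(-inf) -> 0.
    m = max(row)
    e = [math.exp(x - m) for x in row]
    s = sum(e)
    return [x / s for x in e]

def masked_self_attention(X):
    """Causal (masked) self-attention over a sequence X of d-dim vectors.

    For brevity Q = K = V = X (no learned projections, single head).
    The causal mask lets position t attend only to positions s <= t,
    so each output is a convex combination of past-and-present inputs.
    """
    T, d = len(X), len(X[0])
    out = []
    for t in range(T):
        scores = []
        for s in range(T):
            if s <= t:
                dot = sum(X[t][k] * X[s][k] for k in range(d))
                scores.append(dot / math.sqrt(d))  # scaled dot-product
            else:
                scores.append(float("-inf"))       # masked-out future step
        w = softmax(scores)
        out.append([sum(w[s] * X[s][k] for s in range(T)) for k in range(d)])
    return out
```

Because of the mask, the first output equals the first input (it can only attend to itself), and later outputs blend earlier timesteps with weights that sum to 1.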

Bibliographic Details
Main Authors: Anita Hadizadeh, Mohammad Jafar Tarokh, Majid Mirzaee Ghazani
Format: Article
Language: English
Published: Springer 2025-06-01
Series: Journal of King Saud University: Computer and Information Sciences
Subjects:
Online Access: https://doi.org/10.1007/s44443-025-00045-y
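The abstract reports five error metrics (MAE, MSE, MPL, SMAPE, MASE). Assuming the standard definitions of these metrics (the paper may use variants, e.g. a different pinball quantile or SMAPE scaling), they can be sketched in plain Python as:

```python
def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean Squared Error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_pinball_loss(y_true, y_pred, q=0.5):
    """Mean Pinball (quantile) Loss; at q=0.5 it equals half the MAE."""
    return sum(max(q * (t - p), (q - 1) * (t - p))
               for t, p in zip(y_true, y_pred)) / len(y_true)

def smape(y_true, y_pred):
    """Symmetric MAPE in percent (0/0 terms treated as 0)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        total += 2 * abs(t - p) / denom if denom else 0.0
    return 100 * total / len(y_true)

def mase(y_true, y_pred, y_train):
    """MAE scaled by the in-sample MAE of the one-step naive forecast."""
    scale = sum(abs(y_train[i] - y_train[i - 1])
                for i in range(1, len(y_train))) / (len(y_train) - 1)
    return mae(y_true, y_pred) / scale
```

Note that the reported MPL (0.00996) is almost exactly half the reported MAE (0.01991), which is consistent with a median (q = 0.5) pinball loss under these definitions.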
_version_ 1849331552845037568
author Anita Hadizadeh
Mohammad Jafar Tarokh
Majid Mirzaee Ghazani
author_facet Anita Hadizadeh
Mohammad Jafar Tarokh
Majid Mirzaee Ghazani
author_sort Anita Hadizadeh
collection DOAJ
description Abstract Financial prediction has gained significant attention due to the complex and non-linear dynamics of the market. Transformers are a promising approach for generating accurate predictions, as their encoder-decoder structures efficiently capture complex temporal dependencies and patterns within large-scale data. However, relying on a single attention mechanism may limit the model’s ability to capture more intricate relationships. This paper proposes a dual attention architecture to improve the encoder-decoder framework for financial forecasting. First, the Price Attention Network (PAN) extracts complex features from price data and forecasts future prices using historical price inputs. Two key improvements are introduced to enhance self-attention: a Masked Self-Attention module focusing on the most relevant information and Multi-head Attention facilitating deeper insights into the data. Second, the Nonprice Attention Network (NAN) is proposed as a parallel network that processes related financial features. This network utilizes ConvLSTM, BiGRU, and Self-Attention to dynamically weigh and extract meaningful information from nonprice data. Finally, the PAN and NAN networks are integrated, enhancing prediction accuracy. The proposed approach outperforms five state-of-the-art models. Moreover, qualitative assessments on over 26 financial datasets, spanning large and small datasets with short and long histories, further validate the proposed model’s ability. Evaluations using seven metrics show the model’s superiority, achieving a Mean Absolute Error (MAE) of 0.01991, Mean Squared Error (MSE) of 0.00084, Mean Pinball Loss (MPL) of 0.00996, Symmetric Mean Absolute Percentage Error (SMAPE) of 3.03324, and Mean Absolute Scaled Error (MASE) of 1.85436. This framework represents a significant advancement in financial prediction, offering accurate and interpretable forecasts across various time series tasks.
format Article
id doaj-art-45c5c82b1a1a4e9d839f977c6305de76
institution Kabale University
issn 1319-1578
2213-1248
language English
publishDate 2025-06-01
publisher Springer
record_format Article
series Journal of King Saud University: Computer and Information Sciences
spelling doaj-art-45c5c82b1a1a4e9d839f977c6305de76
2025-08-20T03:46:29Z
eng
Springer
Journal of King Saud University: Computer and Information Sciences
1319-1578
2213-1248
2025-06-01
Vol. 37, No. 5, pp. 1-31
10.1007/s44443-025-00045-y
A novel transformer-based dual attention architecture for the prediction of financial time series
Anita Hadizadeh (Department of Industrial Engineering, K. N. Toosi University of Technology)
Mohammad Jafar Tarokh (Department of Industrial Engineering, K. N. Toosi University of Technology)
Majid Mirzaee Ghazani (Department of Industrial Engineering, K. N. Toosi University of Technology)
https://doi.org/10.1007/s44443-025-00045-y
Financial market prediction
Transformer
Bidirectional Gated Recurrent Units
ConvLSTM
spellingShingle Anita Hadizadeh
Mohammad Jafar Tarokh
Majid Mirzaee Ghazani
A novel transformer-based dual attention architecture for the prediction of financial time series
Journal of King Saud University: Computer and Information Sciences
Financial market prediction
Transformer
Bidirectional Gated Recurrent Units
ConvLSTM
title A novel transformer-based dual attention architecture for the prediction of financial time series
title_full A novel transformer-based dual attention architecture for the prediction of financial time series
title_fullStr A novel transformer-based dual attention architecture for the prediction of financial time series
title_full_unstemmed A novel transformer-based dual attention architecture for the prediction of financial time series
title_short A novel transformer-based dual attention architecture for the prediction of financial time series
title_sort novel transformer based dual attention architecture for the prediction of financial time series
topic Financial market prediction
Transformer
Bidirectional Gated Recurrent Units
ConvLSTM
url https://doi.org/10.1007/s44443-025-00045-y
work_keys_str_mv AT anitahadizadeh anoveltransformerbaseddualattentionarchitectureforthepredictionoffinancialtimeseries
AT mohammadjafartarokh anoveltransformerbaseddualattentionarchitectureforthepredictionoffinancialtimeseries
AT majidmirzaeeghazani anoveltransformerbaseddualattentionarchitectureforthepredictionoffinancialtimeseries
AT anitahadizadeh noveltransformerbaseddualattentionarchitectureforthepredictionoffinancialtimeseries
AT mohammadjafartarokh noveltransformerbaseddualattentionarchitectureforthepredictionoffinancialtimeseries
AT majidmirzaeeghazani noveltransformerbaseddualattentionarchitectureforthepredictionoffinancialtimeseries