A Deep Reinforcement Learning Approach for Portfolio Management in Non-Short-Selling Market

Bibliographic Details
Main Authors: Ruidan Su, Chun Chi, Shikui Tu, Lei Xu
Format: Article
Language: English
Published: Wiley 2024-01-01
Series: IET Signal Processing
Online Access: http://dx.doi.org/10.1049/2024/5399392
Description
Summary: Reinforcement learning (RL) has been applied to financial portfolio management in recent years. Current studies mostly focus on profit accumulation with little consideration of risk. Some risk-return balanced studies extract features from price and volume data only, which are highly correlated and lack any representation of risk features. To tackle these problems, we propose a weight control unit (WCU) to effectively manage the portfolio position under different market conditions. A loss penalty term is also added to the reward function to prevent sharp drawdowns during trading. Moreover, stock spatial interrelation, representing the correlation between two different stocks, is captured by a graph convolution network based on fundamental data. Temporal interrelation is captured by a temporal convolutional network based on new factors designed from price and volume data. Both spatial and temporal interrelation improve feature extraction from historical data and also make the model more interpretable. Finally, a deep deterministic policy gradient actor–critic RL algorithm is applied to explore the optimal policy in portfolio management. We evaluate our approach in a challenging non-short-selling market, and the experimental results show that our method outperforms state-of-the-art methods on both profit and risk criteria. Specifically, it achieves a 6.72% improvement in annualized rate of return, a 7.72% decrease in maximum drawdown, and a better annualized Sharpe ratio of 0.112. Also, the loss penalty and WCU offer new directions for future work in risk control.
ISSN: 1751-9683
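
Note on the reported metrics: the summary compares methods by annualized rate of return, maximum drawdown, and annualized Sharpe ratio. The sketch below shows the common textbook definitions of these three quantities, computed from a series of portfolio values. It is an illustration only, not the authors' evaluation code; the function names, the 252-period year, and the zero default risk-free rate are assumptions that do not come from the record.

```python
import math

TRADING_DAYS = 252  # assumption: number of trading periods per year


def annualized_return(values, periods_per_year=TRADING_DAYS):
    """Geometric annualized return from a series of portfolio values."""
    n_periods = len(values) - 1
    total_growth = values[-1] / values[0]
    return total_growth ** (periods_per_year / n_periods) - 1.0


def max_drawdown(values):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                    # running maximum so far
        worst = max(worst, (peak - v) / peak)  # drop relative to that maximum
    return worst


def annualized_sharpe(values, risk_free_rate=0.0, periods_per_year=TRADING_DAYS):
    """Annualized Sharpe ratio of per-period simple returns.

    Assumes at least two returns and non-constant returns (nonzero variance).
    """
    returns = [b / a - 1.0 for a, b in zip(values, values[1:])]
    rf_per_period = risk_free_rate / periods_per_year  # simple approximation
    excess = [r - rf_per_period for r in returns]
    mean = sum(excess) / len(excess)
    # Sample variance (divides by n - 1).
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(var) * math.sqrt(periods_per_year)
```

For example, for the value series [100, 110, 99, 120] the maximum drawdown is 0.1, the fall from the peak of 110 to 99. Papers differ on details such as the annualization factor and whether sample or population variance is used, so figures computed this way may not match the reported ones exactly.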