An Analytic Policy Gradient-Based Deep Reinforcement Learning Motion Cueing Algorithm for Driving Simulators

Bibliographic Details
Main Authors: Xiaowei Huang, Xuhua Shi, Peiyao Wang, Hongzan Xu, Xiaojun Tang, Gaoran Zhang
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects: Motion cueing algorithm; Driving simulator; Analytic policy gradient (APG); Differentiable simulator; Deep reinforcement learning
Online Access:https://ieeexplore.ieee.org/document/10978024/
_version_ 1850031877904138240
author Xiaowei Huang
Xuhua Shi
Peiyao Wang
Hongzan Xu
Xiaojun Tang
Gaoran Zhang
author_facet Xiaowei Huang
Xuhua Shi
Peiyao Wang
Hongzan Xu
Xiaojun Tang
Gaoran Zhang
author_sort Xiaowei Huang
collection DOAJ
description The proposed motion cueing algorithm (MCA) is based on a reinforcement learning algorithm that uses gradient information to directly update the control policy, and it introduces three significant enhancements. First, the complex simulator environment is transformed into a differentiable simulator environment that provides gradient information at each time step, and this gradient information is used to directly update the control policy. Second, the network architecture is reconfigured into a concurrent controller format, similar to Model Predictive Control (MPC). This controller processes a sequence of vehicle motion reference signals over a future period, using a multi-layer perceptron to generate the simulator's motion reference control signal sequences for the same duration. Unlike the online optimization employed in MPC, this algorithm operates as an offline optimization method, providing substantial computational advantages when integrated into the driving simulator. As the prediction horizon increases, the algorithm demonstrates superior computational efficiency, which helps reduce the incidence of motion sickness during use of the driving simulator. Third, a loss function specifically designed for the motion simulator is proposed. This function incorporates constraints derived from the MPC framework to address workspace limitations and applies them to workspace management. These constraints restrict the platform's acceleration and velocity near the workspace boundaries, allowing better utilization of the available space. The algorithm is validated using the CARLA autonomous driving simulation software as the dataset generator. During training, the proposed algorithm achieves an order-of-magnitude improvement in convergence speed over conventional training methods such as PPO and DDPG. Simulations with a 10-step prediction horizon indicate that the root mean square error (RMSE) produced by this algorithm is comparable to that of the MPC-based MCA (MPC-MCA) and significantly lower than that of the classical-washout-based MCA (CW-MCA). At longer prediction horizons, the algorithm achieves performance on par with state-of-the-art MPC-based motion cueing algorithms while exhibiting reduced algorithmic delay. Additionally, the proposed algorithm delivers results more quickly and tracks more accurately across all prediction horizons, ultimately surpassing the current state-of-the-art MPC-MCA.
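To make the abstract's three enhancements concrete, the following is a minimal sketch of the analytic-policy-gradient idea in PyTorch, assuming a toy one-degree-of-freedom double-integrator platform in place of the paper's simulator model. The 10-step horizon matches the paper's experiment, but the network sizes, time step, workspace limit, boundary-penalty form, and all identifiers (HorizonPolicy, rollout_loss, etc.) are illustrative assumptions, not the authors' implementation.

```python
# Sketch of analytic policy gradients through a differentiable simulator.
# Assumed toy model: 1-DoF double-integrator platform; real simulator
# dynamics, signal scaling, and constraint weights are not reproduced.
import torch
import torch.nn as nn

H = 10            # prediction horizon in steps (as in the 10-step experiment)
DT = 0.01         # integration step in seconds (assumed)
POS_LIMIT = 0.5   # workspace half-width in metres (assumed)

class HorizonPolicy(nn.Module):
    """MLP mapping a future reference-acceleration sequence (plus the
    current platform state) to a control sequence of the same length,
    mimicking the MPC-like concurrent-controller structure."""
    def __init__(self, horizon: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(horizon + 2, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, horizon),
        )

    def forward(self, ref_seq, pos, vel):
        x = torch.cat([ref_seq, pos, vel], dim=-1)
        return self.net(x)

def rollout_loss(policy, ref_seq, pos, vel):
    """Differentiable rollout: track the reference acceleration while
    penalising velocity and acceleration near the workspace boundary
    (a simple assumed stand-in for the paper's MPC-derived constraints)."""
    acc_seq = policy(ref_seq, pos, vel)
    p, v = pos.squeeze(-1), vel.squeeze(-1)
    loss = torch.zeros(())
    for k in range(H):
        a = acc_seq[:, k]
        # Tracking term: platform acceleration vs. vehicle reference.
        loss = loss + ((a - ref_seq[:, k]) ** 2).mean()
        # Boundary term: grows as |p| approaches the workspace limit,
        # discouraging speed and acceleration near the edges.
        proximity = (p.abs() / POS_LIMIT).clamp(max=0.999)
        loss = loss + 0.1 * (proximity ** 2 * (v ** 2 + a ** 2)).mean()
        # Differentiable double-integrator dynamics keep the whole
        # rollout inside the autograd graph.
        v = v + a * DT
        p = p + v * DT
    return loss / H

policy = HorizonPolicy(H)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for step in range(200):                # offline training, unlike MPC's online solve
    ref = torch.randn(64, H)           # stand-in for CARLA-generated reference data
    pos = torch.zeros(64, 1)
    vel = torch.zeros(64, 1)
    opt.zero_grad()
    loss = rollout_loss(policy, ref, pos, vel)
    loss.backward()                    # analytic policy gradient through the rollout
    opt.step()
```

Because the rollout is differentiable end to end, loss.backward() yields exact gradients of the horizon cost with respect to the policy weights; this is what distinguishes the approach from the sampled-gradient updates of PPO and DDPG and underlies the reported convergence-speed advantage.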
format Article
id doaj-art-dea12f7c1c7845df8ddfff01c0e46757
institution DOAJ
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-dea12f7c1c7845df8ddfff01c0e46757 (indexed 2025-08-20T02:58:51Z)
Language: English. Publisher: IEEE. Series: IEEE Access, ISSN 2169-3536, vol. 13, 2025-01-01, pp. 81507-81523. DOI: 10.1109/ACCESS.2025.3564597. IEEE document ID: 10978024.
Title: An Analytic Policy Gradient-Based Deep Reinforcement Learning Motion Cueing Algorithm for Driving Simulators
Authors: Xiaowei Huang (https://orcid.org/0009-0009-1480-8995), Xuhua Shi (https://orcid.org/0000-0002-3012-7913), Peiyao Wang, Hongzan Xu, Xiaojun Tang, Gaoran Zhang
Affiliations: Xiaowei Huang, Xuhua Shi, Peiyao Wang: Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, China. Hongzan Xu, Xiaojun Tang, Gaoran Zhang: Zeekr Automobile (Ningbo Hangzhou Bay New Zone) Company Ltd., Ningbo, China.
Abstract: as given in the description field above.
Online access: https://ieeexplore.ieee.org/document/10978024/
Keywords: Motion cueing algorithm; driving simulator; analytic policy gradient (APG); differentiable simulator; deep reinforcement learning
spellingShingle Xiaowei Huang
Xuhua Shi
Peiyao Wang
Hongzan Xu
Xiaojun Tang
Gaoran Zhang
An Analytic Policy Gradient-Based Deep Reinforcement Learning Motion Cueing Algorithm for Driving Simulators
IEEE Access
Motion cueing algorithm
driving simulator
analytic policy gradient (APG)
differentiable simulator
deep reinforcement learning
title An Analytic Policy Gradient-Based Deep Reinforcement Learning Motion Cueing Algorithm for Driving Simulators
title_full An Analytic Policy Gradient-Based Deep Reinforcement Learning Motion Cueing Algorithm for Driving Simulators
title_fullStr An Analytic Policy Gradient-Based Deep Reinforcement Learning Motion Cueing Algorithm for Driving Simulators
title_full_unstemmed An Analytic Policy Gradient-Based Deep Reinforcement Learning Motion Cueing Algorithm for Driving Simulators
title_short An Analytic Policy Gradient-Based Deep Reinforcement Learning Motion Cueing Algorithm for Driving Simulators
title_sort analytic policy gradient based deep reinforcement learning motion cueing algorithm for driving simulators
topic Motion cueing algorithm
driving simulator
analytic policy gradient (APG)
differentiable simulator
deep reinforcement learning
url https://ieeexplore.ieee.org/document/10978024/
work_keys_str_mv AT xiaoweihuang ananalyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT xuhuashi ananalyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT peiyaowang ananalyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT hongzanxu ananalyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT xiaojuntang ananalyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT gaoranzhang ananalyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT xiaoweihuang analyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT xuhuashi analyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT peiyaowang analyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT hongzanxu analyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT xiaojuntang analyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators
AT gaoranzhang analyticpolicygradientbaseddeepreinforcementlearningmotioncueingalgorithmfordrivingsimulators