Dual-Priority Delayed Deep Double Q-Network (DPD3QN): A Dueling Double Deep Q-Network with Dual-Priority Experience Replay for Autonomous Driving Behavior Decision-Making

Bibliographic Details
Main Authors:	Shuai Li, Peicheng Shi, Aixi Yang, Heng Qi, Xinlong Dong
Format:	Article
Language:	English
Published:	MDPI AG 2025-05-01
Series:	Algorithms
Subjects:	autonomous driving; behavioral decision-making; deep reinforcement learning; image; dual-priority experience replay; D3QN
Online Access:	https://www.mdpi.com/1999-4893/18/5/291

Description
Summary:	Behavior decision control is a critical part of advancing autonomous driving technology. However, current behavior decision algorithms based on deep reinforcement learning still face several challenges, such as insufficient safety and sparse rewards. To address these problems, this paper proposes DPD3QN, a dueling double deep Q-network with dual-priority experience replay. First, the dueling network is integrated with the double deep Q-network, and the original network’s output layer is restructured to improve the precision of action-value estimation. Next, dual-priority experience replay is incorporated so that the model can quickly identify and exploit critical experiences. Finally, training and evaluation are conducted on the OpenAI Gym simulation platform. The test results show that DPD3QN improves the convergence speed of behavior decision-making for driverless vehicles. Compared with the widely used DQN and DDQN algorithms, it achieves higher success rates in challenging scenarios: in test scenario I, the success rate increases by 11.8 and 25.8 percentage points, respectively, while the success rates in test scenarios I and II rise by 8.8 and 22.2 percentage points, respectively, indicating safer and more efficient autonomous driving decision-making.
ISSN:	1999-4893
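
Note:	The summary names two standard building blocks, the dueling decomposition of Q-values and the double-DQN target. The sketch below illustrates those two ideas in their commonly published textbook form, using plain Python lists in place of network outputs. It is not taken from the article: the function names, the mean-advantage baseline, and the sample numbers are assumptions made for illustration, and the dual-priority replay scheme is left out because this record does not describe how it works.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) = V + A - mean(A).

    Assumption: the mean-advantage baseline, the usual dueling-network choice.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    done: bool,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
) -> float:
    """Double-DQN target: the online network picks the next action,
    the target network supplies the value of that action."""
    if done:
        return reward  # no bootstrap at a terminal state
    best_action = max(range(len(next_q_online)), key=lambda i: next_q_online[i])
    return reward + gamma * next_q_target[best_action]


# Hypothetical numbers standing in for the two networks' outputs at the next state.
q_online = dueling_q_values(1.0, [0.5, -0.5, 0.0])
q_target = dueling_q_values(0.8, [0.0, 0.6, -0.6])
y = double_dqn_target(1.0, 0.9, False, q_online, q_target)
```

Here the online estimate selects the first action, and the target estimate for that same action is what gets discounted and added to the reward. Separating selection from evaluation is the usual motivation given for double DQN.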

Similar Items