Multi-Agent Deep Reinforcement Learning Cooperative Control Model for Autonomous Vehicle Merging into Platoon in Highway

This study presents the first investigation into the problem of autonomous vehicle (AV) merging into existing platoons, proposing a multi-agent deep reinforcement learning (MA-DRL)-based cooperative control framework. The developed MA-DRL architecture enables coordinated learning among multiple auto...

Bibliographic Details
Main Authors:	Jiajia Chen, Bingqing Zhu, Mengyu Zhang, Xiang Ling, Xiaobo Ruan, Yifan Deng, Ning Guo
Format:	Article
Language:	English
Published:	MDPI AG 2025-04-01
Series:	World Electric Vehicle Journal
Subjects:	autonomous vehicle; platooning control; deep reinforcement learning; multi-agent systems
Online Access:	https://www.mdpi.com/2032-6653/16/4/225

_version_	1849713726275452928
author	Jiajia Chen; Bingqing Zhu; Mengyu Zhang; Xiang Ling; Xiaobo Ruan; Yifan Deng; Ning Guo
author_facet	Jiajia Chen; Bingqing Zhu; Mengyu Zhang; Xiang Ling; Xiaobo Ruan; Yifan Deng; Ning Guo
author_sort	Jiajia Chen
collection	DOAJ
description	This study presents the first investigation into the problem of an autonomous vehicle (AV) merging into an existing platoon and proposes a cooperative control framework based on multi-agent deep reinforcement learning (MA-DRL). The MA-DRL architecture enables coordinated learning among multiple autonomous agents, addressing the multi-objective coordination challenge through synchronized control of the platoon's longitudinal acceleration and the AV's steering and acceleration. To improve training efficiency, we develop a dual-layer multi-agent maximum Q-value proximal policy optimization (MAMQPPO) method. It extends the multi-agent PPO algorithm (a policy gradient method designed for stable policy updates) with maximum Q-value action selection, which is used to learn platoon gap selection and discrete action commands and thereby simplifies training. Furthermore, a partially decoupled reward function (PD-Reward) is designed to guide the behavior of both the AV and the platoon while accelerating network convergence. Highway simulation experiments show that the proposed method reduces merging time by 37.69% (12.4 s vs. 19.9 s) and energy consumption by 58% (3.56 kWh vs. 8.47 kWh) compared with an existing baseline that combines a quintic polynomial-based method with PID (proportional–integral–derivative) control.
format	Article
id	doaj-art-c45e6456bbc9497ca0bd8d90ea666af6
institution	DOAJ
issn	2032-6653
language	English
publishDate	2025-04-01
publisher	MDPI AG
record_format	Article
series	World Electric Vehicle Journal
spelling	doaj-art-c45e6456bbc9497ca0bd8d90ea666af6; 2025-08-20T03:13:54Z; eng; MDPI AG; World Electric Vehicle Journal; 2032-6653; 2025-04-01; 16; 4; 225; 10.3390/wevj16040225
affiliations	0 Jiajia Chen: School of Automotive and Transportation Engineering, Hefei University of Technology, Hefei 230009, China; 1 Bingqing Zhu: School of Automotive and Transportation Engineering, Hefei University of Technology, Hefei 230009, China; 2 Mengyu Zhang: Hefei Communication Investment Holding Group Co., Ltd., Hefei 230009, China; 3 Xiang Ling: School of Automotive and Transportation Engineering, Hefei University of Technology, Hefei 230009, China; 4 Xiaobo Ruan: School of Automotive and Transportation Engineering, Hefei University of Technology, Hefei 230009, China; 5 Yifan Deng: School of Chang’an-Dublin International College of Transportation, Chang’an University, Xi’an 710064, China; 6 Ning Guo: School of Automotive and Transportation Engineering, Hefei University of Technology, Hefei 230009, China
title	Multi-Agent Deep Reinforcement Learning Cooperative Control Model for Autonomous Vehicle Merging into Platoon in Highway
title_full	Multi-Agent Deep Reinforcement Learning Cooperative Control Model for Autonomous Vehicle Merging into Platoon in Highway
title_fullStr	Multi-Agent Deep Reinforcement Learning Cooperative Control Model for Autonomous Vehicle Merging into Platoon in Highway
title_full_unstemmed	Multi-Agent Deep Reinforcement Learning Cooperative Control Model for Autonomous Vehicle Merging into Platoon in Highway
title_short	Multi-Agent Deep Reinforcement Learning Cooperative Control Model for Autonomous Vehicle Merging into Platoon in Highway
title_sort	multi agent deep reinforcement learning cooperative control model for autonomous vehicle merging into platoon in highway
topic	autonomous vehicle; platooning control; deep reinforcement learning; multi-agent systems
url	https://www.mdpi.com/2032-6653/16/4/225
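
The description above names two ingredients: PPO-style stable policy updates and maximum Q-value action selection for discrete choices such as the platoon gap. The following is a minimal sketch of those two generic building blocks only, not the authors' MAMQPPO implementation. The function names, the clipping parameter `eps = 0.2`, and the example Q-values are all hypothetical and chosen purely for illustration.

```python
from typing import Sequence


def ppo_clipped_term(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Standard PPO clipped surrogate for a single sample.

    ratio is the new-policy / old-policy probability ratio for the taken action.
    eps is an assumed clipping range; the record does not state the value used.
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Taking the minimum removes the incentive to move the ratio far from 1.
    return min(ratio * advantage, clipped_ratio * advantage)


def max_q_action(q_values: Sequence[float]) -> int:
    """Return the index of the discrete action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda i: q_values[i])


# Hypothetical Q-values for three candidate platoon gaps (illustrative only).
gap_q_values = [0.1, 0.7, 0.3]
chosen_gap = max_q_action(gap_q_values)

# A ratio of 1.5 with a positive advantage is clipped at 1 + eps = 1.2.
surrogate = ppo_clipped_term(ratio=1.5, advantage=2.0)
```

In this sketch the discrete decision (which gap to target) is a plain argmax over Q-values, while the continuous controls would be trained through the clipped surrogate; how the published method couples the two layers is not specified in this record.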
