A Long-Term Target Search Method for Unmanned Aerial Vehicles Based on Reinforcement Learning

Bibliographic Details
Main Authors: Dexing Wei, Lun Zhang, Mei Yang, Hanqiang Deng, Jian Huang
Format: Article
Language: English
Published: MDPI AG 2024-09-01
Series: Drones
Subjects: UAVs; reinforcement learning; long-term search task; multi-agent
Online Access: https://www.mdpi.com/2504-446X/8/10/536
collection DOAJ
description Unmanned aerial vehicles (UAVs) are increasingly employed in search operations. Deep reinforcement learning (DRL), owing to its self-learning and adaptive capabilities, has been widely applied to drone search tasks. However, traditional DRL approaches often suffer from long training times, especially in long-term UAV search missions, where each interaction cycle between the agent and the environment is extended. This paper addresses the issue by introducing temporally asynchronous grouped environment reinforcement learning (TAGRL). The key insight is that, as the number of training environments increases, agents can learn from discontinuous trajectories. This insight leads to the design of grouped environments, in which agents explore only a limited number of steps within each interaction cycle rather than completing full sequences. Consequently, TAGRL learns faster and consumes less memory than existing parallel environment learning methods. The results indicate that the framework improves the efficiency of UAV search tasks, paving the way for more scalable and effective applications of reinforcement learning (RL) in complex scenarios.
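The grouped-environment idea in the description can be illustrated with a small sketch. This is not the authors' implementation: the record gives only the abstract, so the environment (`ToySearchEnv`), the staggered start offsets, and the helper names `make_groups` and `collect_segments` are all hypothetical, chosen only to show what collecting a limited number of steps per interaction cycle across groups of environments might look like.

```python
from dataclasses import dataclass


@dataclass
class ToySearchEnv:
    """Hypothetical stand-in for a long-horizon search environment.

    The observation is simply the current time step.
    """
    horizon: int
    t: int = 0

    def reset(self, start_t=0):
        self.t = start_t
        return self.t

    def step(self, action):
        self.t += 1
        reward = 1.0 if action == 1 else 0.0
        done = self.t >= self.horizon
        return self.t, reward, done


def make_groups(num_groups, envs_per_group, horizon):
    """Create groups of environments at staggered time offsets.

    Assumption: staggering is modelled by starting each group at a
    different point of the episode, so one batch spans several phases.
    """
    groups = []
    for g in range(num_groups):
        offset = g * horizon // num_groups
        group = []
        for _ in range(envs_per_group):
            env = ToySearchEnv(horizon)
            env.reset(start_t=offset)
            group.append(env)
        groups.append(group)
    return groups


def collect_segments(groups, segment_len, policy):
    """Advance every environment by segment_len steps (one interaction cycle).

    Episodes are not run to completion; an environment is reset only
    when it happens to reach its horizon inside the segment.
    """
    batch = []
    for group in groups:
        for env in group:
            obs = env.t
            for _ in range(segment_len):
                action = policy(obs)
                next_obs, reward, done = env.step(action)
                batch.append((obs, action, reward, next_obs, done))
                obs = env.reset() if done else next_obs
    return batch


# One interaction cycle: 4 groups x 2 environments x 10 steps.
groups = make_groups(num_groups=4, envs_per_group=2, horizon=100)
batch = collect_segments(groups, segment_len=10, policy=lambda obs: 1)
```

In this toy setting one cycle stores 80 transitions, whereas running all eight environments to the 100-step horizon would store 800. The trajectories in the batch are discontinuous: each covers a different short slice of the episode.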
issn 2504-446X
doi 10.3390/drones8100536
volume 8
issue 10
article_number 536
affiliation College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410073, China (all five authors)
topic UAVs
reinforcement learning
long-term search task
multi-agent