Adaptive Q-Learning Grey Wolf Optimizer for UAV Path Planning

Bibliographic Details
Main Authors: Golam Moktader Nayeem, Mingyu Fan, Golam Moktader Daiyan
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Drones
Online Access: https://www.mdpi.com/2504-446X/9/4/246
Description
Summary: Path planning is crucial for navigating unmanned aerial vehicles (UAVs) safely and efficiently toward operational goals. It is often a complex, multi-constraint, non-linear optimization problem, and metaheuristic algorithms are frequently used to solve it. Grey Wolf Optimization (GWO) is one of the most popular algorithms for such problems. However, standard GWO has several limitations, such as premature convergence, susceptibility to local minima, and unsuitability for dynamic environments due to its lack of adaptive learning. To address these issues, this study proposes a Q-learning-based GWO algorithm (QGWO). QGWO introduces four key features: a Q-learning-based adaptive convergence factor, a segmented and parameterized position update strategy, a long-jump mechanism for preserving population diversity, and the replacement of non-dominant wolves for improved exploration. In addition, Bayesian optimization is used to tune the parameters of QGWO for better performance. To evaluate the quality and robustness of QGWO, extensive numerical and simulation experiments were conducted on the IEEE CEC 2022 benchmark functions, comparing it with standard GWO and several of its recent variants. In the path planning simulation, QGWO lowers the path cost by 27.4%, improves the convergence speed by 19.06%, and reduces the area under the curve (AUC) by 23.8% relative to standard GWO, achieving an optimal trajectory. The results show that QGWO is an efficient and reliable algorithm for UAV path planning in dynamic environments.
ISSN: 2504-446X
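
Illustrative sketch: The summary names a Q-learning-based adaptive convergence factor as the first feature of QGWO but gives no details. The Python sketch below shows one way such a mechanism could be wired into the standard GWO position update, and it is not the authors' implementation. Everything specific in it is an assumption made for illustration: the two-state design (did the best cost improve in the previous iteration or not), the three candidate scale factors applied to the usual linear schedule for the convergence factor a, the binary reward, the learning constants, and the toy sphere objective standing in for a real path-cost function. The segmented position update, long-jump mechanism, replacement of non-dominant wolves, and Bayesian parameter tuning mentioned in the summary are omitted.

```python
import random


def sphere(x):
    """Toy objective (sum of squares); a stand-in for a real path-cost function."""
    return sum(v * v for v in x)


def q_gwo_sketch(objective, dim=5, n_wolves=20, n_iter=200,
                 lower=-10.0, upper=10.0, seed=0):
    """Standard GWO update with a hypothetical Q-learning choice of the factor a."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lower, upper) for _ in range(dim)]
              for _ in range(n_wolves)]
    fitness = [objective(w) for w in wolves]

    # Assumed Q-learning design (not from the paper):
    # state 0 = no improvement last iteration, state 1 = improvement;
    # each action scales the standard linear schedule a = 2 * (1 - t / n_iter).
    scales = [0.5, 1.0, 1.5]
    q_table = [[0.0] * len(scales) for _ in range(2)]
    lr, gamma, epsilon = 0.1, 0.9, 0.1
    state = 0

    best_idx = min(range(n_wolves), key=fitness.__getitem__)
    best_pos, best_val = list(wolves[best_idx]), fitness[best_idx]
    history = []

    for t in range(n_iter):
        # The three best wolves (alpha, beta, delta) guide the pack.
        order = sorted(range(n_wolves), key=fitness.__getitem__)
        leaders = [list(wolves[i]) for i in order[:3]]

        # Epsilon-greedy selection of the scale applied to a.
        if rng.random() < epsilon:
            action = rng.randrange(len(scales))
        else:
            action = max(range(len(scales)), key=q_table[state].__getitem__)
        a = min(2.0, scales[action] * 2.0 * (1.0 - t / n_iter))

        # Standard GWO update: average of the moves toward the three leaders.
        for i in range(n_wolves):
            new_pos = []
            for d in range(dim):
                total = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[d] - wolves[i][d])
                    total += leader[d] - A * D
                new_pos.append(min(upper, max(lower, total / 3.0)))
            wolves[i] = new_pos
            fitness[i] = objective(new_pos)

        # Track the best solution found so far.
        it_best = min(range(n_wolves), key=fitness.__getitem__)
        improved = fitness[it_best] < best_val
        if improved:
            best_pos, best_val = list(wolves[it_best]), fitness[it_best]

        # One-step Q-learning update; reward 1 if the best cost improved.
        reward = 1.0 if improved else 0.0
        next_state = 1 if improved else 0
        td_target = reward + gamma * max(q_table[next_state])
        q_table[state][action] += lr * (td_target - q_table[state][action])
        state = next_state
        history.append(best_val)

    return best_pos, best_val, history
```

Calling q_gwo_sketch(sphere) returns the best position found, its cost, and the best-so-far cost per iteration. In a path planning setting, the objective would instead score a candidate set of waypoints, for example by path length plus penalties for collisions or altitude violations; that cost function is outside the scope of this sketch.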