Energy-Efficient Trajectory Planning With Joint Device Selection and Power Splitting for mmWaves-Enabled UAV-NOMA Networks

Bibliographic Details
Main Authors: Ahmad Gendia, Osamu Muta, Sherief Hashima, Kohei Hatano
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Transactions on Machine Learning in Communications and Networking
Online Access: https://ieeexplore.ieee.org/document/10517756/
Description
Summary: This paper proposes two energy-efficient reinforcement learning (RL)-based algorithms for millimeter wave (mmWave)-enabled unmanned aerial vehicle (UAV) communications toward beyond-5G (B5G). Such algorithms can be especially useful in ad-hoc communication scenarios within a neighborhood with main-network connectivity problems, such as areas affected by natural disasters. To improve the system’s overall sum-rate performance, the UAV-operated mobile base station (UAV-MBS) can harness non-orthogonal multiple access (NOMA) as an efficient protocol to grant ground devices access to fast downlink connections. Two factors are critical for optimized performance: dynamic selection of suitable hovering spots within the target zone where the battery-constrained UAV needs to be positioned, and calibrated NOMA power control with proper device pairing. We propose cost-subsidized multi-armed bandit (CS-MAB) and double deep Q-network (DDQN)-based solutions to jointly address the problems of dynamic UAV path design, device pairing, and power splitting for downlink data transmission in NOMA-based systems. Numerical simulations are presented to verify that the proposed RL-based solutions support high sum-rates. In addition, exhaustive and random search benchmarks are provided as baselines for the achievable upper and lower sum-rate levels, respectively. The proposed DDQN agent achieves 96% of the sum-rate provided by the optimal exhaustive scanning, whereas CS-MAB reaches 91.5%. By contrast, a conventional channel state sorting pairing (CSSP) solver achieves about 89.3%.
ISSN: 2831-316X
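
The summary refers to NOMA power splitting between paired devices and to the resulting sum-rate. As a rough illustration only, the sketch below computes the sum-rate of a single two-device downlink NOMA pair using the textbook model with successive interference cancellation (SIC) at the stronger device. This is not the paper's system model: the function names, the normalized `tx_power` and `noise` values, and the candidate grid of power fractions are all assumptions made for this example.

```python
import math


def noma_pair_sum_rate(gain_strong, gain_weak, alpha, tx_power=1.0, noise=1.0):
    """Sum-rate (bit/s/Hz) of one two-device downlink NOMA pair.

    alpha is the fraction of tx_power allocated to the weak device
    (hypothetical parameter name; textbook model, not the paper's).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if gain_strong < gain_weak:
        raise ValueError("gain_strong must be at least gain_weak")
    p_weak = alpha * tx_power
    p_strong = (1.0 - alpha) * tx_power
    # The weak device decodes its own signal and treats the strong
    # device's signal as interference.
    rate_weak = math.log2(1.0 + p_weak * gain_weak / (p_strong * gain_weak + noise))
    # The strong device first cancels the weak device's signal (SIC),
    # so only noise remains in the denominator.
    rate_strong = math.log2(1.0 + p_strong * gain_strong / noise)
    return rate_weak + rate_strong


def best_power_split(gain_strong, gain_weak, candidates=(0.6, 0.7, 0.8, 0.9)):
    """Exhaustively pick the candidate alpha with the highest sum-rate."""
    return max(candidates, key=lambda a: noma_pair_sum_rate(gain_strong, gain_weak, a))
```

In this simplified model the sum-rate alone always favors giving more power to the stronger device, so `best_power_split` returns the smallest candidate. A practical power-splitting rule would also need rate or fairness constraints for the weaker device, which this sketch leaves out.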