Spectrum resource allocation for high-throughput satellite communications based on behavior cloning

In high-throughput multi-beam satellite systems, the dimensionality of the spectrum resource allocation problem grows sharply with the number of satellite beams and service users, causing solution complexity to rise exponentially. To address this challenge, a two-stage algori...

Bibliographic Details
Main Authors:	QIN Hao, LI Shuangyi, ZHAO Di, MENG Haowei, SONG Bin
Format:	Article
Language:	zho
Published:	Editorial Department of Journal on Communications 2024-05-01
Series:	Tongxin xuebao
Subjects:	high-throughput satellite; behavior cloning; deep reinforcement learning; proximal policy optimization; convolutional block attention module
Online Access:	http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2024100/

collection	DOAJ
description	In high-throughput multi-beam satellite systems, the dimensionality of the spectrum resource allocation problem grows sharply with the number of satellite beams and service users, causing solution complexity to rise exponentially. To address this challenge, a two-stage algorithm combining behavior cloning (BC) with deep reinforcement learning (DRL) is proposed. In the first stage, the policy network is pretrained by behavior cloning on existing decision data from satellite operations, imitating expert behavior to reduce blind exploration and accelerate convergence. In the second stage, the policy network is further optimized with proximal policy optimization (PPO), and a convolutional block attention module (CBAM) is used to better extract user traffic features, improving overall performance. Simulation results show that the proposed algorithm outperforms the benchmark algorithms in convergence speed and stability, and also achieves better system delay, average system satisfaction, and spectrum efficiency.
id	doaj-art-9fda7c0e42ff40afbfc7c563349e2823
institution	OA Journals
issn	1000-436X
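The two-stage idea in the description (behavior cloning pretraining, then PPO fine-tuning) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy environment (the state is the index of the busiest beam, the action is the beam that receives a spectrum block), the 0/1 reward, the tabular softmax policy, and all names and hyperparameters below are assumptions chosen for illustration. The CBAM feature extractor is not modeled.

```python
import math
import random

N_BEAMS = 4  # toy size (assumption): state = busiest beam, action = beam served


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reward(state, action):
    # Hypothetical reward: 1 when the spectrum block goes to the busiest beam.
    return 1.0 if action == state else 0.0


def expected_reward(logits):
    # Exact expected reward under the policy, averaged over uniform states.
    return sum(softmax(logits[s])[s] for s in range(N_BEAMS)) / N_BEAMS


def behavior_cloning(logits, demos, lr=0.5):
    # Stage 1: one SGD pass of cross-entropy on expert (state, action) pairs.
    for s, a in demos:
        p = softmax(logits[s])
        for j in range(N_BEAMS):
            logits[s][j] += lr * ((1.0 if j == a else 0.0) - p[j])


def ppo_finetune(logits, rng, iters=20, batch=64, epochs=4, lr=0.1, eps=0.2):
    # Stage 2: clipped-surrogate policy-gradient updates on sampled rollouts.
    for _ in range(iters):
        rollout = []
        for _ in range(batch):
            s = rng.randrange(N_BEAMS)
            p_old = softmax(logits[s])
            a = rng.choices(range(N_BEAMS), weights=p_old)[0]
            rollout.append((s, a, p_old[a], reward(s, a)))
        # Batch-mean reward serves as a simple baseline for the advantage.
        baseline = sum(step[3] for step in rollout) / batch
        for _ in range(epochs):
            for s, a, p_old_a, rew in rollout:
                adv = rew - baseline
                p = softmax(logits[s])
                ratio = p[a] / p_old_a
                # The clipped objective has zero gradient once the ratio has
                # already left the trust region in the favored direction.
                if (adv > 0 and ratio > 1 + eps) or (adv < 0 and ratio < 1 - eps):
                    continue
                for j in range(N_BEAMS):
                    grad_logp = (1.0 if j == a else 0.0) - p[j]
                    logits[s][j] += lr * adv * ratio * grad_logp


def train(seed=0):
    rng = random.Random(seed)
    logits = [[0.0] * N_BEAMS for _ in range(N_BEAMS)]
    demos = [(s, s) for s in range(N_BEAMS)] * 2  # stand-in for operator logs
    behavior_cloning(logits, demos)
    bc_score = expected_reward(logits)
    ppo_finetune(logits, rng)
    return logits, bc_score, expected_reward(logits)
```

Calling train() returns the final logits together with the expected reward after each stage. In this toy setting the handful of demonstrations only nudges the policy toward the expert action, and the PPO stage then raises the expected reward further without starting from a uniform policy.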


Similar Items