Spectrum resource allocation for high-throughput satellite communications based on behavior cloning

Bibliographic Details
Main Authors: QIN Hao, LI Shuangyi, ZHAO Di, MENG Haowei, SONG Bin
Format: Article
Language: zho
Published: Editorial Department of Journal on Communications 2024-05-01
Series: Tongxin xuebao
Subjects:
Online Access: http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2024100/
author QIN Hao
LI Shuangyi
ZHAO Di
MENG Haowei
SONG Bin
collection DOAJ
description In high-throughput multi-beam satellite systems, the dimensionality of the spectrum resource allocation problem grows sharply with the number of satellite beams and service users, so the complexity of solving it rises exponentially. To address this challenge, a two-stage algorithm that combines behavior cloning (BC) with deep reinforcement learning (DRL) is proposed. In the first stage, the policy (strategy) network is pretrained by behavior cloning on existing decision data from satellite operation; imitating expert behavior reduces blind exploration and accelerates convergence. In the second stage, the policy network is further optimized with proximal policy optimization (PPO), and a convolutional block attention module (CBAM) is used to better extract user traffic features, improving overall algorithm performance. Simulation results show that the proposed algorithm outperforms the benchmark algorithms in convergence speed and stability, and also performs better in system delay, average system satisfaction, and spectrum efficiency.
format Article
id doaj-art-9fda7c0e42ff40afbfc7c563349e2823
institution Kabale University
issn 1000-436X
language zho
publishDate 2024-05-01
publisher Editorial Department of Journal on Communications
record_format Article
series Tongxin xuebao
spelling doaj-art-9fda7c0e42ff40afbfc7c563349e2823 2025-01-14T07:24:26Z 4510111462276862
title Spectrum resource allocation for high-throughput satellite communications based on behavior cloning
topic high-throughput satellite
behavior cloning
deep reinforcement learning
proximal policy optimization
convolutional block attention module
url http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2024100/
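
The two-stage idea in the description (behavior cloning pretraining followed by PPO fine-tuning) can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no network, data, or reward details, so the linear softmax policy, the toy "expert" rule, and the names bc_pretrain and ppo_clip_objective are all assumptions made for illustration, and the CBAM feature extractor is omitted. Stage 1 fits the policy to expert (state, action) pairs with a cross-entropy loss; stage 2 is represented only by the standard PPO clipped surrogate objective that a fine-tuning loop would maximize.

```python
import math
import random


def softmax(logits):
    # Numerically stable softmax over a list of logits.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(W, state):
    # Hypothetical linear policy: one weight row per action.
    return softmax([sum(w * s for w, s in zip(row, state)) for row in W])


def bc_pretrain(demos, n_features, n_actions, lr=0.5, epochs=100):
    """Stage 1 (sketch): behavior cloning as supervised learning on expert pairs."""
    W = [[0.0] * n_features for _ in range(n_actions)]
    for _ in range(epochs):
        for state, action in demos:
            probs = policy_probs(W, state)
            for a in range(n_actions):
                # Cross-entropy gradient w.r.t. logit a: p_a - 1[a == expert action].
                g = probs[a] - (1.0 if a == action else 0.0)
                for j in range(n_features):
                    W[a][j] -= lr * g * state[j]
    return W


def ppo_clip_objective(new_logp, old_logp, advantages, eps=0.2):
    """Stage 2 (sketch): PPO clipped surrogate, averaged over samples."""
    total = 0.0
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)  # pi_new(a|s) / pi_old(a|s)
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


# Toy demonstration data (assumption): the "expert" picks action 0 when the
# first state feature is positive, otherwise action 1.
random.seed(0)
states = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
demos = [(s, 0 if s[0] > 0 else 1) for s in states]

W = bc_pretrain(demos, n_features=3, n_actions=2)
accuracy = sum(
    max(range(2), key=lambda a: policy_probs(W, s)[a]) == act for s, act in demos
) / len(demos)
print("imitation accuracy on the demonstrations:", accuracy)
```

In a full pipeline, the weights returned by bc_pretrain would initialize the policy that PPO then updates, which is how pretraining is meant to cut down on blind exploration early in training.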