Spectrum resource allocation for high-throughput satellite communications based on behavior cloning
In high-throughput multi-beam satellite systems, the dimensionality of the spectrum resource allocation problem grows drastically with the number of satellite beams and service users, so the complexity of solving it rises exponentially. To address the challenge, a two-stage algori...
Main Authors: | QIN Hao, LI Shuangyi, ZHAO Di, MENG Haowei, SONG Bin |
---|---|
Format: | Article |
Language: | zho |
Published: | Editorial Department of Journal on Communications, 2024-05-01 |
Series: | Tongxin xuebao |
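Subjects: | high-throughput satellite; behavior cloning; deep reinforcement learning; proximal policy optimization; convolutional block attention module |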
Online Access: | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2024100/ |
_version_ | 1841539220412301312 |
---|---|
author | QIN Hao; LI Shuangyi; ZHAO Di; MENG Haowei; SONG Bin |
author_sort | QIN Hao |
collection | DOAJ |
description | In high-throughput multi-beam satellite systems, the dimensionality of the spectrum resource allocation problem grows drastically with the number of satellite beams and service users, so the complexity of solving it rises exponentially. To address this challenge, a two-stage algorithm combining behavior cloning (BC) with deep reinforcement learning (DRL) is proposed. In the first stage, the policy network is pretrained by behavior cloning on existing decision data from satellite operations; imitating expert behavior reduces blind exploration and accelerates convergence. In the second stage, the policy network is further optimized with proximal policy optimization (PPO), and a convolutional block attention module (CBAM) is used to better extract user traffic features, improving overall performance. Simulation results show that the proposed algorithm outperforms the benchmark algorithms in convergence speed and stability, and also achieves better system delay, average system satisfaction, and spectrum efficiency. |
format | Article |
id | doaj-art-9fda7c0e42ff40afbfc7c563349e2823 |
institution | Kabale University |
issn | 1000-436X |
language | zho |
publishDate | 2024-05-01 |
publisher | Editorial Department of Journal on Communications |
record_format | Article |
series | Tongxin xuebao |
spelling | doaj-art-9fda7c0e42ff40afbfc7c563349e2823; 2025-01-14T07:24:26Z; zho; Editorial Department of Journal on Communications; Tongxin xuebao; 1000-436X; 2024-05-01; 4510111462276862; Spectrum resource allocation for high-throughput satellite communications based on behavior cloning; QIN Hao; LI Shuangyi; ZHAO Di; MENG Haowei; SONG Bin; In high-throughput multi-beam satellite systems, the dimensionality of the spectrum resource allocation problem grows drastically with the number of satellite beams and service users, so the complexity of solving it rises exponentially. To address this challenge, a two-stage algorithm combining behavior cloning (BC) with deep reinforcement learning (DRL) is proposed. In the first stage, the policy network is pretrained by behavior cloning on existing decision data from satellite operations; imitating expert behavior reduces blind exploration and accelerates convergence. In the second stage, the policy network is further optimized with proximal policy optimization (PPO), and a convolutional block attention module (CBAM) is used to better extract user traffic features, improving overall performance. Simulation results show that the proposed algorithm outperforms the benchmark algorithms in convergence speed and stability, and also achieves better system delay, average system satisfaction, and spectrum efficiency.; http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2024100/; high-throughput satellite; behavior cloning; deep reinforcement learning; proximal policy optimization; convolutional block attention module |
title | Spectrum resource allocation for high-throughput satellite communications based on behavior cloning |
topic | high-throughput satellite; behavior cloning; deep reinforcement learning; proximal policy optimization; convolutional block attention module |
url | http://www.joconline.com.cn/zh/article/doi/10.11959/j.issn.1000-436x.2024100/ |
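
The two stages named in the description can be illustrated with a minimal sketch. This is not the authors' implementation: the record gives no network architecture, state encoding, or reward, so everything below is an assumption made for illustration. A linear softmax policy stands in for the policy network (the CBAM feature extractor is not modeled), the "expert" data is synthetic (a hypothetical rule that picks the index of the largest traffic demand), and `bc_pretrain` and `ppo_clip_term` are hypothetical helper names. Stage one fits the policy to expert state-action pairs by minimizing cross-entropy; stage two is represented only by the standard PPO clipped surrogate term that a later fine-tuning loop would maximize.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights, state):
    """Action probabilities of a linear softmax policy (stand-in for the policy network)."""
    logits = [sum(w * s for w, s in zip(row, state)) for row in weights]
    return softmax(logits)


def bc_pretrain(expert_pairs, n_features, n_actions, lr=0.5, epochs=200):
    """Stage 1 (behavior cloning): SGD on cross-entropy between policy and expert actions."""
    weights = [[0.0] * n_features for _ in range(n_actions)]
    for _ in range(epochs):
        for state, action in expert_pairs:
            probs = policy_probs(weights, state)
            for a in range(n_actions):
                # Gradient of cross-entropy w.r.t. logit a is (p_a - 1[a == expert action]).
                grad = probs[a] - (1.0 if a == action else 0.0)
                for j in range(n_features):
                    weights[a][j] -= lr * grad * state[j]
    return weights


def ppo_clip_term(ratio, advantage, eps=0.2):
    """Stage 2 (PPO): clipped surrogate for one sample, min(r*A, clip(r, 1-eps, 1+eps)*A)."""
    clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
    return min(ratio * advantage, clipped * advantage)


# Synthetic "expert" decisions (assumption): the state is a vector of three
# traffic demands and the expert picks the index of the largest one.
random.seed(0)
expert_pairs = []
for _ in range(200):
    state = [random.random() for _ in range(3)]
    expert_pairs.append((state, state.index(max(state))))

weights = bc_pretrain(expert_pairs, n_features=3, n_actions=3)


def greedy_action(state):
    """Most probable action under the pretrained policy."""
    probs = policy_probs(weights, state)
    return probs.index(max(probs))


# Fraction of expert decisions the cloned policy reproduces on the training set.
accuracy = sum(greedy_action(s) == a for s, a in expert_pairs) / len(expert_pairs)
print(f"imitation accuracy on expert data: {accuracy:.2f}")
```

In a full pipeline the pretrained `weights` would initialize the policy before PPO updates, with `ratio` computed as the new policy's probability of the sampled action divided by the old policy's; the clipping keeps each update close to the behavior-cloned starting point.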