Novel off-policy reinforcement learning framework for relay-assisted D2D network powered by ambient backscattering and energy harvesting
In this work, a novel off-policy reinforcement learning framework is developed for a relay-assisted cooperative device-to-device (D2D) communication system powered by an ambient backscattering system (ABCS) and energy harvesting. The relay follows the harvest-receive-sense-then-transmit (HRSTT) strateg...
| Main Authors: | Saranya Karattupalayam Chidambaram, Sudhanshu Arya, Yogesh Kumar Choukiker, Abhijit Bhowmick |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier 2025-09-01 |
| Series: | Results in Engineering |
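| Subjects: | Backscattering, Device-to-device (D2D) communication, Energy harvesting, machine learning, sum-rate |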
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2590123025019462 |
| _version_ | 1849426139354759168 |
|---|---|
| author | Saranya Karattupalayam Chidambaram, Sudhanshu Arya, Yogesh Kumar Choukiker, Abhijit Bhowmick |
| author_facet | Saranya Karattupalayam Chidambaram, Sudhanshu Arya, Yogesh Kumar Choukiker, Abhijit Bhowmick |
| author_sort | Saranya Karattupalayam Chidambaram |
| collection | DOAJ |
| description | In this work, a novel off-policy reinforcement learning framework is developed for a relay-assisted cooperative device-to-device (D2D) communication system powered by an ambient backscattering system (ABCS) and energy harvesting. The relay follows the harvest-receive-sense-then-transmit (HRSTT) strategy. To this end, a relay architecture is proposed, and an end-to-end frame structure is presented in this context. In the proposed architecture, the activity of a cellular user (CU) is sensed by the relay, which then selects between two mutually exclusive transmission modes, ambient backscatter mode (ABSM) and active RF transceiver mode (ARTRM), based on the sensing information. An analytical framework is developed to capture the impact of the random dwelling time of the CU in a given time frame. In particular, to ensure reliable and effective communication performance, a novel off-policy machine learning (ML) framework, the ML-aided module for sensing and power allocation (MLAMSPA), has also been developed. Inspired by Q-learning, MLAMSPA is optimal for real-time sensing and power allocation for the users’ signals. The proposed MLAMSPA framework not only captures the dynamic time allocation between backscattering and direct transmission but also embeds the effects of dynamic channel conditions and power control into the reward signal used for learning the optimal policy. The network performance is studied in terms of CU dwelling time distribution, power allocation to the user signal, CU activity, sum-rate and outage. The impact of various network parameters, such as the mean CU dwelling time, SNR threshold, harvesting time and backscatter parameters, on sum-rate and outage is analysed. |
| format | Article |
| id | doaj-art-4b3965eee95b40f38de42bb51df22c6f |
| institution | Kabale University |
| issn | 2590-1230 |
| language | English |
| publishDate | 2025-09-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Results in Engineering |
| spelling | doaj-art-4b3965eee95b40f38de42bb51df22c6f; 2025-08-20T03:29:32Z; eng; Elsevier; Results in Engineering; 2590-1230; 2025-09-01; 27; 105875; 10.1016/j.rineng.2025.105875; Novel off-policy reinforcement learning framework for relay-assisted D2D network powered by ambient backscattering and energy harvesting; Saranya Karattupalayam Chidambaram (0); Sudhanshu Arya (1); Yogesh Kumar Choukiker (2); Abhijit Bhowmick (3); School of Electronics Engineering, VIT Vellore, Tamil Nadu, India; School of Electronics Engineering, VIT Vellore, Tamil Nadu, India; School of Electronics Engineering, VIT Vellore, Tamil Nadu, India; Corresponding author.; School of Electronics Engineering, VIT Vellore, Tamil Nadu, India; In this work, a novel off-policy reinforcement learning framework is developed for a relay-assisted cooperative device-to-device (D2D) communication system powered by an ambient backscattering system (ABCS) and energy harvesting. The relay follows the harvest-receive-sense-then-transmit (HRSTT) strategy. To this end, a relay architecture is proposed, and an end-to-end frame structure is presented in this context. In the proposed architecture, the activity of a cellular user (CU) is sensed by the relay, which then selects between two mutually exclusive transmission modes, ambient backscatter mode (ABSM) and active RF transceiver mode (ARTRM), based on the sensing information. An analytical framework is developed to capture the impact of the random dwelling time of the CU in a given time frame. In particular, to ensure reliable and effective communication performance, a novel off-policy machine learning (ML) framework, the ML-aided module for sensing and power allocation (MLAMSPA), has also been developed. Inspired by Q-learning, MLAMSPA is optimal for real-time sensing and power allocation for the users’ signals. The proposed MLAMSPA framework not only captures the dynamic time allocation between backscattering and direct transmission but also embeds the effects of dynamic channel conditions and power control into the reward signal used for learning the optimal policy. The network performance is studied in terms of CU dwelling time distribution, power allocation to the user signal, CU activity, sum-rate and outage. The impact of various network parameters, such as the mean CU dwelling time, SNR threshold, harvesting time and backscatter parameters, on sum-rate and outage is analysed.; http://www.sciencedirect.com/science/article/pii/S2590123025019462; Backscattering; Device-to-device (D2D) communication; Energy harvesting; machine learning; sum-rate |
| spellingShingle | Saranya Karattupalayam Chidambaram; Sudhanshu Arya; Yogesh Kumar Choukiker; Abhijit Bhowmick; Novel off-policy reinforcement learning framework for relay-assisted D2D network powered by ambient backscattering and energy harvesting; Results in Engineering; Backscattering; Device-to-device (D2D) communication; Energy harvesting; machine learning; sum-rate |
| title | Novel off-policy reinforcement learning framework for relay-assisted D2D network powered by ambient backscattering and energy harvesting |
| title_full | Novel off-policy reinforcement learning framework for relay-assisted D2D network powered by ambient backscattering and energy harvesting |
| title_fullStr | Novel off-policy reinforcement learning framework for relay-assisted D2D network powered by ambient backscattering and energy harvesting |
| title_full_unstemmed | Novel off-policy reinforcement learning framework for relay-assisted D2D network powered by ambient backscattering and energy harvesting |
| title_short | Novel off-policy reinforcement learning framework for relay-assisted D2D network powered by ambient backscattering and energy harvesting |
| title_sort | novel off policy reinforcement learning framework for relay assisted d2d network powered by ambient backscattering and energy harvesting |
| topic | Backscattering, Device-to-device (D2D) communication, Energy harvesting, machine learning, sum-rate |
| url | http://www.sciencedirect.com/science/article/pii/S2590123025019462 |
| work_keys_str_mv | AT saranyakarattupalayamchidambaram noveloffpolicyreinforcementlearningframeworkforrelayassistedd2dnetworkpoweredbyambientbackscatteringandenergyharvesting AT sudhanshuarya noveloffpolicyreinforcementlearningframeworkforrelayassistedd2dnetworkpoweredbyambientbackscatteringandenergyharvesting AT yogeshkumarchoukiker noveloffpolicyreinforcementlearningframeworkforrelayassistedd2dnetworkpoweredbyambientbackscatteringandenergyharvesting AT abhijitbhowmick noveloffpolicyreinforcementlearningframeworkforrelayassistedd2dnetworkpoweredbyambientbackscatteringandenergyharvesting |