Reward-optimizing learning using stochastic release plasticity
Synaptic plasticity underlies adaptive learning in neural systems, offering a biologically plausible framework for reward-driven learning. However, a question remains: how can plasticity rules achieve robustness and effectiveness comparable to error backpropagation? In this study, we introduce Rewar...
| Main Authors: | Yuhao Sun, Wantong Liao, Jinhao Li, Xinche Zhang, Guan Wang, Zhiyuan Ma, Sen Song |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-08-01 |
| Series: | Frontiers in Neural Circuits |
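| Subjects: | synaptic plasticity; brain inspired computing; reinforcement learning; Spiking Neural Network; supervised learning |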
| Online Access: | https://www.frontiersin.org/articles/10.3389/fncir.2025.1618506/full |
| _version_ | 1849405460929576960 |
|---|---|
| author | Yuhao Sun; Wantong Liao; Jinhao Li; Xinche Zhang; Guan Wang; Zhiyuan Ma; Sen Song |
| author_facet | Yuhao Sun; Wantong Liao; Jinhao Li; Xinche Zhang; Guan Wang; Zhiyuan Ma; Sen Song |
| author_sort | Yuhao Sun |
| collection | DOAJ |
| description | Synaptic plasticity underlies adaptive learning in neural systems, offering a biologically plausible framework for reward-driven learning. However, a question remains: how can plasticity rules achieve robustness and effectiveness comparable to error backpropagation? In this study, we introduce Reward-Optimized Stochastic Release Plasticity (RSRP), a learning framework where synaptic release is modeled as a parameterized distribution. Utilizing natural gradient estimation, we derive a synaptic plasticity learning rule that effectively adapts to maximize reward signals. Our approach achieves competitive performance and demonstrates stability in reinforcement learning, comparable to Proximal Policy Optimization (PPO), while attaining accuracy comparable with error backpropagation in digit classification. Additionally, we identify reward regularization as a key stabilizing mechanism and validate our method in biologically plausible networks. Our findings suggest that RSRP offers a robust and effective plasticity learning rule, especially in a discontinuous reinforcement learning paradigm, with potential implications for both artificial intelligence and experimental neuroscience. |
| format | Article |
| id | doaj-art-3cdfb8ac527e4c7a837d0d01d87154c5 |
| institution | Kabale University |
| issn | 1662-5110 |
| language | English |
| publishDate | 2025-08-01 |
| publisher | Frontiers Media S.A. |
| record_format | Article |
| series | Frontiers in Neural Circuits |
| spelling | Record: doaj-art-3cdfb8ac527e4c7a837d0d01d87154c5. Indexed: 2025-08-20T03:36:41Z. Language: eng. Publisher: Frontiers Media S.A. Series: Frontiers in Neural Circuits. ISSN: 1662-5110. Published: 2025-08-01. Volume: 19. DOI: 10.3389/fncir.2025.1618506. Article number: 1618506. Title: Reward-optimizing learning using stochastic release plasticity. Authors and affiliations: Yuhao Sun (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Biomedical Engineering, Tsinghua University, Beijing, China); Wantong Liao (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Biomedical Engineering, Tsinghua University, Beijing, China); Jinhao Li (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Basic Medical Sciences, Tsinghua University, Beijing, China); Xinche Zhang (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Biomedical Engineering, Tsinghua University, Beijing, China); Guan Wang (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; Sapient Intelligence, Singapore, Singapore); Zhiyuan Ma (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Biomedical Engineering, Tsinghua University, Beijing, China; College of Computer Science and Technology, Zhejiang University, Hangzhou, China); Sen Song (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Biomedical Engineering, Tsinghua University, Beijing, China). URL: https://www.frontiersin.org/articles/10.3389/fncir.2025.1618506/full. Keywords: synaptic plasticity; brain inspired computing; reinforcement learning; Spiking Neural Network; supervised learning |
| spellingShingle | Yuhao Sun; Wantong Liao; Jinhao Li; Xinche Zhang; Guan Wang; Zhiyuan Ma; Sen Song; Reward-optimizing learning using stochastic release plasticity; Frontiers in Neural Circuits; synaptic plasticity; brain inspired computing; reinforcement learning; Spiking Neural Network; supervised learning |
| title | Reward-optimizing learning using stochastic release plasticity |
| title_full | Reward-optimizing learning using stochastic release plasticity |
| title_fullStr | Reward-optimizing learning using stochastic release plasticity |
| title_full_unstemmed | Reward-optimizing learning using stochastic release plasticity |
| title_short | Reward-optimizing learning using stochastic release plasticity |
| title_sort | reward optimizing learning using stochastic release plasticity |
| topic | synaptic plasticity; brain inspired computing; reinforcement learning; Spiking Neural Network; supervised learning |
| url | https://www.frontiersin.org/articles/10.3389/fncir.2025.1618506/full |
| work_keys_str_mv | AT yuhaosun rewardoptimizinglearningusingstochasticreleaseplasticity AT wantongliao rewardoptimizinglearningusingstochasticreleaseplasticity AT jinhaoli rewardoptimizinglearningusingstochasticreleaseplasticity AT xinchezhang rewardoptimizinglearningusingstochasticreleaseplasticity AT guanwang rewardoptimizinglearningusingstochasticreleaseplasticity AT zhiyuanma rewardoptimizinglearningusingstochasticreleaseplasticity AT sensong rewardoptimizinglearningusingstochasticreleaseplasticity |