Reward-optimizing learning using stochastic release plasticity

Synaptic plasticity underlies adaptive learning in neural systems, offering a biologically plausible framework for reward-driven learning. However, a question remains: how can plasticity rules achieve robustness and effectiveness comparable to error backpropagation? In this study, we introduce Rewar...

Bibliographic Details
Main Authors: Yuhao Sun, Wantong Liao, Jinhao Li, Xinche Zhang, Guan Wang, Zhiyuan Ma, Sen Song
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-08-01
Series: Frontiers in Neural Circuits
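Subjects: synaptic plasticity; brain inspired computing; reinforcement learning; Spiking Neural Network; supervised learning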
Online Access: https://www.frontiersin.org/articles/10.3389/fncir.2025.1618506/full
author Yuhao Sun
Wantong Liao
Jinhao Li
Xinche Zhang
Guan Wang
Zhiyuan Ma
Sen Song
author_sort Yuhao Sun
collection DOAJ
description Synaptic plasticity underlies adaptive learning in neural systems, offering a biologically plausible framework for reward-driven learning. However, a question remains: how can plasticity rules achieve robustness and effectiveness comparable to error backpropagation? In this study, we introduce Reward-Optimized Stochastic Release Plasticity (RSRP), a learning framework where synaptic release is modeled as a parameterized distribution. Utilizing natural gradient estimation, we derive a synaptic plasticity learning rule that effectively adapts to maximize reward signals. Our approach achieves competitive performance and demonstrates stability in reinforcement learning, comparable to Proximal Policy Optimization (PPO), while attaining accuracy comparable with error backpropagation in digit classification. Additionally, we identify reward regularization as a key stabilizing mechanism and validate our method in biologically plausible networks. Our findings suggest that RSRP offers a robust and effective plasticity learning rule, especially in a discontinuous reinforcement learning paradigm, with potential implications for both artificial intelligence and experimental neuroscience.
format Article
id doaj-art-3cdfb8ac527e4c7a837d0d01d87154c5
institution Kabale University
issn 1662-5110
language English
publishDate 2025-08-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Neural Circuits
spelling doaj-art-3cdfb8ac527e4c7a837d0d01d87154c5
2025-08-20T03:36:41Z
eng
Frontiers Media S.A.
Frontiers in Neural Circuits
1662-5110
2025-08-01
Volume 19
DOI 10.3389/fncir.2025.1618506
Article 1618506
Reward-optimizing learning using stochastic release plasticity
Yuhao Sun (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Biomedical Engineering, Tsinghua University, Beijing, China)
Wantong Liao (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Biomedical Engineering, Tsinghua University, Beijing, China)
Jinhao Li (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Basic Medical Sciences, Tsinghua University, Beijing, China)
Xinche Zhang (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Biomedical Engineering, Tsinghua University, Beijing, China)
Guan Wang (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; Sapient Intelligence, Singapore, Singapore)
Zhiyuan Ma (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Biomedical Engineering, Tsinghua University, Beijing, China; College of Computer Science and Technology, Zhejiang University, Hangzhou, China)
Sen Song (Laboratory of Brain and Intelligence, Tsinghua University, Beijing, China; School of Biomedical Engineering, Tsinghua University, Beijing, China)
https://www.frontiersin.org/articles/10.3389/fncir.2025.1618506/full
synaptic plasticity
brain inspired computing
reinforcement learning
Spiking Neural Network
supervised learning
title Reward-optimizing learning using stochastic release plasticity
topic synaptic plasticity
brain inspired computing
reinforcement learning
Spiking Neural Network
supervised learning
url https://www.frontiersin.org/articles/10.3389/fncir.2025.1618506/full