A tutorial review of policy iteration methods in reinforcement learning for nonlinear optimal control
Reinforcement learning (RL) has been a powerful framework for designing optimal controllers for nonlinear systems. This tutorial review provides a comprehensive exploration of RL techniques, with a particular focus on policy iteration methods for the development of optimal controllers. We discuss ke...
| Main Authors: | Yujia Wang, Xinji Zhu, Zhe Wu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Elsevier, 2025-06-01 |
| Series: | Digital Chemical Engineering |
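| Subjects: | Reinforcement learning; Machine learning; Optimal control; Policy iteration; Nonlinear systems; Chemical process control |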
| Online Access: | http://www.sciencedirect.com/science/article/pii/S2772508125000158 |
| _version_ | 1850262418990563328 |
|---|---|
| author | Yujia Wang; Xinji Zhu; Zhe Wu |
| collection | DOAJ |
| description | Reinforcement learning (RL) has been a powerful framework for designing optimal controllers for nonlinear systems. This tutorial review provides a comprehensive exploration of RL techniques, with a particular focus on policy iteration methods for the development of optimal controllers. We discuss key theoretical aspects, including closed-loop stability and convergence analysis of learning algorithms. Additionally, the review addresses practical challenges encountered in real-world applications, such as the development of accurate process models, incorporating safety guarantees during learning, leveraging physics-informed machine learning and transfer learning techniques to overcome learning difficulties, managing model uncertainties, and enabling scalability through distributed RL. To demonstrate the effectiveness of these approaches, a simulation example of a chemical reactor is presented, with open-source code made available on GitHub. The review concludes with a discussion of open research questions and future directions in RL-based control of nonlinear systems. |
| format | Article |
| id | doaj-art-2ab233841a7a49bfa20861407dccf794 |
| institution | OA Journals |
| issn | 2772-5081 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | Elsevier |
| record_format | Article |
| series | Digital Chemical Engineering |
| spelling | doaj-art-2ab233841a7a49bfa20861407dccf794; 2025-08-20T01:55:11Z; eng; Elsevier; Digital Chemical Engineering; ISSN 2772-5081; 2025-06-01; vol. 15, article 100231; DOI 10.1016/j.dche.2025.100231; Yujia Wang (Department of Chemical and Biomolecular Engineering, National University of Singapore, 117585, Singapore, and School of Automation Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, Sichuan, China); Xinji Zhu (Department of Chemical and Biomolecular Engineering, National University of Singapore, 117585, Singapore); Zhe Wu (Department of Chemical and Biomolecular Engineering, National University of Singapore, 117585, Singapore, corresponding author) |
| title | A tutorial review of policy iteration methods in reinforcement learning for nonlinear optimal control |
| topic | Reinforcement learning; Machine learning; Optimal control; Policy iteration; Nonlinear systems; Chemical process control |
| url | http://www.sciencedirect.com/science/article/pii/S2772508125000158 |