Formal Verification of Spatio-Temporal Rules Guided Safe Reinforcement Learning for CPS
| Main Author: | |
|---|---|
| Format: | Article |
| Language: | zho |
| Published: | Journal of Computer Engineering and Applications Beijing Co., Ltd., Science Press, 2025-02-01 |
| Series: | Jisuanji kexue yu tansuo |
| Subjects: | |
| Online Access: | http://fcst.ceaj.org/fileup/1673-9418/PDF/2312010.pdf |
| Summary: | Deep reinforcement learning is now widely used for decision-making in cyber-physical systems (CPS). However, in unknown environments and on complex tasks, black-box deep reinforcement learning can guarantee neither the safety of the system nor the interpretability of its reward function design. To address these issues, a safe reinforcement learning method guided by formal verification of spatio-temporal rules is proposed. First, the combination-space rule timed communicating sequential process (CSR-TCSP) is proposed to model the system, and the model is then verified with the model checker FDR (failures-divergences refinement) in combination with the spatio-temporal specification language (STSL). Second, by abstracting the system environment model, the structure of the reward state machine is formalized as the spatio-temporal rule reward machine (STR-RM), which guides the design of reward functions in reinforcement learning. In addition, to monitor system operation and ensure the safety of output decisions, a monitor and a safe action decision-making algorithm are designed to obtain a safer state-action policy. Finally, the effectiveness of the proposed method is demonstrated on an example of obstacle avoidance and lane-changing overtaking in an autonomous driving system. |
| ISSN: | 1673-9418 |